How random are a learner's mistakes?

Given a random binary sequence of random variables $x_1, x_2, \ldots$, for instance one generated by a Markov source (teacher) of order $k^*$ (each state represented by $k^*$ bits), assume that the probability of the event $x_t = 1$ is constant and denote it by $p$. Consider a learner based on a parametric model, for instance a Markov model of order $k$, that trains on a sequence $x^m = x_1 \cdots x_m$ randomly drawn from the teacher. Test the learner's performance by giving it a sequence $x^n$ (generated by the teacher) and checking its prediction $\hat{x}_t$ on every bit of $x^n$. An error occurs at time $t$ if the learner's prediction $\hat{x}_t$ differs from the true bit value $x_t$. Denote by $\xi$ the sequence of errors, where the error bit $\xi_t$ at time $t$ equals 1 or 0 according to whether an error occurs at time $t$ or not. Consider the subsequence $\xi'$ of $\xi$ that corresponds to the errors of predicting a 0, i.e., $\xi'$ consists of the bits of $\xi$ only at times $t$ such that $\hat{x}_t = 0$. In this paper we compute an estimate of the deviation of the frequency of 1s in $\xi'$ from $p$. The result shows that the level of randomness of $\xi'$ decreases as the complexity of the learner increases.
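Below is a minimal simulation sketch of the setup described above, not the paper's experiment or bound: a Markov teacher of order k_star generates bits, a Markov learner of order k is fit on a training sequence by counting context statistics and predicting by majority vote, and the frequency of 1s in the error subsequence $\xi'$ (errors made at times when the learner predicts 0) is compared to $p$. The parameter values (k_star, k, m, n), the majority-vote predictor, and estimating $p$ by the empirical frequency of 1s in the test sequence are all illustrative assumptions.

```python
# Sketch only: a Markov teacher of order k_star, a Markov learner of order k,
# and the deviation of the frequency of 1s in xi' (errors when predicting 0) from p.
import random
from collections import defaultdict

def make_teacher(k_star, rng):
    """Random transition table: for each k_star-bit state, P(next bit = 1)."""
    return {state: rng.random() for state in range(2 ** k_star)}

def generate(table, k_star, length, rng):
    """Generate a bit sequence from the teacher, sliding a k_star-bit state window."""
    bits, state = [], 0
    for _ in range(length):
        b = 1 if rng.random() < table[state] else 0
        bits.append(b)
        state = ((state << 1) | b) & ((1 << k_star) - 1)
    return bits

def fit_learner(bits, k):
    """Count, for each k-bit context, how often the next bit is 0 vs 1."""
    counts = defaultdict(lambda: [0, 0])
    for t in range(k, len(bits)):
        ctx = tuple(bits[t - k:t])
        counts[ctx][bits[t]] += 1
    return counts

def predict(counts, ctx):
    zeros, ones = counts.get(ctx, [1, 1])  # unseen context: arbitrary tie-break
    return 1 if ones >= zeros else 0

rng = random.Random(0)
k_star, k, m, n = 3, 1, 50_000, 50_000   # teacher order, learner order, train/test lengths
table = make_teacher(k_star, rng)
train = generate(table, k_star, m, rng)
test = generate(table, k_star, n, rng)

counts = fit_learner(train, k)
p = sum(test) / len(test)                # empirical probability of a 1 (estimate of p)

# Build xi' : error bits collected only at times where the learner predicted 0.
xi_prime = []
for t in range(k, len(test)):
    pred = predict(counts, tuple(test[t - k:t]))
    if pred == 0:
        xi_prime.append(1 if pred != test[t] else 0)

if xi_prime:  # guard against the learner never predicting a 0
    freq_ones = sum(xi_prime) / len(xi_prime)
    print(f"p = {p:.3f}, freq of 1s in xi' = {freq_ones:.3f}, "
          f"deviation = {abs(freq_ones - p):.3f}")
```

In this sketch the learner's order k stands in for its complexity; the paper's result concerns how the randomness of $\xi'$, measured through the deviation of its frequency of 1s from $p$, varies with that complexity.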