
How random are a learner's mistakes?

Abstract

Consider a random binary sequence $X^{(n)}$ of random variables $X_t$, $t=1,2,\ldots,n$, for instance one generated by a Markov source (teacher) of order $k^*$ (each state represented by $k^*$ bits). Assume that the probability of the event $X_t=1$ is constant and denote it by $\beta$. Consider a learner based on a parametric model, for instance a Markov model of order $k$, which trains on a sequence $x^{(m)}$ drawn randomly by the teacher. Test the learner's performance by giving it a sequence $x^{(n)}$ (generated by the teacher) and checking its prediction on every bit of $x^{(n)}$. An error occurs at time $t$ if the learner's prediction $Y_t$ differs from the true bit value $X_t$. Denote by $\xi^{(n)}$ the sequence of errors, where the error bit $\xi_t$ at time $t$ equals 1 or 0 according to whether an error occurs or not, respectively. Consider the subsequence $\xi^{(\nu)}$ of $\xi^{(n)}$ corresponding to the errors of predicting a 0, i.e., $\xi^{(\nu)}$ consists of the bits of $\xi^{(n)}$ only at times $t$ such that $Y_t=0$. In this paper we compute an estimate on the deviation of the frequency of 1s in $\xi^{(\nu)}$ from $\beta$. The result shows that the level of randomness of $\xi^{(\nu)}$ decreases as the complexity of the learner increases.
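The setup can be made concrete with a short simulation. The following is a minimal sketch, not the paper's construction: it assumes a teacher Markov source of order $k^*=2$ with hand-picked conditional probabilities (the dictionary `P1` below is illustrative), a learner that predicts the majority bit observed after each length-$k$ training context, and it estimates the marginal $\beta$ empirically from a long teacher sample rather than fixing it analytically. It prints the frequency of 1s in $\xi^{(\nu)}$ for several learner orders $k$, which can then be compared with $\beta$.

```python
# Sketch of the teacher/learner setup from the abstract, under the
# assumptions stated above. All numeric values are illustrative.
import random
from collections import defaultdict

K_STAR = 2
# Teacher: P(X_t = 1 | last two bits) -- hypothetical values, not from the paper.
P1 = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.8}

def teacher_sequence(n, rng):
    """Draw a length-n sequence from the order-k* Markov teacher."""
    x = [rng.randint(0, 1) for _ in range(K_STAR)]  # arbitrary initial state
    for _ in range(n - K_STAR):
        ctx = tuple(x[-K_STAR:])
        x.append(1 if rng.random() < P1[ctx] else 0)
    return x

def train_learner(x_train, k):
    """Count how often each bit follows each length-k context in training."""
    counts = defaultdict(lambda: [0, 0])  # context -> [#0s, #1s]
    for t in range(k, len(x_train)):
        counts[tuple(x_train[t - k:t])][x_train[t]] += 1
    return counts

def freq_ones_in_xi_nu(k, m=5000, n=200000, seed=1):
    """Frequency of 1s in xi^(nu): error bits at times where the learner predicts 0."""
    rng = random.Random(seed)
    counts = train_learner(teacher_sequence(m, rng), k)
    x_test = teacher_sequence(n, rng)
    xi_nu = []
    for t in range(k, n):
        c0, c1 = counts.get(tuple(x_test[t - k:t]), [1, 0])  # unseen context -> predict 0
        if c1 <= c0:                 # learner predicts Y_t = 0
            xi_nu.append(x_test[t])  # since Y_t = 0, the error bit equals X_t
    return sum(xi_nu) / len(xi_nu)

if __name__ == "__main__":
    rng = random.Random(0)
    beta = sum(teacher_sequence(10**6, rng)) / 10**6  # empirical marginal of 1s
    print(f"estimated beta = {beta:.4f}")
    for k in (0, 1, 2, 4, 8):
        print(f"k = {k}: freq of 1s in xi^(nu) = {freq_ones_in_xi_nu(k):.4f}")
```

Varying the training length `m` or the teacher's conditional probabilities in this sketch lets one observe how the gap between the reported frequency and $\beta$ behaves as the learner's order $k$ changes, which is the quantity the paper's estimate addresses.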
