On the number of k-skip-n-grams

Abstract

The paper proves that the number of k-skip-n-grams for a corpus of size $L$ is $\frac{Ln + n + k' - n^2 - nk'}{n} \cdot \binom{n-1+k'}{n-1}$, where $k' = \min(L - n + 1, k)$.
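The closed form can be sanity-checked against direct enumeration on small inputs. The sketch below is illustrative only and rests on an assumption the abstract does not state: a k-skip-n-gram is taken to be an ordered selection of n token positions from the corpus in which at most k tokens in total are skipped between the first and last selected positions. The function names `count_formula` and `count_bruteforce` are hypothetical and do not come from the paper.

```python
from itertools import combinations
from math import comb


def count_formula(L: int, n: int, k: int) -> int:
    """Closed form from the abstract, evaluated with exact integer arithmetic."""
    k_prime = min(L - n + 1, k)
    numerator = L * n + n + k_prime - n * n - n * k_prime
    # The product is divisible by n whenever L >= n, so floor division is exact.
    return numerator * comb(n - 1 + k_prime, n - 1) // n


def count_bruteforce(L: int, n: int, k: int) -> int:
    """Count position tuples directly (assumed definition: at most k skips in total)."""
    total = 0
    for positions in combinations(range(L), n):
        # Tokens skipped = span covered by the selection minus the n tokens kept.
        skipped = positions[-1] - positions[0] + 1 - n
        if skipped <= k:
            total += 1
    return total


if __name__ == "__main__":
    # Corpus of 4 tokens, bigrams, at most 1 skip: 3 adjacent pairs + 2 one-skip pairs.
    print(count_formula(4, 2, 1), count_bruteforce(4, 2, 1))  # → 5 5
```

Under this definition the two counts agree for every small case with $L \ge n$. The enumeration grows as $\binom{L}{n}$, so it is useful only as a check, not as a way to count skip-grams in a real corpus.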
