Small screen detected. You are viewing the mobile version of SlideWiki. If you wish to edit slides you will need to use a larger device.
For RCV1, the dashed line
log10M = 0.49 log10T + 1.64 is the best least squares fit.
Thus, M = 101.64T0.49 so k = 101.64 ≈ 44 and b = 0.49.
Good empirical fit for Reuters RCV1 !
For first 1,000,020 tokens,
law predicts 38,323 terms;
actually, 38,365 terms.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License