Zipf's Law

George Kingsley Zipf 1935 "On the psych-biology of language"

  • looked at English world usage, prob of worlds appearings
    • the rank 1, 6.18%
    • I rank 11, 1.17%
    • you rank 16, 0.83%
    • he rank 9, 1.27%
    • she rank 17, 0.7%
    • war \approx 3* peace
    • dog \approx 2* cat
  • ii-th most likely word occurs 1i\approx \frac1i times as often as most likely word.

V. Pareto "Cours d'economie polidique" 1897

  • How many people have an income x\le x

Heavy tail distribution <=> Zipf's law distribution

Suppose that the key obey Zipf's 1/i1/i Law, that is, p1,,pnp_1, \cdots, p_n with pi/p1=1/ip_i / p_1 = 1/i. What are the probabilities?

The expected number of item examined: i=1ni×(pi/i)=np1=Θ(nlogn) \sum_{i=1}^n i\times (p_i/i) = np_1 =\Theta \left( \frac{n}{\log n} \right)

With Zipf's 1/i21/i^2, we'll have Θ(logn)\Theta(\log n). With 1/i31/i^3 we will have Θ(1)\Theta(1).

Money saves time, caching saves cash.

Move-to-front

Once get requested, move this key to the front.

Keep-track-of-counts

Sort the requested by counts. This require linear extra space.

Transposition

Have 1 position forward.

Analysis

What is the asymptotic probability item jj is in front of ii.