Pareto interpolation is a method of estimating the median and other properties of a population that follows a Pareto distribution. It is used in economics when analysing the distribution of incomes in a population, when one must base estimates on a relatively small random sample taken from the population.
The family of Pareto distributions is parameterized by
- a positive number κ that is the smallest value that a random variable with a Pareto distribution can take. As applied to distribution of incomes, κ is the lowest income of any person in the population; and
- a positive number θ, the "Pareto index"; as θ increases, the tail of the distribution gets thinner. As applied to distribution of incomes, this means that the larger the value of the Pareto index θ, the smaller the proportion of incomes that are many times as big as the smallest incomes.
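For reference, the two parameters enter the distribution as follows. This is the standard Pareto cumulative distribution function, written with the symbols κ and θ used above; the symbols X, x and F are introduced here only for illustration.

```latex
F(x) = \Pr(X \le x) = 1 - \left( \frac{\kappa}{x} \right)^{\theta}
\quad \text{for } x \ge \kappa,
\qquad
F(x) = 0 \quad \text{for } x < \kappa .
```

The tail probability is therefore (κ/x)^θ, which shrinks faster in x as θ grows.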
Pareto interpolation can be used when the available information includes the proportion of the sample that falls below each of two specified numbers a < b. For example, it may be observed that 45% of individuals in the sample have incomes below a = $35,000 per year, and 55% have incomes below b = $40,000 per year.
Let
- P_a = proportion of the sample that lies below a;
- P_b = proportion of the sample that lies below b.
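Then the estimates of κ and θ are
\[ \hat{\kappa} = \left( \frac{P_b - P_a}{a^{-\hat{\theta}} - b^{-\hat{\theta}}} \right)^{1/\hat{\theta}} \]
and
\[ \hat{\theta} = \frac{\log(1 - P_a) - \log(1 - P_b)}{\log(b) - \log(a)}. \]
The estimate of the median would then be
\[ \text{estimated median} = \hat{\kappa} \cdot 2^{1/\hat{\theta}}, \]
since the actual population median is
\[ \text{median} = \kappa \, 2^{1/\theta}. \]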
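The calculation can be sketched in a few lines of Python, using the example figures given earlier (45% of incomes below $35,000 and 55% below $40,000). This is a minimal illustration rather than a reference implementation: the function names `pareto_interpolate` and `pareto_cdf` are hypothetical, and the code assumes the usual Pareto cumulative distribution function 1 − (κ/x)^θ for x ≥ κ. The index θ is estimated first, because the estimate of κ depends on it.

```python
import math


def pareto_interpolate(a, b, p_a, p_b):
    """Estimate (theta, kappa, median) of a Pareto distribution.

    a, b     -- two cut-off values with 0 < a < b
    p_a, p_b -- proportions of the sample lying below a and below b
    """
    if not 0 < a < b:
        raise ValueError("need 0 < a < b")
    if not 0 <= p_a < p_b < 1:
        raise ValueError("need 0 <= p_a < p_b < 1")

    # Pareto index: ratio of log tail proportions to log cut-offs.
    theta = (math.log(1 - p_a) - math.log(1 - p_b)) / (math.log(b) - math.log(a))
    # Minimum value kappa, computed from the estimated index.
    kappa = ((p_b - p_a) / (a ** -theta - b ** -theta)) ** (1 / theta)
    # Median of a Pareto distribution with these parameters.
    median = kappa * 2 ** (1 / theta)
    return theta, kappa, median


def pareto_cdf(x, kappa, theta):
    """Pareto cumulative distribution function (assumed form, see lead-in)."""
    return 1 - (kappa / x) ** theta if x >= kappa else 0.0


theta_hat, kappa_hat, median_hat = pareto_interpolate(35_000, 40_000, 0.45, 0.55)
print(f"theta  = {theta_hat:.4f}")
print(f"kappa  = {kappa_hat:,.0f}")
print(f"median = {median_hat:,.0f}")
```

Because 45% of the sample lies below a and 55% lies below b, the estimated median falls between the two cut-offs, and the fitted distribution reproduces both observed proportions exactly.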
References
- U.S. Census Bureau, Memorandum on statistical techniques used in 2001 income survey (PDF). See Equation 10 on p. 24.
- Stults, Brian J, Deriving median household income. Gives a derivation of the equations for Pareto interpolation.