Trimean

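In statistics, the trimean (TM), or Tukey's trimean, is a measure of a probability distribution's location defined as a weighted average of the distribution's median and its two quartiles:

$$TM = \frac{Q_1 + 2Q_2 + Q_3}{4}$$

where $Q_2$ is the median and $Q_1$ and $Q_3$ are the first and third quartiles. This is equivalent to the average of the median and the midhinge:

$$TM = \frac{1}{2}\left(Q_2 + \frac{Q_1 + Q_3}{2}\right)$$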
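
As an illustration, the definition can be sketched in Python using only the standard library. The function names `trimean` and `midhinge` are chosen here for the example and do not come from any particular package. Quartile conventions differ between sources (Tukey's own hinges are computed slightly differently), so the `method="inclusive"` interpolation used below is an assumption, and other conventions may give slightly different values on small samples.

```python
from statistics import median, quantiles


def midhinge(data):
    """Average of the first and third quartiles."""
    # quantiles(..., n=4) returns the three quartile cut points [Q1, Q2, Q3].
    # method="inclusive" is an assumed convention; others exist.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return (q1 + q3) / 2


def trimean(data):
    """Weighted average of the quartiles: (Q1 + 2*Q2 + Q3) / 4."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
tm = trimean(data)                                # quartiles are 3, 5, 7, so tm is 5.0
tm_alt = (median(data) + midhinge(data)) / 2      # same value via the midhinge form
```

Both forms agree on this sample because the second quartile returned by `quantiles` coincides with the median.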

The foundations of the trimean were part of Arthur Bowley's teachings, and it was later popularized by statistician John Tukey in his 1977 book,[1] which gave its name to a set of techniques called exploratory data analysis.

Like the median and the midhinge, but unlike the sample mean, the trimean is a statistically resistant L-estimator; its breakdown point is 25%. This beneficial property has been described as follows:

An advantage of the trimean as a measure of the center (of a distribution) is that it combines the median's emphasis on center values with the midhinge's attention to the extremes.

— Herbert F. Weisberg, Central Tendency and Variability[2]
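
The resistance property and the 25% breakdown point can be illustrated with a small sketch. The data values are invented for the example, and the quartile convention (`method="inclusive"`) is an assumption. Replacing two of nine points (about 22%) with a gross outlier leaves the trimean unchanged, while replacing three of nine (about 33%) moves it substantially. The sample mean is already distorted by a single outlier.

```python
from statistics import fmean, quantiles


def trimean(data):
    """Weighted average of the quartiles: (Q1 + 2*Q2 + Q3) / 4."""
    # Assumed quartile convention; see the standard library docs for others.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
one_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]           # 1 of 9 points corrupted
two_outliers = [1, 2, 3, 4, 5, 6, 7, 900, 900]        # about 22% corrupted
three_outliers = [1, 2, 3, 4, 5, 6, 900, 900, 900]    # about 33% corrupted

for name, sample in [
    ("clean", clean),
    ("one outlier", one_outlier),
    ("two outliers", two_outliers),
    ("three outliers", three_outliers),
]:
    print(f"{name:15s} mean={fmean(sample):8.2f} trimean={trimean(sample):8.2f}")
```

With this convention the third quartile of a nine-point sample is the seventh sorted value, so the trimean stays at 5.0 until that position is itself contaminated.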

Efficiency

Despite its simplicity, the trimean is a remarkably efficient estimator of the population mean. More precisely, for a large data set (over 100 points[3]) from a symmetric population, the average of the 18th, 50th, and 82nd percentiles is the most efficient 3-point L-estimator, with 88% efficiency.[4] For context, the best single-point L-estimator is the median, with an efficiency of 64% or better (for all n), while the most efficient two-point estimate (again for a large data set of over 100 points from a symmetric population) is the 27% midsummary (the mean of the 27th and 73rd percentiles), with an efficiency of about 81%. Using quartiles, these optimal estimators can be approximated by the midhinge and the trimean. Using further points yields higher efficiency, though it is notable that only three points are needed for very high efficiency.
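
The efficiency figures above can be checked approximately by simulation. The following is a minimal Monte Carlo sketch, not a reproduction of the cited calculations: it assumes a standard normal population, a sample size of 101, 2000 trials, and the `inclusive` quartile convention, and it measures efficiency as the variance of the sample mean divided by the variance of each estimator. The function name `relative_efficiency` and its parameters are chosen for this example.

```python
import random
from statistics import fmean, median, quantiles, variance


def trimean(data):
    """Weighted average of the quartiles: (Q1 + 2*Q2 + Q3) / 4."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


def midhinge(data):
    """Average of the first and third quartiles."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return (q1 + q3) / 2


def relative_efficiency(n=101, trials=2000, seed=0):
    """Estimate each estimator's efficiency relative to the sample mean."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    estimators = {
        "mean": fmean,
        "median": median,
        "midhinge": midhinge,
        "trimean": trimean,
    }
    estimates = {name: [] for name in estimators}
    for _ in range(trials):
        # Draw one sample from the assumed standard normal population.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for name, estimator in estimators.items():
            estimates[name].append(estimator(sample))
    # Efficiency: sampling variance of the mean over that of the estimator.
    var_mean = variance(estimates["mean"])
    return {name: var_mean / variance(vals) for name, vals in estimates.items()}


efficiency = relative_efficiency()
for name, value in efficiency.items():
    print(f"{name:9s} {value:.2f}")
```

Because the results are simulated, they fluctuate by a few percentage points with the seed and the number of trials. The median should land noticeably below the midhinge and the trimean, in line with the ordering described above.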

References

  1. Tukey, John Wilder (1977). Exploratory Data Analysis. Addison-Wesley. ISBN 0-201-07616-0.
  2. Weisberg, H. F. (1992). Central Tendency and Variability. Sage University. ISBN 0-8039-4007-6. p. 39.
  3. Evans 1955, Appendix G: Inefficient statistics, pp. 902–904.
  4. Mosteller, Frederick (December 1946). "On Some Useful 'Inefficient' Statistics". Annals of Mathematical Statistics. 17 (4): 377–408 (Tables I and II). doi:10.1214/aoms/1177730881. Retrieved 18 May 2024.