V-statistics are a class of statistics named for Richard von Mises, who developed their asymptotic distribution theory in a fundamental paper in 1947.[1] V-statistics are closely related to U-statistics[2][3] (U for "unbiased"), introduced by Wassily Hoeffding in 1948.[4] A V-statistic is a statistical function (of a sample) defined by a particular statistical functional of a probability distribution.
Statistics that can be represented as functionals $T(F_n)$ of the empirical distribution function $F_n$ are called statistical functionals.[5] Differentiability of the functional T plays a key role in the von Mises approach; thus von Mises considers differentiable statistical functionals.[1]
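As a minimal illustration of a statistical functional, the sketch below evaluates the mean functional $T(F) = \int x \, dF(x)$ at the empirical distribution function $F_n$, which puts mass 1/n on each observation. The function names `empirical_cdf` and `plug_in_mean` and the data are hypothetical, chosen only for this example.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return F_n, the empirical distribution function of the sample."""
    ordered = sorted(sample)
    n = len(ordered)

    def F_n(t):
        # Fraction of observations less than or equal to t.
        return bisect_right(ordered, t) / n

    return F_n


def plug_in_mean(sample):
    """Evaluate T(F) = integral of x dF(x) at F = F_n (mass 1/n per point)."""
    return sum(sample) / len(sample)


data = [2.0, 4.0, 4.0, 7.0]  # illustrative sample
F_n = empirical_cdf(data)
print(F_n(4.0), plug_in_mean(data))
```

Here the statistic $T(F_n)$ is simply the sample mean; other functionals are evaluated the same way, by substituting $F_n$ for $F$.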
Suppose x1, ..., xn is a sample. In typical applications the statistical function has a representation as the V-statistic

$$V_{mn} = \frac{1}{n^m} \sum_{i_1=1}^{n} \cdots \sum_{i_m=1}^{n} h(x_{i_1}, x_{i_2}, \dots, x_{i_m}),$$

where h is a symmetric kernel function. Serfling[6] discusses how to find the kernel in practice. Vmn is called a V-statistic of degree m.
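A V-statistic of degree m averages the kernel over all n^m index tuples, with repeated indices allowed. The following sketch computes it by brute force; the function name `v_statistic` and the degree-3 product kernel `prod3` are assumptions made for illustration, not part of the source.

```python
from itertools import product


def v_statistic(sample, kernel, degree):
    """Average `kernel` over all n**degree tuples drawn with repetition."""
    n = len(sample)
    total = sum(kernel(*args) for args in product(sample, repeat=degree))
    return total / n ** degree


def prod3(x, y, z):
    """Hypothetical symmetric kernel of degree 3: h(x, y, z) = x*y*z."""
    return x * y * z


sample = [1.0, 2.0, 3.0]
# With the product kernel the triple sum factorizes into (sum of x_i)**3,
# so the V-statistic equals the cube of the sample mean.
print(v_statistic(sample, prod3, 3))
```

The brute-force loop costs O(n^m) kernel evaluations, so it is only practical for small samples or low degree.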
A symmetric kernel of degree 2 is a function h(x, y) such that h(x, y) = h(y, x) for all x and y in the domain of h. For samples x1, ..., xn, the corresponding V-statistic is defined as

$$V_{2,n} = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} h(x_i, x_j).$$
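To make the degree-2 case concrete, the sketch below uses the symmetric kernel h(x, y) = (x − y)²/2, chosen here as an illustration. With this kernel the V-statistic, which includes the diagonal terms i = j, equals the second central moment (the variance with divisor n), while the U-statistic, which averages over distinct index pairs only, equals the unbiased sample variance. The helper names `v2` and `u2` are hypothetical.

```python
from itertools import permutations, product
from statistics import pvariance, variance


def half_sq_diff(x, y):
    """Symmetric kernel h(x, y) = (x - y)**2 / 2."""
    return 0.5 * (x - y) ** 2


def v2(sample, h):
    """Degree-2 V-statistic: average of h over all n**2 ordered pairs."""
    n = len(sample)
    return sum(h(x, y) for x, y in product(sample, repeat=2)) / n ** 2


def u2(sample, h):
    """Degree-2 U-statistic: average of h over ordered pairs with i != j."""
    n = len(sample)
    return sum(h(x, y) for x, y in permutations(sample, 2)) / (n * (n - 1))


data = [1.0, 2.0, 4.0, 7.0]  # illustrative sample
print(v2(data, half_sq_diff), pvariance(data))  # both use divisor n
print(u2(data, half_sq_diff), variance(data))   # both use divisor n - 1
```

The two statistics differ only by the factor (n − 1)/n here, which is why their asymptotic behaviour is so closely linked.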
In examples 1–3, the asymptotic distribution of the statistic is different: in (1) it is normal, in (2) it is chi-squared, and in (3) it is a weighted sum of chi-squared variables.
Von Mises' approach is a unifying theory that covers all of the cases above.[1] Informally, the type of asymptotic distribution of a statistical function depends on the order of "degeneracy," which is determined by which term is the first non-vanishing term in the Taylor expansion of the functional T. In case it is the linear term, the limit distribution is normal; otherwise higher order types of distributions arise (under suitable conditions such that a central limit theorem holds).
There is a hierarchy of cases parallel to the asymptotic theory of U-statistics.[7] Let A(m) be the property defined by:
Case m = 1 (Non-degenerate kernel):
If A(1) is true, the statistic is a sample mean and the Central Limit Theorem implies that T(Fn) is asymptotically normal.
In the variance example (4), m2 is asymptotically normal with mean $\sigma^2$ and variance $(\mu_4 - \sigma^4)/n$, where $\mu_4 = E(X - E(X))^4$.
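The asymptotic mean and variance of m2 can be checked by simulation. The sketch below assumes standard normal data, for which σ² = 1 and μ4 = 3, so the approximate variance of m2 is (3 − 1)/n = 2/n. The sample size, replication count, seed, and function names are arbitrary choices made for this example.

```python
import random


def second_central_moment(sample):
    """m2 = (1/n) * sum of squared deviations from the sample mean."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n


def simulate_m2(n, reps, seed=0):
    """Draw `reps` independent N(0, 1) samples of size n and return each m2."""
    rng = random.Random(seed)
    return [
        second_central_moment([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]


n, reps = 100, 2000
draws = simulate_m2(n, reps)
mc_mean = sum(draws) / reps
mc_var = sum((d - mc_mean) ** 2 for d in draws) / (reps - 1)
theory_var = (3.0 - 1.0) / n  # (mu_4 - sigma**4) / n for N(0, 1)

print("Monte Carlo mean of m2:", mc_mean)
print("Monte Carlo variance of m2:", mc_var, "theory:", theory_var)
```

The Monte Carlo figures should be close to 1 and 2/n respectively, up to simulation noise and the O(1/n) bias of m2.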
Case m = 2 (Degenerate kernel):
Suppose A(2) is true, and $E[h^2(X_1, X_2)] < \infty$, $E|h(X_1, X_1)| < \infty$, and $E[h(x, X_1)] \equiv 0$. Then nV2,n converges in distribution to a weighted sum of independent chi-squared variables:

$$n V_{2,n} \xrightarrow{d} \sum_{k=1}^{\infty} \lambda_k Z_k^2,$$

where $Z_k$ are independent standard normal variables and $\lambda_k$ are constants that depend on the distribution F and the functional T. In this case the asymptotic distribution is called a quadratic form of centered Gaussian random variables. The statistic V2,n is called a degenerate kernel V-statistic. The V-statistic associated with the Cramér–von Mises functional[1] (Example 3) is an example of a degenerate kernel V-statistic.[8]
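A simple degenerate case can be simulated directly. The sketch below assumes the kernel h(x, y) = xy and standard normal data, both chosen for illustration and not taken from the source. Then E[h(x, X1)] = x·E[X1] = 0 for every x, and n·V2,n = (x1 + ... + xn)²/n, the square of a standard normal variable, so the limit is a single chi-squared term with one degree of freedom. All function names are hypothetical.

```python
import random
from itertools import product


def v2_product_kernel_bruteforce(sample):
    """Degree-2 V-statistic with h(x, y) = x*y, computed from the definition."""
    n = len(sample)
    return sum(x * y for x, y in product(sample, repeat=2)) / n ** 2


def n_times_v2(sample):
    """n * V_{2,n} for h(x, y) = x*y, using sum_i sum_j x_i x_j = (sum_i x_i)**2."""
    return sum(sample) ** 2 / len(sample)


def simulate(n, reps, seed=0):
    """Return `reps` simulated values of n * V_{2,n} for N(0, 1) samples."""
    rng = random.Random(seed)
    return [n_times_v2([rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(reps)]


values = simulate(n=50, reps=4000)
mc_mean = sum(values) / len(values)
# 3.841 is the 0.95 quantile of the chi-squared distribution with 1 d.o.f.
frac_below = sum(v <= 3.841 for v in values) / len(values)

print("mean of n*V:", mc_mean)                 # chi-squared(1) has mean 1
print("P(n*V <= 3.841) estimate:", frac_below)  # should be near 0.95
```

Note the scaling: the statistic is multiplied by n rather than by the square root of n, which is the hallmark of the degenerate case.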