Matrix t
Notation: {\displaystyle {\rm {T}}_{n,p}(\nu ,\mathbf {M} ,{\boldsymbol {\Sigma }},{\boldsymbol {\Omega }})}
Parameters:
{\displaystyle \mathbf {M} } location (real {\displaystyle n\times p} matrix)
{\displaystyle {\boldsymbol {\Omega }}} scale (positive-definite real {\displaystyle p\times p} matrix)
{\displaystyle {\boldsymbol {\Sigma }}} scale (positive-definite real {\displaystyle n\times n} matrix)
{\displaystyle \nu >0} degrees of freedom (real)
Support: {\displaystyle \mathbf {X} \in \mathbb {R} ^{n\times p}}
PDF: {\displaystyle {\frac {\Gamma _{p}\left({\frac {\nu +n+p-1}{2}}\right)}{(\pi )^{\frac {np}{2}}\Gamma _{p}\left({\frac {\nu +p-1}{2}}\right)}}|{\boldsymbol {\Omega }}|^{-{\frac {n}{2}}}|{\boldsymbol {\Sigma }}|^{-{\frac {p}{2}}}\times \left|\mathbf {I} _{n}+{\boldsymbol {\Sigma }}^{-1}(\mathbf {X} -\mathbf {M} ){\boldsymbol {\Omega }}^{-1}(\mathbf {X} -\mathbf {M} )^{\rm {T}}\right|^{-{\frac {\nu +n+p-1}{2}}}}
CDF: No analytic expression
Mean: {\displaystyle \mathbf {M} } if {\displaystyle \nu >1}, else undefined
Mode: {\displaystyle \mathbf {M} }
Variance: {\displaystyle \mathrm {cov} (\mathrm {vec} (\mathbf {X} ))={\frac {{\boldsymbol {\Sigma }}\otimes {\boldsymbol {\Omega }}}{\nu -2}}} if {\displaystyle \nu >2}, else undefined
CF: see below
In statistics, the matrix t-distribution (or matrix variate t-distribution) is the generalization of the multivariate t-distribution from vectors to matrices.[1][2]
The matrix t-distribution relates to the multivariate t-distribution in the same way that the matrix normal distribution relates to the multivariate normal distribution: if the matrix has only one row, or only one column, the distribution is equivalent to the corresponding (vector) multivariate distribution. The matrix t-distribution is the compound distribution that results from an infinite mixture of a matrix normal distribution with an inverse Wishart distribution placed over either of its covariance matrices,[1] and the multivariate t-distribution can be generated in a similar way.[2]
In a Bayesian analysis of a multivariate linear regression model based on the matrix normal distribution, the matrix t-distribution is the posterior predictive distribution.[3]
Definition
For a matrix t-distribution, the probability density function at the point {\displaystyle \mathbf {X} } of an {\displaystyle n\times p} space is
{\displaystyle f(\mathbf {X} ;\nu ,\mathbf {M} ,{\boldsymbol {\Sigma }},{\boldsymbol {\Omega }})=K\times \left|\mathbf {I} _{n}+{\boldsymbol {\Sigma }}^{-1}(\mathbf {X} -\mathbf {M} ){\boldsymbol {\Omega }}^{-1}(\mathbf {X} -\mathbf {M} )^{\rm {T}}\right|^{-{\frac {\nu +n+p-1}{2}}},}
where the normalizing constant K is given by
{\displaystyle K={\frac {\Gamma _{p}\left({\frac {\nu +n+p-1}{2}}\right)}{(\pi )^{\frac {np}{2}}\Gamma _{p}\left({\frac {\nu +p-1}{2}}\right)}}|{\boldsymbol {\Omega }}|^{-{\frac {n}{2}}}|{\boldsymbol {\Sigma }}|^{-{\frac {p}{2}}}.}
Here {\displaystyle \Gamma _{p}} is the multivariate gamma function.
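The density above can be evaluated numerically on the log scale, which avoids overflow in the gamma functions and determinants. The following is a minimal sketch, not a reference implementation: the function names `multigammaln` and `matrix_t_logpdf` are illustrative choices, and the multivariate gamma function is computed from its standard product form {\displaystyle \Gamma _{p}(a)=\pi ^{p(p-1)/4}\prod _{j=1}^{p}\Gamma (a+(1-j)/2)}.

```python
import math

import numpy as np


def multigammaln(a, p):
    """Log of the multivariate gamma function Gamma_p(a) (product form)."""
    return (p * (p - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(a - j / 2.0) for j in range(p)
    )


def matrix_t_logpdf(X, nu, M, Sigma, Omega):
    """Log-density of T_{n,p}(nu, M, Sigma, Omega) at X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    M = np.asarray(M, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    Omega = np.asarray(Omega, dtype=float)
    n, p = X.shape
    D = X - M
    # log K, term by term from the normalizing constant
    log_K = (
        multigammaln((nu + n + p - 1) / 2.0, p)
        - (n * p / 2.0) * math.log(math.pi)
        - multigammaln((nu + p - 1) / 2.0, p)
        - (n / 2.0) * np.linalg.slogdet(Omega)[1]
        - (p / 2.0) * np.linalg.slogdet(Sigma)[1]
    )
    # I_n + Sigma^{-1} (X - M) Omega^{-1} (X - M)^T, via linear solves
    inner = np.eye(n) + np.linalg.solve(Sigma, D) @ np.linalg.solve(Omega, D.T)
    return log_K - ((nu + n + p - 1) / 2.0) * np.linalg.slogdet(inner)[1]


# Example call with a 2-by-3 argument; all values are arbitrary illustrations.
Sigma_ex = np.array([[2.0, 0.5], [0.5, 1.0]])
Omega_ex = np.array([[1.0, 0.3, 0.0], [0.3, 2.0, 0.4], [0.0, 0.4, 1.5]])
X_ex = np.array([[0.1, -0.4, 0.7], [1.2, 0.0, -0.3]])
print(matrix_t_logpdf(X_ex, 4.0, np.zeros((2, 3)), Sigma_ex, Omega_ex))
```

For n = p = 1 the function reduces to the log-density of a univariate Student t-distribution with ν degrees of freedom, location M and squared scale ΣΩ/ν, which gives a convenient sanity check.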
Properties
If {\displaystyle \mathbf {X} \sim {\mathcal {T}}_{n\times p}(\nu ,\mathbf {M} ,\mathbf {\Sigma } ,\mathbf {\Omega } )}, then the following properties hold:[2]
Expected values
If {\displaystyle \nu >1}, the mean (expected value) is
{\displaystyle E[\mathbf {X} ]=\mathbf {M} }
and, if {\displaystyle \nu >2}, the second-order expectations are
{\displaystyle E[(\mathbf {X} -\mathbf {M} )(\mathbf {X} -\mathbf {M} )^{T}]={\frac {\mathbf {\Sigma } \operatorname {tr} (\mathbf {\Omega } )}{\nu -2}}}
{\displaystyle E[(\mathbf {X} -\mathbf {M} )^{T}(\mathbf {X} -\mathbf {M} )]={\frac {\mathbf {\Omega } \operatorname {tr} (\mathbf {\Sigma } )}{\nu -2}}}
where {\displaystyle \operatorname {tr} } denotes the trace.
More generally, for appropriately dimensioned matrices A, B, C:
{\displaystyle {\begin{aligned}E[(\mathbf {X} -\mathbf {M} )\mathbf {A} (\mathbf {X} -\mathbf {M} )^{T}]&={\frac {\mathbf {\Sigma } \operatorname {tr} (\mathbf {A} ^{T}\mathbf {\Omega } )}{\nu -2}}\\E[(\mathbf {X} -\mathbf {M} )^{T}\mathbf {B} (\mathbf {X} -\mathbf {M} )]&={\frac {\mathbf {\Omega } \operatorname {tr} (\mathbf {B} ^{T}\mathbf {\Sigma } )}{\nu -2}}\\E[(\mathbf {X} -\mathbf {M} )\mathbf {C} (\mathbf {X} -\mathbf {M} )]&={\frac {\mathbf {\Sigma } \mathbf {C} ^{T}\mathbf {\Omega } }{\nu -2}}\end{aligned}}}
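The second-order expectations can be checked by simulation. The sketch below rests on an assumption that this article does not spell out: it draws S from a Wishart distribution on n×n matrices with ν + n − 1 degrees of freedom and scale Σ⁻¹, then sets X = M + RZ, where RRᵀ = S⁻¹ and the rows of Z are independent N(0, Ω) vectors. Integrating S out of this mixture yields a density with the same kernel and exponent as the one in the Definition section, but the construction should be read as one plausible instance of the compound representation, not as the only one. The name `sample_matrix_t` is hypothetical, and this simple Wishart construction requires an integer ν.

```python
import numpy as np


def sample_matrix_t(nu, M, Sigma, Omega, size, rng):
    """Draw `size` matrices from T_{n,p}(nu, M, Sigma, Omega); integer nu only."""
    n, p = M.shape
    k = nu + n - 1  # assumed Wishart degrees of freedom (see lead-in)
    L_prec = np.linalg.cholesky(np.linalg.inv(Sigma))
    # S = G G^T with k columns g ~ N(0, Sigma^{-1}), so S ~ Wishart_n(k, Sigma^{-1})
    G = L_prec @ rng.standard_normal((size, n, k))
    S = G @ np.swapaxes(G, -1, -2)
    # R = L_S^{-T} satisfies R R^T = S^{-1}, where S = L_S L_S^T
    R = np.swapaxes(np.linalg.inv(np.linalg.cholesky(S)), -1, -2)
    # Rows of Z are i.i.d. N(0, Omega)
    Z = rng.standard_normal((size, n, p)) @ np.linalg.cholesky(Omega).T
    return M + R @ Z


rng = np.random.default_rng(0)
nu = 10
M = np.zeros((2, 3))
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Omega = np.array([[1.0, 0.3, 0.0], [0.3, 2.0, 0.4], [0.0, 0.4, 1.5]])

D = sample_matrix_t(nu, M, Sigma, Omega, size=50_000, rng=rng) - M
row_moment = (D @ np.swapaxes(D, -1, -2)).mean(axis=0)  # estimates Sigma tr(Omega)/(nu-2)
col_moment = (np.swapaxes(D, -1, -2) @ D).mean(axis=0)  # estimates Omega tr(Sigma)/(nu-2)

print(row_moment)
print(Sigma * np.trace(Omega) / (nu - 2))
```

A fairly large ν is used here because Monte Carlo estimates of second moments converge slowly when the fourth moments are infinite or nearly so.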
Transpose transform:
{\displaystyle \mathbf {X} ^{T}\sim {\mathcal {T}}_{p\times n}(\nu ,\mathbf {M} ^{T},\mathbf {\Omega } ,\mathbf {\Sigma } )}
Linear transform: let A (r-by-n) be of full rank r ≤ n and B (p-by-s) be of full rank s ≤ p; then:
{\displaystyle \mathbf {AXB} \sim {\mathcal {T}}_{r\times s}(\nu ,\mathbf {AMB} ,\mathbf {A\Sigma A} ^{T},\mathbf {B} ^{T}\mathbf {\Omega B} )}
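For the special case of square, nonsingular A and B, the linear-transform property can be illustrated directly on the density: the determinant term that carries the dependence on X is unchanged when X, M, Σ and Ω are replaced by AXB, AMB, AΣAᵀ and BᵀΩB. The sketch below checks this numerically. The helper name `kernel_det` and all matrix values are illustrative assumptions, and the rectangular full-rank case stated above is not covered by this check.

```python
import numpy as np


def kernel_det(X, M, Sigma, Omega):
    """det(I_n + Sigma^{-1} (X - M) Omega^{-1} (X - M)^T)."""
    D = X - M
    n = D.shape[0]
    Q = np.linalg.solve(Sigma, D) @ np.linalg.solve(Omega, D.T)
    return np.linalg.det(np.eye(n) + Q)


rng = np.random.default_rng(1)
n, p = 3, 2
X = rng.standard_normal((n, p))
M = rng.standard_normal((n, p))
Sigma = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]])
Omega = np.array([[1.0, 0.3], [0.3, 2.0]])
A = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 3.0]])  # nonsingular
B = np.array([[1.0, 2.0], [0.5, 3.0]])  # nonsingular

before = kernel_det(X, M, Sigma, Omega)
after = kernel_det(A @ X @ B, A @ M @ B, A @ Sigma @ A.T, B.T @ Omega @ B)
print(before, after)  # the two values should agree up to rounding
```

The remaining factors of the density change only by the constants |det A|⁻ᵖ and |det B|⁻ⁿ, which is exactly the Jacobian of the map from X to AXB.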
The characteristic function and various other properties can be derived from the re-parameterised formulation (see below).
Re-parameterized matrix t-distribution
Re-parameterized matrix t
Notation: {\displaystyle {\rm {T}}_{n,p}(\alpha ,\beta ,\mathbf {M} ,{\boldsymbol {\Sigma }},{\boldsymbol {\Omega }})}
Parameters:
{\displaystyle \mathbf {M} } location (real {\displaystyle n\times p} matrix)
{\displaystyle {\boldsymbol {\Omega }}} scale (positive-definite real {\displaystyle p\times p} matrix)
{\displaystyle {\boldsymbol {\Sigma }}} scale (positive-definite real {\displaystyle n\times n} matrix)
{\displaystyle \alpha >(p-1)/2} shape parameter
{\displaystyle \beta >0} scale parameter
Support: {\displaystyle \mathbf {X} \in \mathbb {R} ^{n\times p}}
PDF: {\displaystyle {\frac {\Gamma _{p}(\alpha +n/2)}{(2\pi /\beta )^{\frac {np}{2}}\Gamma _{p}(\alpha )}}|{\boldsymbol {\Omega }}|^{-{\frac {n}{2}}}|{\boldsymbol {\Sigma }}|^{-{\frac {p}{2}}}\times \left|\mathbf {I} _{n}+{\frac {\beta }{2}}{\boldsymbol {\Sigma }}^{-1}(\mathbf {X} -\mathbf {M} ){\boldsymbol {\Omega }}^{-1}(\mathbf {X} -\mathbf {M} )^{\rm {T}}\right|^{-(\alpha +n/2)}}
CDF: No analytic expression
Mean: {\displaystyle \mathbf {M} } if {\displaystyle \alpha >p/2}, else undefined
Variance: {\displaystyle {\frac {2({\boldsymbol {\Sigma }}\otimes {\boldsymbol {\Omega }})}{\beta (2\alpha -p-1)}}} if {\displaystyle \alpha >(p+1)/2}, else undefined
CF: see below
An alternative parameterisation of the matrix t-distribution uses two parameters {\displaystyle \alpha } and {\displaystyle \beta } in place of {\displaystyle \nu }.[3]
This formulation reduces to the standard matrix t-distribution with
{\displaystyle \beta =2,\alpha ={\frac {\nu +p-1}{2}}.}
This formulation of the matrix t-distribution can be derived as the compound distribution that results from an infinite mixture of a matrix normal distribution with an inverse multivariate gamma distribution placed over either of its covariance matrices.
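The reduction to the standard form with β = 2 and α = (ν + p − 1)/2 can be confirmed numerically by coding both densities from their stated formulas and comparing them at an arbitrary point. This is a minimal sketch; the function names `logpdf_nu`, `logpdf_alpha_beta` and the helper `_logdets` are hypothetical, and the test matrices are arbitrary.

```python
import math

import numpy as np


def multigammaln(a, p):
    """Log of the multivariate gamma function Gamma_p(a)."""
    return (p * (p - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(a - j / 2.0) for j in range(p)
    )


def _logdets(X, M, Sigma, Omega, c):
    """log|Omega|, log|Sigma|, log|I_n + c Sigma^{-1}(X-M) Omega^{-1}(X-M)^T|."""
    D = X - M
    Q = np.linalg.solve(Sigma, D) @ np.linalg.solve(Omega, D.T)
    return (
        np.linalg.slogdet(Omega)[1],
        np.linalg.slogdet(Sigma)[1],
        np.linalg.slogdet(np.eye(D.shape[0]) + c * Q)[1],
    )


def logpdf_nu(X, nu, M, Sigma, Omega):
    """Standard parameterisation T_{n,p}(nu, M, Sigma, Omega)."""
    n, p = X.shape
    ld_o, ld_s, ld_k = _logdets(X, M, Sigma, Omega, 1.0)
    return (
        multigammaln((nu + n + p - 1) / 2.0, p)
        - multigammaln((nu + p - 1) / 2.0, p)
        - (n * p / 2.0) * math.log(math.pi)
        - (n / 2.0) * ld_o
        - (p / 2.0) * ld_s
        - ((nu + n + p - 1) / 2.0) * ld_k
    )


def logpdf_alpha_beta(X, alpha, beta, M, Sigma, Omega):
    """Re-parameterised form T_{n,p}(alpha, beta, M, Sigma, Omega)."""
    n, p = X.shape
    ld_o, ld_s, ld_k = _logdets(X, M, Sigma, Omega, beta / 2.0)
    return (
        multigammaln(alpha + n / 2.0, p)
        - multigammaln(alpha, p)
        - (n * p / 2.0) * math.log(2.0 * math.pi / beta)
        - (n / 2.0) * ld_o
        - (p / 2.0) * ld_s
        - (alpha + n / 2.0) * ld_k
    )


rng = np.random.default_rng(3)
n, p, nu = 3, 2, 3.5
X = rng.standard_normal((n, p))
M = rng.standard_normal((n, p))
Sigma = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]])
Omega = np.array([[1.0, 0.3], [0.3, 2.0]])

standard = logpdf_nu(X, nu, M, Sigma, Omega)
reparam = logpdf_alpha_beta(X, (nu + p - 1) / 2.0, 2.0, M, Sigma, Omega)
print(standard, reparam)  # should agree up to rounding
```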
Properties
If {\displaystyle \mathbf {X} \sim {\rm {T}}_{n,p}(\alpha ,\beta ,\mathbf {M} ,{\boldsymbol {\Sigma }},{\boldsymbol {\Omega }})} then[2][3]
{\displaystyle \mathbf {X} ^{\rm {T}}\sim {\rm {T}}_{p,n}(\alpha ,\beta ,\mathbf {M} ^{\rm {T}},{\boldsymbol {\Omega }},{\boldsymbol {\Sigma }}).}
The property above comes from Sylvester's determinant theorem:
{\displaystyle \det \left(\mathbf {I} _{n}+{\frac {\beta }{2}}{\boldsymbol {\Sigma }}^{-1}(\mathbf {X} -\mathbf {M} ){\boldsymbol {\Omega }}^{-1}(\mathbf {X} -\mathbf {M} )^{\rm {T}}\right)=\det \left(\mathbf {I} _{p}+{\frac {\beta }{2}}{\boldsymbol {\Omega }}^{-1}(\mathbf {X} ^{\rm {T}}-\mathbf {M} ^{\rm {T}}){\boldsymbol {\Sigma }}^{-1}(\mathbf {X} ^{\rm {T}}-\mathbf {M} ^{\rm {T}})^{\rm {T}}\right).}
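The determinant identity is easy to confirm numerically. In the sketch below, `D` stands in for X − M, and the dimensions, the value of β and the way the positive-definite matrices are generated are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, beta = 4, 2, 1.7

D = rng.standard_normal((n, p))  # plays the role of X - M
Wn = rng.standard_normal((n, n))
Wp = rng.standard_normal((p, p))
Sigma = Wn @ Wn.T + np.eye(n)  # positive-definite n-by-n
Omega = Wp @ Wp.T + np.eye(p)  # positive-definite p-by-p

U = np.linalg.solve(Sigma, D)    # Sigma^{-1} (X - M), shape n-by-p
V = np.linalg.solve(Omega, D.T)  # Omega^{-1} (X - M)^T, shape p-by-n

lhs = np.linalg.det(np.eye(n) + (beta / 2.0) * U @ V)  # n-by-n determinant
rhs = np.linalg.det(np.eye(p) + (beta / 2.0) * V @ U)  # p-by-p determinant
print(lhs, rhs)  # should agree up to rounding
```

In practice the identity also means that the density can be evaluated using whichever of the two determinants is smaller, which is cheaper when n and p differ greatly.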
If {\displaystyle \mathbf {X} \sim {\rm {T}}_{n,p}(\alpha ,\beta ,\mathbf {M} ,{\boldsymbol {\Sigma }},{\boldsymbol {\Omega }})} and {\displaystyle \mathbf {A} (n\times n)} and {\displaystyle \mathbf {B} (p\times p)} are nonsingular matrices, then[2][3]
{\displaystyle \mathbf {AXB} \sim {\rm {T}}_{n,p}(\alpha ,\beta ,\mathbf {AMB} ,\mathbf {A} {\boldsymbol {\Sigma }}\mathbf {A} ^{\rm {T}},\mathbf {B} ^{\rm {T}}{\boldsymbol {\Omega }}\mathbf {B} ).}
The characteristic function is[3]
{\displaystyle \phi _{T}(\mathbf {Z} )={\frac {\exp({\rm {tr}}(i\mathbf {Z} '\mathbf {M} ))|{\boldsymbol {\Omega }}|^{\alpha }}{\Gamma _{p}(\alpha )(2\beta )^{\alpha p}}}|\mathbf {Z} '{\boldsymbol {\Sigma }}\mathbf {Z} |^{\alpha }B_{\alpha }\left({\frac {1}{2\beta }}\mathbf {Z} '{\boldsymbol {\Sigma }}\mathbf {Z} {\boldsymbol {\Omega }}\right),}
where
{\displaystyle B_{\delta }(\mathbf {WZ} )=|\mathbf {W} |^{-\delta }\int _{\mathbf {S} >0}\exp \left({\rm {tr}}(-\mathbf {SW} -\mathbf {S^{-1}Z} )\right)|\mathbf {S} |^{-\delta -{\frac {1}{2}}(p+1)}d\mathbf {S} ,}
and {\displaystyle B_{\delta }} is the type-two Bessel function of Herz of a matrix argument.
Notes
1. Zhu, Shenghuo; Yu, Kai; Gong, Yihong (2007). "Predictive Matrix-Variate t Models". In J. C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, NIPS '07: Advances in Neural Information Processing Systems 20, pages 1721–1728. MIT Press, Cambridge, MA, 2008. The notation is changed a bit in this article for consistency with the matrix normal distribution article.
2. Gupta, Arjun K.; Nagar, Daya K. (1999). Matrix Variate Distributions. CRC Press. Chapter 4.
3. Iranmanesh, Anis; Arashi, M.; Tabatabaey, S. M. M. (2010). "On Conditional Applications of Matrix Variate Normal Distribution". Iranian Journal of Mathematical Sciences and Informatics, 5:2, pp. 33–43.