Concept in probability theory

Hypoexponential
Parameters: \lambda_1, \dots, \lambda_k > 0 rates (real)
Support: x \in [0;\infty)
PDF: expressed as a phase-type distribution, -\boldsymbol{\alpha}e^{x\Theta}\Theta\boldsymbol{1}; it has no other simple form (see article for details)
CDF: expressed as a phase-type distribution, 1-\boldsymbol{\alpha}e^{x\Theta}\boldsymbol{1}
Mean: \sum_{i=1}^{k}1/\lambda_i
Median: general closed form does not exist[1]
Mode: (k-1)/\lambda if \lambda_i = \lambda for all i
Variance: \sum_{i=1}^{k}1/\lambda_i^2
Skewness: 2\left(\sum_{i=1}^{k}1/\lambda_i^3\right)/\left(\sum_{i=1}^{k}1/\lambda_i^2\right)^{3/2}
Excess kurtosis: no simple closed form
MGF: \boldsymbol{\alpha}(tI+\Theta)^{-1}\Theta\mathbf{1}
CF: \boldsymbol{\alpha}(itI+\Theta)^{-1}\Theta\mathbf{1}
In probability theory the hypoexponential distribution or the generalized Erlang distribution is a continuous distribution that has found use in the same fields as the Erlang distribution, such as queueing theory, teletraffic engineering and, more generally, stochastic processes. It is called the hypoexponential distribution because it has a coefficient of variation less than one, in contrast to the hyper-exponential distribution, which has a coefficient of variation greater than one, and the exponential distribution, which has a coefficient of variation of one.
Overview
The Erlang distribution is a series of k exponential distributions all with rate \lambda. The hypoexponential is a series of k exponential distributions each with its own rate \lambda_i, the rate of the i^{th} exponential distribution. If we have k independently distributed exponential random variables \boldsymbol{X}_i, then the random variable

\boldsymbol{X}=\sum_{i=1}^{k}\boldsymbol{X}_i

is hypoexponentially distributed. The hypoexponential has a minimum coefficient of variation of 1/\sqrt{k}.
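As a quick illustrative sketch (not from the article; the stage rates below are arbitrary example values), the defining sum of independent exponentials can be simulated directly and checked against the moment formulas:

```python
import math
import random

def hypoexp_sample(rates, rng):
    """One hypoexponential draw: a sum of independent exponential
    stages, one stage per rate."""
    return sum(rng.expovariate(lam) for lam in rates)

rates = [1.0, 2.0, 4.0]          # arbitrary example stage rates
rng = random.Random(42)
n = 200_000
samples = [hypoexp_sample(rates, rng) for _ in range(n)]

mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / n
cv = math.sqrt(var) / mean

# Theory: mean = sum 1/lambda_i = 1.75, variance = sum 1/lambda_i^2 = 1.3125
theory_mean = sum(1 / lam for lam in rates)
theory_var = sum(1 / lam ** 2 for lam in rates)
print(mean, cv)  # cv should fall between 1/sqrt(k) and 1
```

The sample coefficient of variation lands between the Erlang minimum 1/\sqrt{k} and the exponential value 1, as the definition suggests.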
Relation to the phase-type distribution
As a result of the definition, it is easier to consider this distribution as a special case of the phase-type distribution.[2] The phase-type distribution is the time to absorption of a finite state Markov process. If we have a k+1 state process, where the first k states are transient and state k+1 is absorbing, then the distribution of the time from the start of the process until the absorbing state is reached is phase-type distributed. This becomes the hypoexponential if we start in the first state and move skip-free from state i to i+1 with rate \lambda_i, until state k transitions with rate \lambda_k to the absorbing state k+1. This can be written in the form of a subgenerator matrix,
\left[{\begin{matrix}-\lambda _{1}&\lambda _{1}&0&\dots &0&0\\0&-\lambda _{2}&\lambda _{2}&\ddots &0&0\\\vdots &\ddots &\ddots &\ddots &\ddots &\vdots \\0&0&\ddots &-\lambda _{k-2}&\lambda _{k-2}&0\\0&0&\dots &0&-\lambda _{k-1}&\lambda _{k-1}\\0&0&\dots &0&0&-\lambda _{k}\end{matrix}}\right]\;.
For simplicity denote the above matrix \Theta\equiv\Theta(\lambda_1,\dots,\lambda_k). If the probability of starting in each of the k states is

\boldsymbol{\alpha}=(1,0,\dots,0)

then

Hypo(\lambda_1,\dots,\lambda_k)=PH(\boldsymbol{\alpha},\Theta).
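The bidiagonal structure of \Theta makes phase-type computations elementary. As an illustrative sketch (function names and example rates are mine, not from the article), building \Theta and solving \Theta m = -\boldsymbol{1} by back-substitution recovers the mean \sum_i 1/\lambda_i as the expected time to absorption from state 1:

```python
def subgenerator(rates):
    """Subgenerator Theta for Hypo(rates): -lambda_i on the diagonal,
    lambda_i on the superdiagonal, zeros elsewhere."""
    k = len(rates)
    T = [[0.0] * k for _ in range(k)]
    for i, lam in enumerate(rates):
        T[i][i] = -lam
        if i + 1 < k:
            T[i][i + 1] = lam
    return T

def mean_absorption_times(rates):
    """Solve Theta m = -1 by back-substitution (Theta is upper
    bidiagonal); m[i] is the expected time to absorption from state i+1."""
    k = len(rates)
    m = [0.0] * k
    m[-1] = 1.0 / rates[-1]
    for i in range(k - 2, -1, -1):
        m[i] = m[i + 1] + 1.0 / rates[i]
    return m

rates = [1.0, 2.0, 4.0]      # arbitrary example rates
Theta = subgenerator(rates)
m = mean_absorption_times(rates)
print(m[0])                  # 1.75 = 1/1 + 1/2 + 1/4
```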
Two parameter case
Where the distribution has two parameters (\lambda_1 \neq \lambda_2), the explicit forms of the probability functions and the associated statistics are:[3]
CDF:

F(x)=1-{\frac {\lambda _{2}}{\lambda _{2}-\lambda _{1}}}e^{-\lambda _{1}x}-{\frac {\lambda _{1}}{\lambda _{1}-\lambda _{2}}}e^{-\lambda _{2}x}

PDF:

f(x)={\frac {\lambda _{1}\lambda _{2}}{\lambda _{1}-\lambda _{2}}}\left(e^{-x\lambda _{2}}-e^{-x\lambda _{1}}\right)

Mean:

{\frac {1}{\lambda _{1}}}+{\frac {1}{\lambda _{2}}}

Variance:

{\frac {1}{\lambda _{1}^{2}}}+{\frac {1}{\lambda _{2}^{2}}}

Coefficient of variation:

{\frac {\sqrt {\lambda _{1}^{2}+\lambda _{2}^{2}}}{\lambda _{1}+\lambda _{2}}}

The coefficient of variation is always less than 1.
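These closed forms are easy to check numerically. In the sketch below (example rates are arbitrary, not from the article), a central difference of the CDF is compared against the PDF:

```python
import math

def hypo2_cdf(x, l1, l2):
    """Two-parameter hypoexponential CDF (requires l1 != l2)."""
    return (1 - l2 / (l2 - l1) * math.exp(-l1 * x)
              - l1 / (l1 - l2) * math.exp(-l2 * x))

def hypo2_pdf(x, l1, l2):
    """Two-parameter hypoexponential density."""
    return l1 * l2 / (l1 - l2) * (math.exp(-x * l2) - math.exp(-x * l1))

l1, l2 = 1.0, 3.0                   # arbitrary example rates
x, h = 0.7, 1e-6
deriv = (hypo2_cdf(x + h, l1, l2) - hypo2_cdf(x - h, l1, l2)) / (2 * h)
print(hypo2_cdf(0.0, l1, l2))       # no mass below zero
print(deriv, hypo2_pdf(x, l1, l2))  # the two should agree closely
```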
Given the sample mean (\bar{x}) and sample coefficient of variation (c), the parameters \lambda_1 and \lambda_2 can be estimated as follows:
\lambda _{1}={\frac {2}{\bar {x}}}\left[1+{\sqrt {1+2(c^{2}-1)}}\right]^{-1}

\lambda _{2}={\frac {2}{\bar {x}}}\left[1-{\sqrt {1+2(c^{2}-1)}}\right]^{-1}
These estimators can be derived from the method of moments by setting

{\frac {1}{\lambda _{1}}}+{\frac {1}{\lambda _{2}}}={\bar {x}}

and

{\frac {\sqrt {\lambda _{1}^{2}+\lambda _{2}^{2}}}{\lambda _{1}+\lambda _{2}}}=c\,.
The resulting parameters \lambda_1 and \lambda_2 are real values if c^2\in[0.5,1].
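A minimal sketch of these moment-based estimators (the function name and example values are mine, not from the article; the guard excludes the c^2 = 1 boundary, where the second rate diverges toward the exponential limit):

```python
import math

def fit_hypo2(xbar, c):
    """Moment-based estimates of (lambda1, lambda2) from the sample mean
    xbar and sample coefficient of variation c; the estimates are real
    and finite only when c**2 lies in [0.5, 1)."""
    c2 = c * c
    if not 0.5 <= c2 < 1.0:
        raise ValueError("c^2 must lie in [0.5, 1) for two finite real rates")
    root = math.sqrt(1 + 2 * (c2 - 1))
    l1 = (2 / xbar) / (1 + root)
    l2 = (2 / xbar) / (1 - root)
    return l1, l2

# Round trip: Hypo(1, 3) has mean 4/3 and coefficient of variation sqrt(10)/4
l1, l2 = fit_hypo2(4 / 3, math.sqrt(10) / 4)
print(l1, l2)   # recovers (1.0, 3.0) up to rounding
```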
Characterization
A random variable \boldsymbol{X}\sim Hypo(\lambda_1,\dots,\lambda_k) has cumulative distribution function given by

F(x)=1-{\boldsymbol {\alpha }}e^{x\Theta }{\boldsymbol {1}}
and density function,

f(x)=-{\boldsymbol {\alpha }}e^{x\Theta }\Theta {\boldsymbol {1}}\;,
where \boldsymbol{1} is a column vector of ones of size k and e^{A} is the matrix exponential of A. When \lambda_i\neq\lambda_j for all i\neq j, the density function can be written as
f(x)=\sum _{i=1}^{k}\lambda _{i}e^{-x\lambda _{i}}\left(\prod _{j=1,j\neq i}^{k}{\frac {\lambda _{j}}{\lambda _{j}-\lambda _{i}}}\right)=\sum _{i=1}^{k}\ell _{i}(0)\lambda _{i}e^{-x\lambda _{i}}

where \ell_1(x),\dots,\ell_k(x) are the Lagrange basis polynomials associated with the points \lambda_1,\dots,\lambda_k.
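For pairwise-distinct rates this mixture form is straightforward to implement without any matrix exponential. The sketch below (example values are arbitrary) evaluates the density via the Lagrange weights \ell_i(0):

```python
import math

def hypoexp_pdf(x, rates):
    """Density via f(x) = sum_i l_i(0) * lambda_i * exp(-lambda_i * x),
    where l_i(0) = prod_{j != i} lambda_j / (lambda_j - lambda_i).
    Requires pairwise-distinct rates."""
    total = 0.0
    for i, li in enumerate(rates):
        w = 1.0
        for j, lj in enumerate(rates):
            if j != i:
                w *= lj / (lj - li)
        total += w * li * math.exp(-li * x)
    return total

# k = 2 sanity check against the two-parameter closed form
l1, l2, x = 1.0, 3.0, 0.7
closed = l1 * l2 / (l1 - l2) * (math.exp(-x * l2) - math.exp(-x * l1))
print(hypoexp_pdf(x, [l1, l2]), closed)  # the two agree

# the density integrates to ~1 (crude Riemann sum over [0, 40])
dx = 0.001
area = sum(hypoexp_pdf(i * dx, [1.0, 2.0, 4.0]) * dx for i in range(40_000))
print(area)
```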
The distribution has Laplace transform

{\mathcal {L}}\{f(x)\}=-{\boldsymbol {\alpha }}(sI-\Theta )^{-1}\Theta {\boldsymbol {1}}\,,
which can be used to find moments,

E[X^{n}]=(-1)^{n}n!{\boldsymbol {\alpha }}\Theta ^{-n}{\boldsymbol {1}}\;.
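Because \Theta is upper bidiagonal, \Theta^{-n} never needs to be formed explicitly: each application of -\Theta^{-1} is a back-substitution. A sketch of the moment formula under that observation (function name and example rates are mine, not from the article):

```python
import math

def moment(rates, n):
    """E[X^n] = (-1)^n n! alpha Theta^{-n} 1 for the hypoexponential.
    Applies -Theta^{-1} to a vector n times by back-substitution
    (Theta is upper bidiagonal), then reads off the first component,
    since alpha = (1, 0, ..., 0)."""
    k = len(rates)
    v = [1.0] * k                       # the all-ones vector
    for _ in range(n):
        w = [0.0] * k
        w[-1] = v[-1] / rates[-1]
        for i in range(k - 2, -1, -1):
            w[i] = w[i + 1] + v[i] / rates[i]
        v = w
    return math.factorial(n) * v[0]

rates = [1.0, 2.0, 4.0]                 # arbitrary example rates
m1 = moment(rates, 1)                   # mean: 1 + 1/2 + 1/4 = 1.75
m2 = moment(rates, 2)                   # second moment
print(m1, m2 - m1 ** 2)                 # variance = sum 1/lambda_i^2 = 1.3125
```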
General case
In the general case there are a distinct rates \lambda_1,\lambda_2,\cdots,\lambda_a, with the number of exponential terms at each rate equal to r_1,r_2,\cdots,r_a respectively. The cumulative distribution function for t\geq 0 is given by
F(t)=1-\left(\prod _{j=1}^{a}\lambda _{j}^{r_{j}}\right)\sum _{k=1}^{a}\sum _{l=1}^{r_{k}}{\frac {\Psi _{k,l}(-\lambda _{k})t^{r_{k}-l}\exp(-\lambda _{k}t)}{(r_{k}-l)!(l-1)!}},
with

\Psi _{k,l}(x)=-{\frac {\partial ^{l-1}}{\partial x^{l-1}}}\left(\prod _{j=0,j\neq k}^{a}\left(\lambda _{j}+x\right)^{-r_{j}}\right)

and the additional convention \lambda_0=0,\ r_0=1.[4]
Uses
This distribution has been used in population genetics,[ 5] cell biology,[ 6] [ 7] and queuing theory.[ 8] [ 9]
See also
References
^ "HypoexponentialDistribution" . Wolfram Language & System Documentation Center . Wolfram. 2012. Retrieved 27 February 2024 .
^ Legros, Benjamin; Jouini, Oualid (2015). "A linear algebraic approach for the computation of sums of Erlang random variables" . Applied Mathematical Modelling . 39 (16): 4971– 4977. doi :10.1016/j.apm.2015.04.013 .
^ Bolch, Gunter; Greiner, Stefan; de Meer, Hermann; Trivedi, Kishor S. (2006). Queuing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications (2nd ed.). Wiley. pp. 24– 25. doi :10.1002/0471791571 . ISBN 978-0-471-79157-7 .
^ Amari, Suprasad V.; Misra, Ravindra B. (1997). "Closed-form expressions for distribution of sum of exponential random variables". IEEE Transactions on Reliability . 46 (4): 519– 522. doi :10.1109/24.693785 .
^ Strimmer, Korbinian; Pybus, Oliver G. (2001). "Exploring the demographic history of DNA sequences using the generalized skyline plot" . Molecular Biology and Evolution . 18 (12): 2298– 2305. doi :10.1093/oxfordjournals.molbev.a003776 . PMID 11719579 .
^ Yates, Christian A.; Ford, Matthew J.; Mort, Richard L. (2017). "A multi-stage representation of cell proliferation as a Markov process" . Bulletin of Mathematical Biology . 79 (12): 2905– 2928. arXiv :1705.09718 . doi :10.1007/s11538-017-0356-4 . PMC 5709504 . PMID 29030804 .
^ Gavagnin, Enrico; Ford, Matthew J.; Mort, Richard L.; Rogers, Tim; Yates, Christian A. (2019). "The invasion speed of cell migration models with realistic cell cycle time distributions". Journal of Theoretical Biology . 481 : 91– 99. arXiv :1806.03140 . doi :10.1016/j.jtbi.2018.09.010 . PMID 30219568 .
^ Călinescu, Malenia (August 2009). "Forecasting and capacity planning for ambulance services" (PDF) . Faculty of Sciences . Vrije Universiteit Amsterdam . Archived from the original (PDF) on 15 February 2010.
^ Bekker, René; Koeleman, Paulien M. (2011). "Scheduling admissions and reducing variability in bed demand" . Health Care Management Science . 14 (3): 237– 249. doi :10.1007/s10729-011-9163-x . PMC 3158339 . PMID 21667090 .
Further reading
M. F. Neuts. (1981) Matrix-Geometric Solutions in Stochastic Models: an Algorithmic Approach, Chapter 2: Probability Distributions of Phase Type; Dover Publications Inc.
G. Latouche, V. Ramaswami. (1999) Introduction to Matrix Analytic Methods in Stochastic Modelling, 1st edition. Chapter 2: PH Distributions; ASA SIAM.
Colm A. O'Cinneide (1999). Phase-type distributions: open problems and a few properties, Communications in Statistics – Stochastic Models, 15(4), 731–757.
L. Leemis and J. McQueston (2008). Univariate distribution relationships, The American Statistician, 62(1), 45–53.
S. Ross. (2007) Introduction to Probability Models, 9th edition, New York: Academic Press.