Volume 3, Issue 6, e1144
SPECIAL ISSUE PAPER

Some uses of orthogonal polynomials in statistical inference

Inmaculada Barranco-Chamorro (Corresponding Author)
Department of Statistics and OR, University of Seville, Seville, Spain
Email: [email protected]

Christos Grentzelos
Department of Mathematics, National Technical University of Athens, Athens, Greece

First published: 02 January 2021

Abstract

Every random variable (rv) X (or random vector) with finite moments generates a set of orthogonal polynomials, which can be used to obtain properties related to the distribution of X. This technique has been used in statistical inference, mainly in connection with the exponential family of distributions. In this paper a review of some of its most relevant uses is provided. The first deals with properties of expansions in terms of orthogonal polynomials for the Uniformly Minimum Variance Unbiased Estimator of a given parametric function, when sampling from a distribution in the Natural Exponential Family of distributions with Quadratic Variance Function. The second compares two relevant methods in the literature, based on expansions in Laguerre polynomials, for approximating the distribution of linear combinations of independent chi-square variables.

1 INTRODUCTION

Every random variable (rv) X (or random vector) with finite moments generates a set of orthogonal polynomials (OPS), which can be used to obtain properties related to the distribution of X. This technique has been used in statistical inference, mainly connected to the exponential family of distributions, as can be seen, for instance, in Abbey and David,1 Morris,2, 3 López-Blázquez,4 and Barranco-Chamorro and Moreno-Rebollo.5

Other uses in more general settings can be found in Voinov and Nikulin,6 in nonregular distributions in Barranco-Chamorro et al.,7 and in Bayesian statistics in Pommeret.8

More recent results include, among others: applications in correspondence analysis (Beh,9 D'Ambra et al.,10 and Beh and Lombard11), birth and death processes (Guillemin and Pinchon12), goodness-of-fit tests for parametric regression models (Bar-Hen and Daudin13), inference in the exponential distribution based on k-sample doubly Type-II censored data (Sanjel and Balakrishnan14), Gibbs sampling (Diaconis et al.15), multiple correspondence analysis for ordinal-scale variables (Lombardo and Beh16, 17), reweighted smooth tests of goodness of fit (De Boeck et al.18), the study of dependence between ordinal and nominal categorical variables (Lombardo et al.19), canonical correlations for Dirichlet measures (Griffiths and Spano20), the adjustment of the hyperbolic secant and logistic distributions to analyze financial asset returns (Bagnato et al.21), and a general method of calculus (Withers and Nadarajah22).

All these references illustrate the potential interest of this methodology. In this paper we focus on two specific topics in the univariate, one-parameter case: properties of the Uniformly Minimum Variance Unbiased Estimator (UMVUE) of a given parametric function, when sampling from the natural exponential family of distributions with quadratic variance function, and approximations of distributions by using OPS, specifically in the case of a linear combination of independent chi-square variables. The aim of this paper is twofold: on the one hand, to provide a guide to the use of these methods in Statistical Inference; on the other hand, to popularize these techniques. The classical systems of OPS are available in current statistical software, such as Mathematica, so these methodologies are widely applicable.

2 UNIVARIATE AND ONE-PARAMETER CASE

In this paper we will focus on the well-behaved univariate, one-parameter case. That is, let $X$ be a random variable (rv) with cumulative distribution function (cdf) $F_\theta(x)$ depending on an unknown real parameter $\theta \in \Theta$. The parameter space $\Theta$ is an open interval of $\mathbb{R}$. It is assumed that (i) $\{F_\theta(x) \mid \theta \in \Theta\}$ admits densities $\{f_\theta(x) \mid \theta \in \Theta\}$ with respect to a sigma-finite measure $\nu(x)$ (either the Lebesgue or the counting measure); (ii) the moments of $X$ are finite; (iii) there exists a system of OPS with respect to the weight function $f_\theta(x)$, denoted $p_{j,\theta}(x)$, $j \ge 0$; and (iv) for every square integrable function $T(x)$ of $X$, the completeness of the polynomial system in $L^2$ guarantees the expansion proposed in (1) in terms of the $p_{j,\theta}$ (Theorems 1.7 and 1.8 in Lubinsky23)
$T(x) = \sum_{j=0}^{\infty} a_j(\theta)\, p_{j,\theta}(x),$  (1)
where the coefficients are
$a_j(\theta) = \frac{\langle T, p_{j,\theta}\rangle}{\|p_{j,\theta}\|^2}, \quad \theta \in \Theta.$  (2)
In (2), $\langle\cdot,\cdot\rangle$ and $\|\cdot\|^2$ denote the associated inner product and norm,
$\langle T, p_{j,\theta}\rangle = \int T(x)\, p_{j,\theta}(x)\, f_\theta(x)\, d\nu(x), \qquad \|p_{j,\theta}\|^2 = \langle p_{j,\theta}, p_{j,\theta}\rangle.$
Since $T(\cdot)$ is a square integrable function, the coefficients previously introduced verify
$\|T\|^2 = \sum_{j=0}^{\infty} a_j^2(\theta)\, \|p_{j,\theta}\|^2 < \infty.$  (3)

That is, the expansion given in (1) is convergent in the $L^2$ sense.

Next some aims of interest related to (1) are listed.
  1. Quite often, $f_\theta(x)$ is such that the associated OPS is well known. For instance, in the normal case we have the Hermite polynomials, for the gamma distribution the (generalized) Laguerre polynomials, and for the Poisson distribution the Charlier polynomials.24
  2. To get manageable expressions for the coefficients $a_j(\theta)$.

Whenever the previous aims are fulfilled, expansions in terms of OPS can be used to propose approximations to $T(x)$ and to study features related to the quality of these approximations, as follows.

Based on the expansion proposed in (1),
$T(x) = \sum_{j=0}^{\infty} a_j(\theta)\, p_{j,\theta}(x),$
note that $T(x)$ can be approximated by the partial sum
$\sum_{j=0}^{k} a_j(\theta)\, p_{j,\theta}(x).$  (4)
The error of the approximation proposed in (4) is given by
$\Big\| T - \sum_{j=0}^{k} a_j(\theta)\, p_{j,\theta} \Big\|^2 = \sum_{j=k+1}^{\infty} a_j^2(\theta)\, \|p_{j,\theta}\|^2.$  (5)

Lower and upper bounds for (5) can be proposed if we have closed expressions for $a_j(\theta)$ and $\|p_{j,\theta}\|^2$.
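To make (1)-(5) concrete, the following minimal sketch (assuming Python with NumPy and SciPy; the choice of $T$ and all names are ours, not part of the original paper) works out the normal case $X \sim N(0,1)$, whose OPS are the probabilists' Hermite polynomials $He_j$ with $\|He_j\|^2 = j!$. For $T(x) = e^{x}$ the coefficients are $a_j = e^{1/2}/j!$, so the truncation error (5) equals $e \sum_{j>k} 1/j!$ and can be checked against the coefficients obtained by quadrature.

```python
# Minimal sketch of (1)-(5) for X ~ N(0,1): OPS = probabilists' Hermite polynomials
# He_j, with ||He_j||^2 = j!.  For T(x) = exp(x) one has a_j = exp(1/2)/j!, so the
# truncation error (5) is e * sum_{j>k} 1/j!.  (Assumes Python with NumPy/SciPy.)
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss   # Gauss quadrature for weight exp(-x^2/2)
from scipy.special import eval_hermitenorm          # He_j(x)

nodes, weights = hermegauss(80)
weights = weights / math.sqrt(2 * math.pi)           # now sum(weights) = 1

def coeff(j):
    # a_j = E[T(X) He_j(X)] / ||He_j||^2, computed by quadrature, as in (2)
    return np.sum(weights * np.exp(nodes) * eval_hermitenorm(j, nodes)) / math.factorial(j)

k = 5
err_quadrature = sum(coeff(j)**2 * math.factorial(j) for j in range(k + 1, 21))
err_exact = math.e * sum(1 / math.factorial(j) for j in range(k + 1, 21))
print(err_quadrature, err_exact)   # truncation error (5) after keeping degrees 0..k
```

Both printed values agree, illustrating how closed expressions for the $a_j(\theta)$ and $\|p_{j,\theta}\|^2$ allow the error (5) to be evaluated or bounded explicitly.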

3 NATURAL EXPONENTIAL FAMILY WITH QUADRATIC VARIANCE FUNCTION

In this section we highlight some uses of expansions in terms of OPS when sampling from distributions in the one-parameter Natural Exponential Family with Quadratic Variance Function (NEF-QVF; see Morris2, 3).

Recall that if a rv $X$ belongs to the one-parameter NEF, its density with respect to a sigma-finite measure $\nu$ on the Borel subsets of $\mathbb{R}$ can be written as
$f(x;\theta) = \exp\{x\theta - \Psi(\theta)\}, \quad \theta \in \Theta.$  (6)

The natural parameter space $\Theta$ is the largest open set for which $\int \exp(x\theta)\, d\nu(x) < \infty$. It will be assumed that $\Theta$ is nonempty.

The mean and the variance of X are given by
$\mu = E_\theta[X] = \Psi'(\theta), \qquad \mathrm{Var}_\theta[X] = \Psi''(\theta).$
Since $\Psi''(\theta) > 0$, it is possible to reparameterize the density given in (6) in terms of the mean $\mu$:
$f(x;\mu) = \exp\{x\,\theta(\mu) - \Psi(\theta(\mu))\}, \quad \mu \in \Omega,$  (7)
where $\Omega = \Psi'(\Theta)$ is the mean space, and the variance can be expressed as a function of $\mu$:
$\mathrm{Var}_\theta[X] = V(\mu) = \Psi''\big((\Psi')^{-1}(\mu)\big).$  (8)

(8) is the variance function.

Definition 1 (NEF-QVF$(\mu, V(\mu))$). NEF-QVF$(\mu, V(\mu))$ refers to the NEF of distributions whose variance function is, at most, a quadratic function of the mean $\mu$,

$V(\mu) = v_0 + v_1 \mu + v_2 \mu^2, \quad v_i \in \mathbb{R}.$  (9)

To the NEF-QVF belong the six one-parameter families listed in Table 1 (and linear functions of them). Details can be seen in Morris.2, 3

TABLE 1. Natural Exponential Family with Quadratic Variance Function (NEF-QVF) distributions
Normal $N(\mu, \sigma^2)$, $\sigma^2$ known, $\Omega = (-\infty, \infty)$, $V(\mu) = \sigma^2$ (constant variance function)
Poisson $Po(\mu)$, $\Omega = (0, \infty)$, $V(\mu) = \mu$ (linear variance function)
Gamma $Ga(r, \lambda)$, $r$ known, $\mu = r/\lambda$, $\Omega = (0, \infty)$, $V(\mu) = \mu^2/r$ (quadratic variance function)
Binomial $B(r, p)$, $r$ known, $\mu = rp$, $\Omega = (0, r)$, $V(\mu) = -\mu^2/r + \mu$ (quadratic variance function)
Negative binomial $NB(r, p)$, $r$ known, $\mu = r(1 - p)/p$, $\Omega = (0, \infty)$, $V(\mu) = \mu^2/r + \mu$ (quadratic variance function)
Generalized hyperbolic secant $GHS(r, \lambda)$, $r$ known, $\mu = r\lambda$, $\Omega = (-\infty, \infty)$, $V(\mu) = \mu^2/r + r$ (quadratic variance function)

3.1 Results when there exists a sufficient and complete statistic

In this subsection we highlight those applications in which we have a sufficient and complete statistic for the parameter and the distributions are parameterized in terms of the mean, μ .

Let $X_1, \ldots, X_n$, with $n \ge 1$, be a simple random sample (srs) from (7). Then $S_n = \sum_{i=1}^{n} X_i$ is a sufficient and complete statistic for $\mu$, whose density is
$f_{n,\mu}(s) = \exp\{s\,\theta(\mu) - n\Psi(\theta(\mu))\}, \quad \mu \in \Omega,$  (10)
with respect to the $n$-fold convolution measure $\nu_n = \nu \times \cdots \times \nu$ ($n$ times).
Let $L^2_{\nu_n} = \left\{T_n : \int T_n^2(s)\, f_{n,\mu}(s)\, d\nu_n(s) < \infty \right\}$ be the space of Borel-measurable square integrable functions of $S_n$. For each $\mu \in \Omega$, $L^2_{\nu_n}$ is a Hilbert space with the inner product
$\langle T_1, T_2\rangle_{n,\mu} = E\big[T_1(S_n)\, T_2(S_n)\big], \quad T_1, T_2 \in L^2_{\nu_n},$  (11)
and induced norm
$\|T_1\|^2_{n,\mu} = E\big[T_1^2(S_n)\big].$
As usual in the theory of $L^2$-spaces, two functions $T_1, T_2 \in L^2_{\nu_n}$ will be considered equivalent if $T_1(S_n) = T_2(S_n)$ $\nu_n$-a.s. (i.e., $\nu_n\{T_1 \ne T_2\} = 0$).

Additionally, if the srs under consideration is from a NEF-QVF$(\mu, V(\mu))$, then $S_n = \sum_{i=1}^{n} X_i$ follows a NEF-QVF$(n\mu, nV(\mu))$.

The importance of having a quadratic variance function is that in this case:
  • (i)

    An OPS system on $L^2_{\nu_n}$ is given by

    $p_{j,n}(s;\mu) = V^j(\mu)\, \frac{d^j}{d\mu^j} f_n(s;\mu)\, \frac{1}{f_n(s;\mu)}, \quad j \ge 0.$  (12)

    In particular:

    $p_{0,n}(s;\mu) = 1, \qquad p_{1,n}(s;\mu) = (s - n\mu), \qquad p_{2,n}(s;\mu) = (s - n\mu)^2 - V'(\mu)(s - n\mu) - nV(\mu).$

  • (ii)

    Since the polynomials given in (12) are an OPS system on $L^2_{\nu_n}$, every $T_n \in L^2_{\nu_n}$ admits an expansion in terms of the OPS $\{p_{j,n}\}_{j \ge 0}$

    $T_n(s) = \sum_{j=0}^{\infty} a_{j,n}(\mu)\, p_{j,n}(s;\mu), \quad \mu \in \Omega,$  (13)
    where the Fourier coefficients $a_{j,n}(\mu)$ are given by
    $a_{j,n}(\mu) = \frac{\langle T_n, p_{j,n}\rangle_{n,\mu}}{\|p_{j,n}\|^2_{n,\mu}},$
    with $\langle\cdot,\cdot\rangle_{n,\mu}$ the inner product introduced in (11) and $\|\cdot\|^2_{n,\mu}$ the associated norm.

Moreover, a given $T_n$ belongs to $L^2_{\nu_n}$ if and only if the coefficients $a_{j,n}(\mu)$, previously defined, verify
$\sum_{j=0}^{\infty} a_{j,n}^2(\mu)\, \|p_{j,n}\|^2_{n,\mu} < \infty,$
and in this case the series $\sum_{j=0}^{\infty} a_{j,n}(\mu)\, p_{j,n}(\cdot;\mu)$ converges in the $L^2_{\nu_n}$ sense to $T_n$ (Abbey and David1).

Lemma 1. The polynomials defined in (12) satisfy the following properties:

  • (i)

    $p_{k,n}$ is a polynomial in $(s - n\mu)$ of degree $k$ with leading term $(s - n\mu)^k$.

  • (ii)

    Orthogonality relation

    $E\big[p_{k,n}(S_n;\mu)\, p_{j,n}(S_n;\mu)\big] = \delta_{kj}\, j!\, \beta_{j,n}\, V^j(\mu),$  (14)
    where $\delta_{kj}$ is the Kronecker delta, $\beta_{j,n} = \prod_{i=0}^{j-1}(n + i v_2)$ for $j \ge 1$, $\beta_{0,n} = 1$, and $v_2$ is the coefficient of $\mu^2$ in (9).

  • (iii)

    For any positive integers 1 ≤ m ≤ n

    $E\big[p_{k,n}(S_n;\mu) \mid S_m\big] = p_{k,m}(S_m;\mu) \ \ \text{(a.s.)}, \quad k \ge 0.$  (15)

Remark. From (14), $E[p_{k,n}(S_n;\mu)] = \delta_{k0}$; that is, all the polynomials except $p_{0,n}$ have mean zero, and their norms are
$\|p_{k,n}\|^2_{n,\mu} = E\big[p_{k,n}^2\big] = k!\, \beta_{k,n}\, V^k(\mu).$  (16)
In particular,
$\|p_{1,n}\|^2_{n,\mu} = nV(\mu), \qquad \|p_{2,n}\|^2_{n,\mu} = 2n(n + v_2)\, V^2(\mu).$
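As a quick sanity check of (12), the first polynomials can be computed symbolically. The following sketch (assuming Python with sympy; names are ours) recovers $p_{1,n}$ and $p_{2,n}$ for the Poisson family, for which $V(\mu) = \mu$ and $v_2 = 0$.

```python
# A small symbolic check of formula (12) for the Poisson family, where V(mu) = mu
# and S_n ~ Poisson(n*mu). A sketch assuming Python with sympy; the constant factor
# 1/s! of the density cancels in (12), so it is omitted below.
import sympy as sp

s, mu, n = sp.symbols('s mu n', positive=True)
f_n = sp.exp(-n*mu + s*sp.log(n*mu))   # density of S_n up to the factor 1/s!
V = mu                                  # variance function of the Poisson family

def p(j):
    """p_{j,n}(s; mu) = V(mu)^j * (d^j/dmu^j f_n) / f_n, as in (12)."""
    return sp.simplify(V**j * sp.diff(f_n, mu, j) / f_n)

print(p(1))              # s - n*mu, i.e. p_{1,n}
print(sp.expand(p(2)))   # equals (s - n*mu)**2 - (s - n*mu) - n*mu, since V'(mu) = 1
```

The output agrees with the particular expressions for $p_{1,n}$ and $p_{2,n}$ displayed after (12).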

The OPS in the NEF-QVF are well known. They are listed in Table 2.

TABLE 2. Orthogonal polynomials in Natural Exponential Family with Quadratic Variance Function (NEF-QVF) distributions
Normal $N(\mu, \sigma^2)$, $\sigma^2$ known: Hermite
Poisson $Po(\mu)$: Charlier
Gamma $Ga(r, \lambda)$, $r$ known: (generalized) Laguerre
Binomial $B(r, p)$, $r$ known: Krawtchouk
Negative binomial $NB(r, p)$, $r$ known: Meixner of the first kind
Generalized hyperbolic secant $GHS(r, \lambda)$, $r$ known: Pollaczek

4 UMVUE IN THE NEF-QVF

Since $S_n$ is a complete and sufficient statistic for $\mu$, if for a given sample size $n$ there exists the UMVUE, $T_n$, of a parametric function $h(\mu)$, then by the Lehmann–Scheffé Theorem $T_n = T_n(S_n)$. Recall that UMVUE means Uniformly Minimum Variance Unbiased Estimator; that is, if $T_n = \mathrm{UMVUE}_n(h(\mu))$ then for all $\mu \in \Omega$: $E[T_n(S_n)] = h(\mu)$, $T_n \in L^2_{\nu_n}$, and $\mathrm{Var}_\mu[T_n(S_n)] \le \mathrm{Var}_\mu[U_n(\underline{X})]$ for any other unbiased estimator $U_n(\underline{X})$ of $h(\mu)$ with finite variance. The expansion of $T_n$ in terms of the OPS previously introduced is given by
$T_n(S_n) = \sum_{j=0}^{\infty} \frac{h^{(j)}(\mu)}{j!\, \beta_{j,n}}\, p_{j,n}(S_n;\mu) \quad \text{(a.s.)},$  (17)
where $\beta_{j,n} = \prod_{k=0}^{j-1}(n + k v_2)$ for $j \ge 1$ and $\beta_{0,n} = 1$.
From (17), we can obtain
  1. The variance of $T_n(S_n)$ (a numerical check is given at the end of this section):
    $\mathrm{Var}_\mu[T_n(S_n)] = \sum_{j=1}^{\infty} \left(\frac{h^{(j)}(\mu)}{j!\, \beta_{j,n}}\right)^2 \|p_{j,n}\|^2_{n,\mu}.$
  2. Lower bounds for the variance of $T_n(S_n)$:
    $B_{k,n}(\mu) = \sum_{j=1}^{k} \left(\frac{h^{(j)}(\mu)}{j!\, \beta_{j,n}}\right)^2 \|p_{j,n}\|^2_{n,\mu}.$
  3. The effect of an observation on Tn(Sn).

    Two robustness measures to assess the effect of a fixed observation, $x \in \mathrm{support}(X_1)$, on $T_n(S_n)$ were proposed in Barranco-Chamorro and Moreno-Rebollo.5 These measures are the conditional bias and the asymptotic mean sensitivity curve (AMSC) of $T_n(S_n)$. They are based on the following relationship, in which we consider the conditional expectation of the UMVUE given an observation $x$; for simplicity it is assumed that $X_1 = x$:

    $E[T_n(S_n) \mid X_1 = x] = \sum_{j=0}^{\infty} \frac{h^{(j)}(\mu)}{j!\, \beta_{j,n}}\, p_{j,1}(x;\mu) \quad \text{(a.s.)}, \quad \mu \in \Omega.$  (18)
    It was proven that the conditional bias and the AMSC of $T_n(S_n)$ depend on the parametric function under consideration, $h$, evaluated at the true and unknown value of the parameter, $\mu_0 \in \Omega$.

  4. The limit distribution of $T_n(S_n)$. In the NEF-QVF, the limit behavior of the UMVUE of $h(\mu)$, $T_n(S_n)$, depends on the order of the first nonzero derivative of $h$ at the true and unknown value of the parameter $\mu_0 \in \Omega$. Specifically, let us denote $k_0 = \min\{j \ge 1 : h^{(j)}(\mu_0) \ne 0\}$. Then

    • (a)

      If k0 = 1 then

      $\frac{1}{h'(\mu_0)} \sqrt{\frac{n}{V(\mu_0)}}\, \big(T_n(\bar{X}_n) - h(\mu_0)\big) \xrightarrow{\ d\ } N(0, 1),$  (19)
      with $\bar{X}_n = S_n/n$.

    • (b)

      For k0 > 1 the limit behavior depends on the OPS of degree k0. For instance, if k0 = 2 then

      $\frac{2(n + v_2)}{V(\mu_0)\, h^{(2)}(\mu_0)} \big(T_n(\bar{X}_n) - h(\mu_0)\big) + \frac{V'(\mu_0)}{V(\mu_0)} \big(\bar{X}_n - \mu_0\big) + 1 \xrightarrow{\ d\ } \chi^2_1.$  (20)

  5. Comparisons with the MLE of $h(\mu)$, $h(\bar{X}_n)$, can be carried out by considering the expansion of $h(\bar{X}_n)$ in terms of the OPS and comparing the coefficients in both expansions.

Additional details of these results and other uses can be seen in Abbey and David,1 López-Blázquez and Castaño-Martínez,4 and Barranco-Chamorro et al.,25 among others.
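As a numerical check of the variance series in item 1, consider the Poisson case with $h(\mu) = e^{-\mu}$, whose UMVUE based on $S_n$ is $(1 - 1/n)^{S_n}$, with exact variance $e^{-2\mu}(e^{\mu/n} - 1)$. The following sketch (Python is assumed; all names are illustrative and not part of the original paper) compares a truncated version of the series with this exact value.

```python
# Numerical check of the variance series in item 1 for the Poisson family with
# h(mu) = exp(-mu): here v2 = 0, so beta_{j,n} = n**j and, by (16),
# ||p_{j,n}||^2 = j! * n**j * mu**j.  (A sketch; Python assumed, names are ours.)
import math

def variance_series(mu, n, terms=30):
    total = 0.0
    for j in range(1, terms + 1):
        h_j = (-1)**j * math.exp(-mu)             # j-th derivative of h(mu) = exp(-mu)
        beta_jn = n**j                             # prod_{i=0}^{j-1} (n + i*v2), v2 = 0
        norm2 = math.factorial(j) * n**j * mu**j   # ||p_{j,n}||^2 from (16)
        total += (h_j / (math.factorial(j) * beta_jn))**2 * norm2
    return total

mu, n = 1.3, 10
exact = math.exp(-2*mu) * (math.exp(mu/n) - 1)     # Var[(1 - 1/n)**S_n], S_n ~ Po(n*mu)
print(variance_series(mu, n), exact)               # the two values agree
```

The series terms decay as $\mu^j/(j!\, n^j)$, so a few terms already reproduce the exact variance to machine precision.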

5 DISTRIBUTION OF A LINEAR COMBINATION OF INDEPENDENT CHI-SQUARE VARIABLES

The statistics employed in many tests and estimation methods can be expressed as quadratic forms in normal variables, as can be seen, for instance, in Kotz et al.26 and Coelho.27 In this section we consider two proposals, based on expansions in terms of OPS, to obtain the distribution of linear combinations of independent chi-square variables. Different situations can be considered depending on the sign of the coefficients. We focus on the case in which all weights are positive. So, let us consider
$Q_n = \sum_{i=1}^{n} \alpha_i X_i,$
where the $\alpha_i$'s are known positive constants and the $X_i$'s are independent chi-square variables with $\nu_i$ degrees of freedom.

The pdf and cdf of Qn can be obtained by using expansions in terms of Laguerre polynomials. In this section we briefly describe two of the most relevant methods existing in the literature to reach this aim: the method proposed in Castaño-Martínez and López-Blázquez28, 29 and the method proposed in Ha and Provost.30

5.1 Method 1

This method is based on the inverse Laplace transform and the property of the uniqueness of the UMVUE in the Gamma distribution. It was proposed in Castaño-Martínez and López-Blázquez.28, 29

First, we recall UMVU-estimation in the gamma distribution. Let $Y \sim Ga(p, \lambda)$ with shape parameter $p > 0$ and rate parameter $\lambda > 0$. So in this subsection the probability density function (pdf) of $Y$ is written as
$g(y) = \frac{\lambda^p}{\Gamma(p)}\, y^{p-1} e^{-\lambda y}, \quad y > 0,\ p > 0,\ \lambda > 0,$
with $\mu = E[Y] = p/\lambda$ and $\mathrm{Var}[Y] = p/\lambda^2 = \mu^2/p$.
In order to apply the method which uses OPS, the distribution must be reparameterized in terms of the mean, $\mu = p/\lambda$. For every $\mu > 0$ we consider the space of Borel-measurable square integrable functions
$L^2_{\nu} = \left\{T : \int T^2(s)\, f_\mu(s)\, d\nu(s) < \infty \right\},$
with $f_\mu(s)$ denoting the density $g$ written in terms of $\mu = p/\lambda$.
A parametric function $h(\mu)$ is UMVU-estimable if there exists a function $T \in L^2_{\nu}$ such that
$E[T(Y)] = h(\mu), \quad \mu > 0.$
Since the gamma distribution belongs to the NEF-QVF, the UMVUE of a given parametric function can be obtained in terms of an OPS, in this case the generalized Laguerre polynomials. So, given a UMVU-estimable function $h(\mu)$, its UMVUE can be obtained as
$T(x) = \sum_{j=0}^{\infty} \frac{(-\mu)^j\, h^{(j)}(\mu)}{(p)_j}\, L_j^{(p-1)}\!\left(\frac{p x}{\mu}\right),$  (21)
where $(p)_j = p(p+1)\cdots(p+j-1)$, $h^{(j)}(\mu) = \frac{d^j}{d\mu^j} h(\mu)$, and $L_j^{(p-1)}$ denotes the $j$th generalized Laguerre polynomial.
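As a small check of (21) (a sketch assuming Python with SciPy; the choice $h(\mu) = \mu^2$ and all names are ours), note that for $h(\mu) = \mu^2$ the series terminates at $j = 2$ and reproduces $p x^2/(p+1)$, the unbiased estimator of $\mu^2$ based on a single gamma observation.

```python
# Check of expansion (21) for h(mu) = mu**2 in the Ga(p, lambda) family: the series
# stops at j = 2 and should equal p*x**2/(p + 1), independently of the mu used.
from scipy.special import eval_genlaguerre, poch   # poch(p, j) = (p)_j

def umvue_via_laguerre(x, p, mu):
    # Derivatives of h(mu) = mu**2: h, h', h''; higher derivatives vanish,
    # so the series (21) terminates at j = 2.
    h_derivs = [mu**2, 2*mu, 2.0]
    return sum((-mu)**j * hj / poch(p, j) * eval_genlaguerre(j, p - 1, p*x/mu)
               for j, hj in enumerate(h_derivs))

x, p = 3.7, 2.5
print(umvue_via_laguerre(x, p, mu=0.8))   # does not depend on the mu used
print(umvue_via_laguerre(x, p, mu=5.0))
print(p * x**2 / (p + 1))                  # all three values coincide
```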
The UMVUE in the gamma distribution can also be obtained by using the inverse Laplace transform, as can be seen, for instance, in López-Blázquez et al.31 The unbiasedness condition is taken as the starting point for this approach:
$E[T(Y)] = h(\mu), \quad \mu > 0.$
An alternative expression for $T(x)$ can then be given in terms of the inverse Laplace transform (denoted $\mathcal{L}^{-1}$), acting on the variable $\lambda = p/\mu$:
$T(x) = \frac{\Gamma(p)}{x^{p-1}}\, \mathcal{L}^{-1}\!\left[\frac{\mu^{p}}{p^{p}}\, h(\mu)\right](x), \quad x > 0.$  (22)

Equations (21) and (22) were applied, as follows, in Castaño-Martínez and López-Blázquez,28 taking T(x) = f(x) the pdf of a quadratic form.

Note that $Q_n = \sum_{i=1}^{n} \alpha_i X_i$, with $\alpha_i > 0$ and $X_i \sim \chi^2_{\nu_i}$, is a sum of independent, non-identically distributed gamma variables. Since
$\alpha_i X_i \sim Ga\!\left(\frac{\nu_i}{2}, \frac{1}{2\alpha_i}\right),$
the moment generating function of $Q_n$ is
$M_{Q_n}(t) = \prod_{i=1}^{n} (1 - 2\alpha_i t)^{-\nu_i/2}, \quad t < 1/(2\alpha_i),\ i = 1, \ldots, n.$
Let $f$ be the pdf of $Q_n$. Then the Laplace transform of $f(x)$ is given by
$\mathcal{L}(f(x))(\lambda) = \prod_{i=1}^{n} (1 + 2\alpha_i \lambda)^{-\nu_i/2} = G(\lambda).$
To get a better approximation, consider
$H(\lambda) = G\!\left(\frac{\lambda - 1}{2\beta}\right) = \beta^{\nu/2} \prod_{i=1}^{n} \big(\beta + \alpha_i(\lambda - 1)\big)^{-\nu_i/2}, \quad \beta > 0, \quad \nu = \sum_{i=1}^{n} \nu_i.$
By using standard properties of Laplace transforms,
$f(x) = \mathcal{L}^{-1}(G(\lambda))(x) = \mathcal{L}^{-1}\big(H(1 + 2\beta\lambda)\big)(x) = \frac{e^{-x/(2\beta)}}{2\beta}\, \mathcal{L}^{-1}(H(\lambda))\!\left(\frac{x}{2\beta}\right).$
On the other hand, the expansion of $f(x)$ in terms of Laguerre polynomials is given by
$f(x) = \frac{e^{-x/(2\beta)}}{(2\beta)^{\nu/2}}\, \frac{x^{\nu/2 - 1}}{\Gamma(\nu/2)} \sum_{k=0}^{\infty} \frac{k!\, c_k}{(\nu/2)_k}\, L_k^{(\nu/2 - 1)}\!\left(\frac{\nu x}{4\beta\mu_0}\right), \quad \mu_0 > 0,$
where the $c_k$ are coefficients given as functions of $\nu$, $\nu_i$, $\alpha$, $\beta$, and $\mu_0$.
Let us introduce the truncation error associated with the previous expansion:
$\epsilon_N(f, x, \mu_0, \beta) = \frac{e^{-x/(2\beta)}}{(2\beta)^{\nu/2}}\, \frac{x^{\nu/2 - 1}}{\Gamma(\nu/2)} \sum_{k=N+1}^{\infty} \frac{k!\, c_k}{(\nu/2)_k}\, L_k^{(\nu/2 - 1)}\!\left(\frac{\nu x}{4\beta\mu_0}\right).$  (23)
By using global upper bounds (with respect to $n$, $x$, and $\alpha$) for the Laguerre polynomials proposed by Szegő,32 the following upper bound for (23) was given:
$|\epsilon_N(f, x, \mu_0, \beta)| \le \frac{e^{-x/(2\beta)}}{(2\beta)^{\nu/2}}\, \frac{x^{\nu/2 - 1}\, |c_0|}{\Gamma(\nu/2)}\, \exp\!\left(\frac{\nu x}{8\beta\mu_0}\right) \sum_{k=N+1}^{\infty} a_k,$
with $a_k = \xi^k\, \frac{(\nu/2)_k}{k!}$, $0 < \xi < 1$.

In summary, the key aspects of this proposal are the following. Laguerre expansions for the pdf (and cdf) of a sum of weighted central independent chi-square variables are given. The formulae depend on certain parameters, and appropriate choices of them recover well-known expressions in the literature; some new expressions were also obtained. Upper bounds for the truncation errors of these expressions can be proposed. Castaño-Martínez and López-Blázquez28 include some examples with numerical results, which show that their upper bounds can be sharper than others previously proposed in the literature.

5.2 Method 2

In this subsection we focus on the method proposed in Ha and Provost30 to approximate the distribution of linear combinations of independent chi-square rv's. These authors proposed a moment-matching method, based on a gamma pdf adjusted by a linear combination of Laguerre polynomials. Next, we briefly recall the main points needed to reproduce their methodology.

In order to apply the Ha and Provost method, the parameterization of the gamma distribution in terms of a shape parameter $\alpha > 0$ and a scale parameter $\beta > 0$ should be used. So, in this subsection, the notation $Y \sim Ga(\alpha, \beta)$ refers to the pdf
$f_Y(y) = \frac{1}{\Gamma(\alpha)\, \beta^{\alpha}}\, y^{\alpha - 1} e^{-y/\beta}, \quad y > 0,\ \alpha > 0,\ \beta > 0.$  (24)

Recall that with this notation $E[Y] = \alpha\beta$ and $\mathrm{Var}[Y] = \alpha\beta^2$.

Other basic properties, which must be taken into account, are:

If $Y \sim Ga(\alpha, \beta)$ then $X = Y/\beta \sim Ga(\alpha, 1)$. From the pdf of $X$, the pdf of $Y = \beta X$ is given by
$f_Y(y) = \frac{1}{\beta}\, f_X\!\left(\frac{y}{\beta}\right).$  (25)
Let us now address the problem introduced at the beginning of this section, that is, to obtain the pdf of
$Z = \sum_{j=1}^{p} w_j Y_j,$  (26)
with $Y_j \sim \chi^2_{k_j} = Ga(k_j/2,\, 2)$ independent rv's and $w_j$ real weights. A complete study of this point should distinguish between the case where all the $w_j$ are positive and the case where there are positive and negative $w_j$'s. For brevity and simplicity, we only deal with the case where all $w_j > 0$. The method is illustrated next.
Step 0. Approximate the distribution of $Z$ introduced in (26) by a gamma distribution whose parameters are such that the expectation and variance of this gamma distribution agree with the expectation and variance of $Z$. Since the $Y_j$ are independent $\chi^2_{k_j}$, $E[Y_j] = k_j$ and $\mathrm{Var}[Y_j] = 2k_j$. So
$E[Z] = \sum_{j=1}^{p} w_j E[Y_j] = \sum_{j=1}^{p} w_j k_j, \qquad \mathrm{Var}[Z] = \sum_{j=1}^{p} w_j^2\, \mathrm{Var}[Y_j] = 2 \sum_{j=1}^{p} w_j^2 k_j.$
Therefore the parameters of the gamma distribution proposed as the initial approximation are
$\beta = \frac{\mathrm{Var}[Z]}{E[Z]} = \frac{2 \sum_{j=1}^{p} w_j^2 k_j}{\sum_{j=1}^{p} w_j k_j}, \qquad \alpha = \frac{E[Z]^2}{\mathrm{Var}[Z]} = \frac{\big(\sum_{j=1}^{p} w_j k_j\big)^2}{2 \sum_{j=1}^{p} w_j^2 k_j}.$
We highlight that the first approximation to the distribution of $Z$ is $Y \sim Ga(\alpha, \beta)$ with the previous $\alpha$ and $\beta$.
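Step 0 is straightforward to code. The following minimal sketch (Python is assumed; the function name is ours, and the weights are those used later in the illustration of Section 5.2.3) returns the moment-matching pair $(\alpha, \beta)$.

```python
# Step 0 as code: the moment-matching gamma fit Ga(alpha, beta) to Z (a minimal
# sketch; function and variable names are ours, not Ha and Provost's).
def gamma_fit(w, k):
    EZ = sum(wj * kj for wj, kj in zip(w, k))            # E[Z] = sum w_j k_j
    VarZ = 2 * sum(wj**2 * kj for wj, kj in zip(w, k))   # Var[Z] = 2 sum w_j^2 k_j
    return EZ**2 / VarZ, VarZ / EZ                        # (alpha, beta)

# For the weights used later in the illustration of Section 5.2.3:
print(gamma_fit([1, 1, 2.5, 2.5, 9, 9], [1, 1, 1, 1, 1, 1]))   # ~(1.77054, 14.12)
```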

5.2.1 Moment-matching method by using Laguerre polynomials.

Since $Z$ is defined as a linear combination of independent gamma variables, it is easy to obtain its cumulants. Let us denote by $k(s)$ the cumulant of order $s$ and by $\mu(h) = E[Z^h]$ the noncentral moment of order $h$ of $Z$, with $s, h \in \mathbb{Z}_0^+$. For $h \ge 1$, the following relationship can be used to obtain $\mu(h)$:
$\mu(h) = \sum_{i=0}^{h-1} \binom{h-1}{i}\, k(h-i)\, \mu(i).$  (27)
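The recursion (27), combined with the fact that the $s$th cumulant of a $\chi^2_k$ variable is $2^{s-1}(s-1)!\, k$, gives the noncentral moments of $Z$ directly. A minimal sketch follows (Python assumed; names are ours), again with the weights of the illustration in Section 5.2.3.

```python
# Sketch of the recursion (27) for Z = sum_j w_j Y_j with Y_j ~ chi^2_{k_j}
# independent; it uses the standard fact that the s-th cumulant of chi^2_k is
# 2**(s-1) * (s-1)! * k, so that k(s) = sum_j w_j**s * 2**(s-1) * (s-1)! * k_j.
import math

w = [1, 1, 2.5, 2.5, 9, 9]     # weights of the illustration in Section 5.2.3
kdf = [1, 1, 1, 1, 1, 1]       # degrees of freedom

def cumulant(s):
    return sum(wj**s * 2**(s - 1) * math.factorial(s - 1) * kj for wj, kj in zip(w, kdf))

def noncentral_moments(d):
    """mu(0), ..., mu(d) via (27): mu(h) = sum_{i<h} C(h-1, i) k(h-i) mu(i)."""
    mu = [1.0]
    for h in range(1, d + 1):
        mu.append(sum(math.comb(h - 1, i) * cumulant(h - i) * mu[i] for i in range(h)))
    return mu

mom = noncentral_moments(2)
print(mom[1])                  # E[Z] = 25
print(mom[2] - mom[1]**2)      # Var[Z] = 353
```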
In order to apply results related to (generalized) Laguerre polynomials, the following weight function is considered:
$w_\nu(x) = x^{\alpha - 1} e^{-x} = x^{\nu} e^{-x}, \quad x > 0,$  (28)
with $\nu = \alpha - 1$.33

Note that $w_\nu(x)$ given in (28) is, basically, the scaled version of the gamma distribution considered in Step 0; that is, (28) is, up to a normalizing constant, the pdf of $X = Y/\beta$, with $X \sim Ga(\alpha, 1)$.

5.2.2 Some useful properties of (generalized) Laguerre polynomials.

To reproduce and become familiar with this method, some identities satisfied by the generalized Laguerre polynomials will be useful. Additional details can be seen in Shao et al.24, 33, 34

Next the generalized Laguerre polynomials are introduced and their most relevant properties for our purposes are listed.

The generalized Laguerre polynomial of degree $k \in \mathbb{Z}_0^+$ with respect to the weight function $w_\nu(x) = x^{\nu} e^{-x}$, with $x > 0$ and $\nu > -1$, will be denoted as $L_k^{\nu}(x)$.

These polynomials satisfy the following three-term recurrence relation
$L_{k+1}^{\nu}(x) = \frac{2k + 1 + \nu - x}{k + 1}\, L_k^{\nu}(x) - \frac{k + \nu}{k + 1}\, L_{k-1}^{\nu}(x), \quad k \ge 1,$
with the initial conditions
$L_0^{\nu}(x) = 1, \qquad L_1^{\nu}(x) = 1 + \nu - x.$
The $L_k^{\nu}(x)$ are orthogonal with respect to the weight function $w_\nu(x) = x^{\nu} e^{-x}$. The orthogonality relationship is
$\int_0^{\infty} L_m^{\nu}(x)\, L_k^{\nu}(x)\, x^{\nu} e^{-x}\, dx = \frac{\Gamma(k + \nu + 1)}{k!}\, \delta_{mk}, \quad m, k \in \mathbb{Z}_0^+,$  (29)
where $\delta_{mk}$ denotes the Kronecker delta symbol, defined as $\delta_{mk} = 1$ if $m = k$ and $\delta_{mk} = 0$ if $m \ne k$.
Also, we have the following closed formula for the generalized Laguerre polynomials
$L_k^{\nu}(x) = \sum_{j=0}^{k} (-1)^j \binom{k + \nu}{k - j}\, \frac{x^j}{j!}.$  (30)
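The orthogonality relation (29) is easy to verify numerically with standard scientific software; the following sketch (assuming Python with SciPy; the value of $\nu$ and the degrees are arbitrary choices of ours) integrates the left-hand side of (29) by quadrature.

```python
# Quick numerical check of the orthogonality relation (29); an illustrative sketch.
import math
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_genlaguerre, gamma

nu = 0.77
def inner(m, k):
    integrand = lambda x: (eval_genlaguerre(m, nu, x) * eval_genlaguerre(k, nu, x)
                           * x**nu * math.exp(-x))
    return quad(integrand, 0, np.inf)[0]

print(inner(3, 2))                                            # ~0 (orthogonality)
print(inner(3, 3), gamma(3 + nu + 1) / math.factorial(3))     # both ~ Gamma(k+nu+1)/k!
```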

Ha and Provost30 proposed two (equivalent) expressions to approximate the pdf of the quadratic form introduced in (26).

The first one is given in terms of powers of x
$g_X^{d}(x) = c_\nu\, w_\nu(x) \sum_{k=0}^{d} \xi_{\nu,k}\, x^k,$  (31)

where $g_X^{d}(\cdot)$ is the Laguerre approximant they propose for the pdf of $X = Z/\beta$.

In (31), $c_\nu = 1/\Gamma(\nu + 1)$ and $w_\nu(x) = x^{\nu} e^{-x}$ (the weight function of the generalized Laguerre polynomials). The coefficients $\xi_{\nu,k}$ are obtained from the moments of $X = Z/\beta$ and the coefficients of the Laguerre polynomials of degrees $0, 1, \ldots, d$.

From the approximant to the pdf of $X$, the approximant to the pdf of $Z$, denoted $f_Y^{d}(y)$, is given by
$f_Y^{d}(y) = \frac{1}{\beta}\, g_X^{d}\!\left(\frac{y}{\beta}\right).$  (32)
The second expression for the approximation in (32) is given in terms of generalized Laguerre polynomials:
$f_Y^{d}(y) = \frac{y^{\nu} e^{-y/\beta}}{\beta^{\nu + 1}} \sum_{j=0}^{d} \delta_j^{\nu}\, L_j^{\nu}(y/\beta),$  (33)
where the $\delta_j^{\nu}$ are obtained from $\mu_X(k) = E[X^k]$, with $X = Z/\beta$, in such a way that the first $d$ noncentral moments of the approximant $Y_d$ and the moments of $Z$ agree. Additional details about the $\delta_j^{\nu}$ can be seen in Ha and Provost.30

As for the truncation error of these approximations, Ha and Provost provided bounds for the pdfs and cdfs; however, they remarked that these bounds are not very tight, because they depend on the moments of the distribution being approximated.

To summarize this methodology: the distribution of $Z$ is first approximated by a $Ga(\alpha, \beta)$ whose moments match the first two moments of $Z$. Usually this first approximation is not good enough. From the resulting values of $\nu = \alpha - 1$ and $\beta$, the approximations based on Laguerre polynomials given in (31) or (33) can be obtained. On the one hand, (31) is useful for obtaining an explicit expression of the cdf, because it is a polynomial in $x$ multiplied by the pdf of a $Ga(\alpha = \nu + 1, 1)$ distribution,
$c_\nu\, w_\nu(x) = \frac{1}{\Gamma(\nu + 1)}\, x^{\nu} e^{-x}, \quad x > 0.$
On the other hand, (33), which is given in terms of generalized Laguerre polynomials, should be easier to compute by using software, for instance Mathematica.

As for the degree of the polynomial approximant, Ha and Provost30 propose trying several values of d until the results (for instance, quantiles) stabilize. To assess the performance of their approximations, they compare their proposals with the exact distribution (when available) or with the simulated distribution.

5.2.3 Illustration

Let us consider
$Z = \sum_{j=1}^{6} w_j Y_j,$
with $w_1 = w_2 = 1$, $w_3 = w_4 = 2.5$, $w_5 = w_6 = 9$, and $Y_j \sim \chi^2_1$ independent and identically distributed.
Then
$E[Z] = \sum_{j=1}^{6} w_j = 25, \qquad \mathrm{Var}[Z] = 2 \sum_{j=1}^{6} w_j^2 = 353.$
Therefore the parameters of the Gamma distribution are
$\beta = \frac{\mathrm{Var}[Z]}{E[Z]} = \frac{353}{25} = 14.12, \qquad \alpha = \frac{E[Z]^2}{\mathrm{Var}[Z]} = \frac{25^2}{353} = 1.77054.$  (34)

So, the first approximation uses the $Ga(\alpha, \beta)$ distribution with $\alpha$ and $\beta$ given in (34).

From (33), for d = 6 the Laguerre approximant is given by
$f_Y^{d=6}(y) = \frac{y^{\nu} e^{-y/\beta}}{\beta^{\nu + 1}} \left[\delta_0^{\nu} + \delta_1^{\nu}\, L_1^{\nu}(y/\beta) + \cdots + \delta_6^{\nu}\, L_6^{\nu}(y/\beta)\right],$  (35)

(with $L_0^{\nu}(y/\beta) = 1$).

Note that $\delta_0^{\nu} = 1/\Gamma(\nu + 1)$; therefore (35) is a kind of correction to the $Ga(\nu + 1, \beta)$ distribution initially proposed, in which we impose that the first $d = 6$ noncentral moments of $Z$ and of the Laguerre approximant $Y_d$ agree. The explicit expression of the Laguerre approximant for $d = 6$ in terms of Laguerre polynomials then follows straightforwardly; a numerical sketch of this construction is given below.
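The following sketch (assuming Python with sympy and SciPy; variable names and the coefficient formula are ours) reconstructs a degree-6 Laguerre approximant of this type for the illustration: the moments of $Z$ are obtained exactly from its moment generating function, and the coefficients are computed as $\delta_j = \frac{j!}{\Gamma(\nu + j + 1)}\, E[L_j^{\nu}(Z/\beta)]$, the standard orthogonality-based choice consistent with (29), which in particular gives $\delta_0 = 1/\Gamma(\nu + 1)$. Whether this matches Ha and Provost's $\delta_j^{\nu}$ in every detail is an assumption on our part.

```python
# Sketch of a degree-6 Laguerre approximant of the pdf of Z, in the spirit of (35).
# Moments of Z come from its MGF (computed symbolically); delta_j is taken as
# j!/Gamma(nu + j + 1) * E[L_j^nu(Z/beta)], an orthogonality-based choice consistent
# with (29); this exact normalization and all names are our assumptions.
import math
import sympy as sp
from scipy.special import binom, gamma as gamma_fn, eval_genlaguerre

w = [1.0, 1.0, 2.5, 2.5, 9.0, 9.0]     # weights of the illustration
kdf = [1, 1, 1, 1, 1, 1]               # degrees of freedom
d = 6

t = sp.symbols('t')
M = sp.Mul(*[(1 - 2*wj*t)**(-sp.Rational(kj, 2)) for wj, kj in zip(w, kdf)])
momZ = [float(sp.diff(M, t, h).subs(t, 0)) for h in range(d + 1)]   # E[Z^h]

EZ, VarZ = momZ[1], momZ[2] - momZ[1]**2
beta, alpha = VarZ / EZ, EZ**2 / VarZ                 # Step 0 fit, as in (34)
nu = alpha - 1
momX = [m / beta**h for h, m in enumerate(momZ)]      # moments of X = Z/beta

def E_laguerre(j):
    # E[L_j^nu(X)] from the closed formula (30) and the moments of X
    return sum((-1)**i * binom(j + nu, j - i) * momX[i] / math.factorial(i)
               for i in range(j + 1))

delta = [math.factorial(j) / gamma_fn(nu + j + 1) * E_laguerre(j) for j in range(d + 1)]

def f_approx(y):
    """Laguerre approximant of the pdf of Z, evaluated as in (35)."""
    series = sum(dj * eval_genlaguerre(j, nu, y / beta) for j, dj in enumerate(delta))
    return y**nu * math.exp(-y / beta) / beta**(nu + 1) * series

print(round(beta, 2), round(alpha, 5))   # 14.12 and 1.77054, cf. (34)
print(delta[0], 1 / gamma_fn(nu + 1))    # delta_0 equals 1/Gamma(nu + 1)
print(f_approx(20.0))                    # approximant evaluated near the median of Z
```

Evaluating f_approx over a grid of values of y reproduces the kind of comparison shown in Figure 1.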

In terms of powers of $y$ it is
$f_Y^{d=6}(y) = y^{\nu} e^{-y/\beta}\big(0.00997251 + 0.000088465\, y - 9.1383 \times 10^{-6}\, y^2 + 2.81797 \times 10^{-7}\, y^3 - 3.59219 \times 10^{-9}\, y^4 + 2.04193 \times 10^{-11}\, y^5 - 4.35351 \times 10^{-14}\, y^6\big).$

The pdf of the initial $Ga(\alpha, \beta)$ approximation (black) and the Laguerre approximants for $d = 6$ (red) and $d = 14$ (blue) are plotted together in Figure 1.

Figure 1. Comparison of the initial gamma approximation and the Laguerre approximant pdfs for $d = 6$ and $d = 14$ in the Illustration.
The approximate cdf can be obtained in terms of the incomplete gamma function, as illustrated next. Taking into account that
$\int_0^{y} t^{\nu} e^{-t/\beta}\, t^{k}\, dt = \beta^{\nu + k + 1}\left[\Gamma(\nu + 1 + k) - \Gamma(\nu + 1 + k,\, y/\beta)\right],$
where $\Gamma(a, z)$ denotes the upper incomplete gamma function, it follows that
$F_Y^{d=6}(y) = \sum_{k=0}^{d} c_k\, \Gamma\!\left(\nu + 1 + k,\, \frac{y}{\beta}\right),$
with $c_k$ some appropriate coefficients. In this case we have $c_0 = -41373543$, $c_1 = 138746964$, $c_2 = -123913523$, $c_3 = 43374898$, $c_4 = -6751190$, $c_5 = 463372.3$, and $c_6 = -11295.86$.

5.2.4 Comparison to Moschopoulos technique

One of the reviewers suggested carrying out a comparison of the results previously presented with those in Moschopoulos.35 Moschopoulos considers $Z = Y_1 + \cdots + Y_n$ with $Y_i \sim Ga(\alpha_i, \beta_i)$ independent, $\alpha_i > 0$ and $\beta_i > 0$ being the shape and scale parameters, respectively. By inverting the moment generating function of $Z$, he obtained a single gamma series for the pdf and cdf of $Z$. Moschopoulos's method is implemented in the R package coga (Hu et al.36, 37). In Table 3, some quantiles are given for the illustration previously presented, obtained with the Ha and Provost method ($d = 6$ and $d = 14$) and with Moschopoulos's method. The results in Table 3 suggest that, in order to apply the Ha and Provost method, several values of $d$ must be tried until a given accuracy is reached. For this illustration, Ha and Provost30 proposed $d = 14$.

TABLE 3. Comparison of quantiles
α                          0.01      0.05      0.10      0.50      0.90      0.95      0.99
Ha and Provost (d = 6)     1.92384   4.63033   6.83939   20.3014   49.0916   62.5418   91.4214
Ha and Provost (d = 14)    2.51869   5.04397   7.03708   20.0027   49.3561   61.8384   90.9503
coga                       2.57955   5.04195   7.00921   20.04002  49.41845  61.89996  90.87084
Regarding both methods, the following points should be taken into account:
  • -

    Ha and Provost's approach is based on an initial gamma pdf adjusted by a linear combination of Laguerre polynomials (via a moment-matching method). The parameters of the initial gamma pdf are taken in such a way that its expectation and variance agree with the expectation and variance of $Z$.

  • -

    On the other hand, Moschopoulos's method is based on a single gamma series built from pdfs of gamma distributions with scale parameter $\beta_1 = \min_{1 \le i \le n}\{\beta_i\}$ and different shape parameters. The coefficients are computed by simple recursive relations.

These considerations suggest that the performance of these methods may vary with the values of the parameters $\alpha_i > 0$ and $\beta_i > 0$, $i = 1, \ldots, n$. In this sense, it is worth mentioning that Hu et al.37 point out that the computation in Moschopoulos's method becomes demanding when the variability of the scale parameters is large and the shape parameters are small.

6 CONCLUSIONS

The aim of this paper has been to illustrate certain uses of classical OPS in Statistical Inference. Quite often, the density associated with the problem we are dealing with allows us to consider classical OPS whose properties are well known. For the univariate, one-parameter case, the general method is described in Section 2.

On the one hand, Sections 3 and 4 are devoted to the NEF-QVF. To this family belong distributions such as the normal, Poisson, gamma, binomial, negative binomial, and generalized hyperbolic secant. A plethora of results is listed in Section 4. All of them are based on the fact that, given a srs $X_1, \ldots, X_n$ from a NEF-QVF$(\mu, V(\mu))$, the statistic $S_n = \sum_{i=1}^{n} X_i$ is distributed as a NEF-QVF$(n\mu, nV(\mu))$, and on the properties of the OPS associated with the density of $S_n$. We highlight that the method is general and that the OPS we are dealing with are classical and well known (see Table 2).

On the other hand, Section 5 is devoted to different methods for approximating the distribution of linear combinations of independent chi-square variables (or, equivalently, of gamma distributions). They are based on generalized Laguerre polynomials. Getting to know both methodologies can be useful for obtaining properties in both fields of statistics, since the gamma distribution is one of the models included in the NEF-QVF and the generalized Laguerre polynomials are one of the associated OPS (see Table 2). In our opinion, these two branches of statistics share interesting uses and properties, as has been illustrated throughout the paper.

ACKNOWLEDGEMENTS

The authors would like to thank the anonymous referees for their careful reading of the paper. We really appreciate their helpful suggestions, which contributed to improving its presentation.

    Biographies

    • Inmaculada Barranco-Chamorro. Inmaculada Barranco-Chamorro Ph.D. is Associate Professor at the Department of Statistics and OR, Faculty of Mathematics, University of Seville (SPAIN). Her doctoral thesis at the University of Sevilla, Spain, was entitled “Estimación parametrica en distribuciones no regulares” (Parametric estimation in nonregular distributions). Her major research interests are Statistical Inference, Distribution Theory, Estimation, Influence Analysis and Data Analysis in general.

    • Christos Grentzelos. Christos Grentzelos holds a Master's degree in Applied Mathematical Sciences from the National Technical University of Athens, Greece. His thesis was entitled “Statistical Techniques to Identify and Handle Outliers in Multivariate Data”, supervised by C. Caroni and I. Barranco-Chamorro. This work was carried out at the University of Seville under an Inter-Institutional Agreement of Higher Education Student and Staff Mobility between both universities. His current interests lie in multivariate data analysis and data mining.
