Volume 2008, Issue 1 639145
Research Article
Open Access

A Strong Limit Theorem for Functions of Continuous Random Variables and an Extension of the Shannon-McMillan Theorem

Gaorong Li (Corresponding Author)

School of Finance and Statistics, East China Normal University, Shanghai 200241, China (ecnu.edu.cn)

College of Applied Sciences, Beijing University of Technology, Beijing 100022, China (bjpu.edu.cn)

Shuang Chen

School of Sciences, Hebei University of Technology, Tianjin 300130, China (hebut.edu.cn)

Sanying Feng

College of Mathematics and Science, Luoyang Normal University, Henan 471022, China (lynu.edu.cn)

First published: 11 May 2008
Citations: 3
Academic Editor: Onno Boxma

Abstract

Using the notion of likelihood ratio, the limit properties of sequences of arbitrarily dependent continuous random variables are studied, and a class of strong limit theorems, expressed as inequalities with random bounds, is established for functions of continuous random variables. The Shannon-McMillan theorem is extended to the case of arbitrary continuous information sources. The proofs combine an analytic technique with the tools of Laplace transforms and moment generating functions.

1. Introduction

Let {Xn, n ≥ 1} be a sequence of arbitrary continuous real random variables on the probability space (Ω, ℱ, P) with the joint density function
\[ f_n(x_1, \ldots, x_n), \quad n = 1, 2, \ldots, \tag{1.1} \]
where xi ∈ (−∞, ∞), 1 ≤ i ≤ n. Let Q be another probability measure on ℱ under which {Xn, n ≥ 1} is a sequence of independent random variables with marginal density functions gk(xk) (1 ≤ k ≤ n), and let
\[ \pi_n(x_1, \ldots, x_n) = \prod_{k=1}^{n} g_k(x_k). \tag{1.2} \]
To quantify the deviation of {Xn, n ≥ 1} under the probability measure P from its behavior under Q, we first introduce the following definitions.

Definition 1.1. Let {Xn, n ≥ 1} be a sequence of random variables with joint density (1.1), and let gk(xk) (k = 1, 2, …, n) be the marginal densities appearing in (1.2). Let

(1.3)
In statistical terms, Zn(ω) is called the likelihood ratio, which is of fundamental importance in the theory of statistical hypothesis testing (cf. [1, page 388]; [2, page 483]).

The random variable
(1.4)
is called the asymptotic logarithmic likelihood ratio of {Xn, n ≥ 1} relative to the product of marginal densities (1.2), where ln is the natural logarithm and ω is the sample point. For brevity, we write Xk for Xk(ω).

Although r(ω) is not a proper metric between probability measures, we nevertheless think of it as a measure of “dissimilarity” between their joint distribution fn(x1, … ,xn) and the product πn(x1, … ,xn) of their marginals.

Obviously, r(ω) = 0, a.s. if and only if {Xn,n ≥ 1} are independent.

A stochastic process of fundamental importance in the theory of testing hypotheses is the sequence of likelihood ratios. In view of the above discussion of the asymptotic logarithmic likelihood ratio, it is natural to think of r(ω) as a random measure of how far {Xn, n ≥ 1} is from being independent, that is, of how dependent the variables are. The smaller r(ω) is, the smaller the deviation is (cf. [3–5]).
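As a rough illustration of this reading of r(ω), the following sketch simulates one path and evaluates (1/n) ln of the joint density over the product of marginals. Everything specific here is an assumption made for the example: the ratio is taken as joint over product (the opposite convention would flip the sign), the process is a hypothetical stationary Gaussian AR(1) chain with N(0, 1) marginals, and the function names are invented. For this chain the per-step rate is known to be −(1/2) ln(1 − ρ²), and it is exactly zero when ρ = 0.

```python
import math
import random

def log_normal_pdf(x, mean, var):
    """Natural log of the N(mean, var) density at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

def log_likelihood_ratio_rate(rho, n, seed=0):
    """Return (1/n) * ln[f_n(X_1..X_n) / prod_k g_k(X_k)] along one simulated path.

    Hypothetical setup: X is a stationary Gaussian AR(1) chain with
    correlation rho and N(0, 1) marginals g_k.
    """
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)      # X_1 ~ N(0, 1)
    log_ratio = 0.0              # the k = 1 factors of f_n and prod g_k cancel
    for _ in range(n - 1):
        x_next = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # add the log conditional density (dependent model),
        # subtract the log marginal density (independent reference)
        log_ratio += log_normal_pdf(x_next, rho * x, 1.0 - rho ** 2)
        log_ratio -= log_normal_pdf(x_next, 0.0, 1.0)
        x = x_next
    return log_ratio / n

rate_dependent = log_likelihood_ratio_rate(rho=0.6, n=200_000)
rate_independent = log_likelihood_ratio_rate(rho=0.0, n=200_000)
expected_rate = -0.5 * math.log(1.0 - 0.6 ** 2)   # about 0.223
```

Under these assumptions, a larger |ρ| gives a larger rate, which matches the interpretation of r(ω) as a degree of dependence.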

In [3], strong deviation theorems for discrete random variables were discussed by using the generating function method. Later, the Laplace transform approach to the study of strong limit theorems was first proposed by Liu [4]. Yang [6] further studied the limit properties of Markov chains indexed by a homogeneous tree through the analytic technique. Many comprehensive results may be found in Liu [7]. The purpose of this paper is to establish a class of strong deviation theorems, expressed as inequalities with random bounds, for functions of arbitrary continuous random variables by combining the analytic technique with the method of Laplace transform, and to extend the strong deviation theorems to the differential entropy of arbitrarily dependent continuous information sources in more general settings.

Definition 1.2. Let {hn(xn), n ≥ 1} be a sequence of nonnegative Borel measurable functions defined on ℝ. The Laplace transform of the random variable hn(Xn) on the probability space (Ω, ℱ, Q) is defined by

(1.5)
where EQ denotes the expectation under Q.
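To make the definition concrete, here is a minimal numerical sketch. It assumes the usual sign convention E_Q[exp(−s h(X))] for the Laplace transform, and it uses a hypothetical choice h(x) = x² with a standard normal density under Q, for which the closed form (1 + 2s)^(−1/2) is known when s > −1/2. The helper names are invented for the example.

```python
import math

def laplace_transform(h, g, s, lo=-12.0, hi=12.0, steps=200_000):
    """Midpoint-rule approximation of E_Q[exp(-s*h(X))] = integral of exp(-s*h(x)) * g(x) dx."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += math.exp(-s * h(x)) * g(x)
    return total * dx

def std_normal_pdf(x):
    """Density of N(0, 1), used here as the marginal density under Q."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def h(x):
    """Hypothetical nonnegative Borel function h(x) = x^2."""
    return x * x

values = {s: laplace_transform(h, std_normal_pdf, s) for s in (-0.2, 0.0, 0.5)}
closed_form = {s: (1.0 + 2.0 * s) ** -0.5 for s in values}
```

In this example the transform is finite only for s > −1/2, which suggests why it is natural to restrict s to a neighborhood (−s0, s0) of the origin.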

We have the following assumptions in this paper.
(1) Assume that there exists s0 ∈ (0, ∞) such that
(1.6)
(2) Assume that M > 0 is a constant satisfying
(1.7)

To prove our main results, we first give a lemma that plays a central role in the proofs.

Lemma 1.3. Let fn(x1, …, xn) and gn(x1, …, xn) be two probability density functions on (Ω, ℱ, P), and let

(1.8)
then
(1.9)

Proof. By [8], {Tn, n ≥ 1} is a nonnegative martingale with ETn = 1. Hence, by the Doob martingale convergence theorem, there exists an integrable random variable T(ω) such that Tn → T a.s., and (1.9) follows.

2. Main Results

Theorem 2.1. Let {Xn,n ≥ 1}, Zn(ω), r(ω), fn(s) be defined as before, and under the assumptions of (1) and (2), let

(2.1)
Then
(2.2)
(2.3)
where
(2.4)
(2.5)
(2.6)

Remark 2.2. Let

(2.7)
then
(2.8)

Proof. Let s be an arbitrary real number in (−s0,s0), let

(2.9)
then , and let
(2.10)
Therefore, qn(s; x1, …, xn) is an n-variate probability density function. Let
(2.11)
By Lemma 1.3, there exists a set A(s) such that P(A(s)) = 1, so we have
(2.12)
By (1.3), (2.9), (2.11), and (2.12), we have
(2.13)
Therefore,
(2.14)
By (2.13) and (1.4), the property of the superior limit
(2.15)
and the inequality ln x ≤ x − 1 (x > 0), we have
(2.16)
By the inequality 0 ≤ e^x − 1 − x ≤ (1/2) x^2 e^{|x|}, which can be found in [9], we have
(2.17)
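The elementary inequality 0 ≤ e^x − 1 − x ≤ (1/2) x² e^{|x|} used in this step can be spot-checked numerically. The sketch below is only a sanity check on a finite grid, not a proof; the grid and the function name are arbitrary choices made for the example.

```python
import math

def bounds_hold(x):
    """Check 0 <= e^x - 1 - x <= (1/2) * x^2 * e^{|x|} at a single point x."""
    middle = math.expm1(x) - x                 # e^x - 1 - x, computed stably near 0
    upper = 0.5 * x * x * math.exp(abs(x))
    return 0.0 <= middle <= upper

# grid over [-5, 5] with step 0.01; x = 0 is hit exactly
xs = [(i - 500) / 100 for i in range(1001)]
all_ok = all(bounds_hold(x) for x in xs)
```

The upper bound can also be seen from the series e^x − 1 − x = Σ_{k≥2} x^k/k! together with 2/k! ≤ 1/(k − 2)! for k ≥ 2.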
By (2.5) and (2.17), we have
(2.18)
It is easy to see that φ(x) = t^x x^2 (t > 1) attains its largest value φ(−2/ln t) = 4e^{−2}/(ln t)^2 on the interval (−∞, 0], and that φ(x) = t^x x^2 (0 < t < 1) attains its largest value φ(−2/ln t) = 4e^{−2}/(ln t)^2 on the interval [0, ∞). Hence we have
(2.19)
(2.20)
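The extremal value of φ(x) = t^x x² used here follows from φ′(x) = t^x (x² ln t + 2x) = 0, which gives x = −2/ln t and φ(−2/ln t) = 4e^{−2}/(ln t)². The sketch below compares this closed form with a brute-force grid maximum; the grid width, step count, and function names are arbitrary choices made for the check.

```python
import math

def phi(t, x):
    """phi(x) = t**x * x**2 for a fixed base t > 0."""
    return t ** x * x * x

def claimed_max(t):
    """Closed-form maximum 4 * e^{-2} / (ln t)^2, attained at x = -2 / ln t."""
    return 4.0 * math.exp(-2.0) / math.log(t) ** 2

def grid_max(t, steps=200_000, width=50.0):
    """Largest grid value of phi on [-width, 0] if t > 1, or on [0, width] if 0 < t < 1."""
    sign = -1.0 if t > 1.0 else 1.0
    return max(phi(t, sign * width * i / steps) for i in range(steps + 1))

# relative gap between the grid maximum and the closed form for a few bases
relative_gaps = {
    t: abs(grid_max(t) - claimed_max(t)) / claimed_max(t) for t in (2.0, 0.5, 1.2)
}
```

For bases t close to 1 the maximizer −2/ln t moves far from the origin, so the grid width would need to be enlarged accordingly.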
Let 0 < s < s0 in (2.18), by (2.19) and (2.1), we obtain
(2.21)
Dividing the two sides of (2.21) by −s, we obtain
(2.22)
By (2.14) and 0 < s < s0, obviously ϕ(s, r(ω)) ≤ 0, hence α(r(ω)) ≤ 0. Let Q+ be the set of rational numbers in the interval (0, s0), and let A* = ∩_{s ∈ Q+} A(s); then P(A*) = 1, since A* is a countable intersection of events of probability one. By (2.22), we have
(2.23)
It is easy to see that ϕ(s, x) is a continuous function of s on the interval (0, s0). For each ω ∈ A* ∩ A(0) (0 ≤ r(ω) < ∞), take sn(ω) ∈ Q+, n = 1, 2, …, such that
(2.24)
By (2.23), (2.24), and (2.8), we have
(2.25)
Since P(A* ∩ A(0)) = 1, (2.2) follows from (2.25).

Let −s0 < s < 0 in (2.18), by (2.20) and (2.1), we have

(2.26)
By (2.14) and −s0 < s < 0, obviously ϕ(s, r(ω)) ≥ 0, hence β(r(ω)) ≥ 0. Let Q− be the set of rational numbers in the interval (−s0, 0), and let A_* = ∩_{s ∈ Q−} A(s); then P(A_*) = 1. By (2.26), we have
(2.27)
It is clear that ϕ(s, x) is a continuous function of s on the interval (−s0, 0). For each ω ∈ A_* ∩ A(0) (0 ≤ r(ω) < ∞), take λn(ω) ∈ Q−, n = 1, 2, …, such that
(2.28)
By (2.27) and (2.28), we have
(2.29)
Since P(A_* ∩ A(0)) = 1, (2.3) follows from (2.29).

By (2.4), (2.5), and (2.14), if x > 0, we have

(2.30)
If x = 0, we have
(2.31)
Noticing that β(x) ≥ 0, (x ≥ 0), (2.6) follows from (2.30) and (2.31).

Corollary 2.3. If P = Q, or if {Xn, n ≥ 1} is a sequence of independent random variables, then under assumptions (1) and (2),

(2.32)

Proof. In this case, fn(x1, …, xn) = πn(x1, …, xn) and r(ω) = 0 a.s. Hence, (2.32) follows directly from (2.2) and (2.3).

3. An Extension of the Shannon-McMillan Theorem

To make the results of this section easier to follow, we first introduce some definitions from information theory.

Let {Xn, n ≥ 1} be a sequence produced by an arbitrary continuous information source on the probability space (Ω, ℱ, P) with the joint density function
\[ f_n(x_1, \ldots, x_n), \quad n = 1, 2, \ldots. \tag{3.1} \]
For brevity, we assume fn > 0 and write Xk for Xk(ω). Let
(3.2)
where ω is the sample point; pn(ω) is called the sample entropy or the entropy density of {Xk, 1 ≤ k ≤ n}. Also let Q be another probability measure on ℱ with the density function
\[ q_n(x_1, \ldots, x_n), \quad n = 1, 2, \ldots. \tag{3.3} \]
Let
(3.4)
Ln(ω), L(ω), and D(fn ‖ qn) are called the sample relative entropy, the sample relative entropy rate, and the relative entropy, respectively, relative to the reference density function qn(x1, …, xn). Each of them measures the deviation between the true joint density function fn(x1, …, xn) and the reference density function qn(x1, …, xn) (cf. [10, pages 12, 18]).

A question of importance in information theory is the study of the limit properties of the relative entropy density fn(ω). Since Shannon's initial work was published (cf. [11]), this question has been investigated extensively (e.g., cf. [12–20]).

In this paper, a class of small deviation theorems (i.e., strong limit theorems represented by inequalities) is established by using the analytic technique, and an extension of the Shannon-McMillan theorem to arbitrarily dependent continuous information sources is given. In particular, an approach that applies the Laplace transform to the study of strong deviation theorems on the differential entropy is proposed.

Let hk(xk) = −ln gk(xk) (1 ≤ k ≤ n, n = 1, 2, …) in (1.5); then we give the following definitions.

Definition 3.1. The Laplace transform of −ln gk(xk) is defined by

(3.5)

Definition 3.2. The differential entropy for continuous random variables Xk is defined by

(3.6)
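As a small worked example, the sketch below evaluates a differential entropy numerically, assuming the standard definition h(X) = −∫ g(x) ln g(x) dx for a density g. The Gaussian density with σ = 2 is a hypothetical choice made because its entropy has the closed form (1/2) ln(2πeσ²); the function names are invented for the example.

```python
import math

def differential_entropy(pdf, lo, hi, steps=200_000):
    """Midpoint-rule approximation of -integral of pdf(x) * ln pdf(x) dx over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        p = pdf(lo + (i + 0.5) * dx)
        if p > 0.0:                      # treat 0 * ln 0 as 0
            total -= p * math.log(p) * dx
    return total

def normal_pdf(x, sigma=2.0):
    """Density of N(0, sigma^2); sigma = 2 is an arbitrary illustrative value."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

numeric = differential_entropy(normal_pdf, -40.0, 40.0)
closed_form = 0.5 * math.log(2.0 * math.pi * math.e * 2.0 ** 2)   # about 2.112
```

Unlike discrete entropy, the differential entropy can be negative, for example for a Gaussian density with sufficiently small σ.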

In the following theorem, let {Xn, n ≥ 1} be independent random variables with respect to Q, so that the reference density function is qn(x1, …, xn) = ∏_{k=1}^{n} gk(xk), and let hk(Xk) = −ln gk(Xk) (1 ≤ k ≤ n) in Theorem 2.1.

Theorem 3.3. Let {Xn,n ≥ 1}, Ln(ω), L(ω), fn(s) be given as above, and under the assumptions of (1) and (2), let

(3.7)
Then
(3.8)
where
(3.9)
(3.10)
(3.11)

Remark 3.4. Let

(3.12)
then
(3.13)

Corollary 3.5. Let pn(ω) be defined by (3.2). Under the conditions of Theorem 3.3,

(3.14)
where h(X1, … ,Xn) = E[−ln fn(X1, … ,Xn)] is the differential entropy for (X1, … ,Xn), and
(3.15)
where α(L(ω)) and β(L(ω)) are defined by (3.9)–(3.13).

Corollary 3.6. If P = Q, or if {Xn, n ≥ 1} are independent random variables, and there exists s0 > 0 such that (2.1) holds, then

(3.16)
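For intuition about the independent case, the following sketch simulates the simplest special case only: independent and identically distributed N(0, 1) variables, for which the classical asymptotic equipartition property says that the sample entropy −(1/n) Σ ln g(Xk) converges almost surely to the differential entropy (1/2) ln(2πe). The distribution, sample size, and function name are assumptions chosen for the illustration, which is not meant to reproduce the exact statement of the corollary.

```python
import math
import random

def sample_entropy(n, sigma=1.0, seed=1):
    """Return -(1/n) * sum_k ln g(X_k) for n i.i.d. N(0, sigma^2) draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, sigma)
        # -ln g(x) for the N(0, sigma^2) density g
        total += 0.5 * math.log(2.0 * math.pi * sigma ** 2) + x * x / (2.0 * sigma ** 2)
    return total / n

estimate = sample_entropy(200_000)
entropy = 0.5 * math.log(2.0 * math.pi * math.e)   # about 1.419
```

With 200,000 draws the estimate typically agrees with the entropy to about two decimal places; the gap shrinks at the usual 1/√n rate.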

Acknowledgments

This research is supported by the National Natural Science Foundation of China (Grants nos. 10671052 and 10571008), the Natural Science Foundation of Beijing (Grant no. 1072004), Funding Project for Academic Human Resources Development in Institutions of Higher Learning Under the Jurisdiction of Beijing Municipality, the Basic Research and Frontier Technology Foundation of Henan (Grant no. 072300410090), and the Natural Science Research Project of Henan (Grant no. 2008B110009). The authors would like to thank the editor and the referees for helpful comments, which helped to improve an earlier version of the paper.
