Volume 2008, Issue 1 639145

A Strong Limit Theorem for Functions of Continuous Random Variables and an Extension of the Shannon-McMillan Theorem

Gaorong Li (corresponding author), School of Finance and Statistics, East China Normal University, Shanghai 200241, China (ecnu.edu.cn); College of Applied Sciences, Beijing University of Technology, Beijing 100022, China (bjpu.edu.cn)

Shuang Chen, School of Sciences, Hebei University of Technology, Tianjin 300130, China (hebut.edu.cn)

Sanying Feng, College of Mathematics and Science, Luoyang Normal University, Henan 471022, China (lynu.edu.cn)

First published: 11 May 2008

https://doi.org/10.1155/2008/639145

Academic Editor: Onno Boxma

Abstract

By means of the notion of likelihood ratio, the limit properties of sequences of arbitrarily dependent continuous random variables are studied, and a class of strong limit theorems, represented by inequalities with random bounds, for functions of continuous random variables is established. The Shannon-McMillan theorem is extended to the case of arbitrary continuous information sources. The proofs apply an analytic technique together with Laplace transforms and moment generating functions.

1. Introduction

Let {X_n, n ≥ 1} be a sequence of arbitrary continuous real random variables on the probability space (Ω,ℱ,P) with the joint density function

(1.1)

where x_i ∈ (−∞,∞), 1 ≤ i ≤ n. Let Q be another probability measure on ℱ such that {X_n, n ≥ 1} is a sequence of independent random variables on the probability space (Ω,ℱ,Q) with marginal density functions g_k(x_k) (1 ≤ k ≤ n), and let

(1.2)

To quantify the deviation between the behavior of {X_n, n ≥ 1} under the probability measures P and Q, we first introduce the following definitions.

Definition 1.1. Let {X_n, n ≥ 1} be a sequence of random variables with joint distribution (1.1), and let g_k(x_k) (k = 1, 2, …,n) be defined by (1.2). Let

(1.3)

In statistical terms, Z_n(ω) is called the likelihood ratio, which is of fundamental importance in the theory of testing statistical hypotheses (cf. [1, page 388]; [2, page 483]).

The random variable

(1.4)

is called the asymptotic logarithmic likelihood ratio of {X_n, n ≥ 1} relative to the product of marginal densities (1.2), where ln is the natural logarithm and ω is the sample point. For the sake of brevity, we denote X_k(ω) by X_k.

Although r(ω) is not a proper metric between probability measures, we nevertheless think of it as a measure of “dissimilarity” between the joint distribution f_n(x₁, … ,x_n) and the product π_n(x₁, … ,x_n) of the marginals.

Obviously, r(ω) = 0, a.s. if and only if {X_n, n ≥ 1} are independent.

The sequence of likelihood ratios is a stochastic process of fundamental importance in the theory of testing hypotheses. In view of the above discussion of the asymptotic logarithmic likelihood ratio, it is natural to think of r(ω) as a measure of how far {X_n, n ≥ 1} is from being independent, that is, of how dependent the variables are. The smaller r(ω) is, the smaller the deviation is (cf. [3–5]).
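The interpretation of r(ω) as a measure of dependence can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the paper's argument: it takes a stationary Gaussian AR(1) sequence (a hypothetical model chosen for convenience) whose marginals g_k are all N(0, 1), and it assumes that the quantity of interest is the normalized log ratio (1/n) ln[f_n(X₁, … ,X_n)/∏ g_k(X_k)], since the displayed forms of (1.3) and (1.4) are not reproduced here. The function names are hypothetical.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Natural log of the N(mean, var) density at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def empirical_log_likelihood_ratio(rho, n, seed=0):
    """Return (1/n) * ln[f_n(X_1..X_n) / prod_k g_k(X_k)] along one simulated path.

    Assumed model (not from the paper): X_1 ~ N(0, 1) and
    X_k = rho * X_{k-1} + sqrt(1 - rho^2) * eps_k with eps_k i.i.d. N(0, 1),
    so every marginal density g_k is N(0, 1).
    """
    rng = random.Random(seed)
    x_prev = rng.gauss(0.0, 1.0)
    total = 0.0  # the k = 1 factor of f_n equals g_1, so it contributes 0
    for _ in range(n - 1):
        x = rho * x_prev + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        log_joint_factor = log_normal_pdf(x, rho * x_prev, 1.0 - rho * rho)
        log_marginal = log_normal_pdf(x, 0.0, 1.0)
        total += log_joint_factor - log_marginal
        x_prev = x
    return total / n


if __name__ == "__main__":
    for rho in (0.0, 0.4, 0.8):
        value = empirical_log_likelihood_ratio(rho, n=50_000)
        limit = -0.5 * math.log(1.0 - rho * rho)
        print(f"rho={rho}: empirical={value:.4f}, -0.5*ln(1-rho^2)={limit:.4f}")
```

In this assumed model the normalized log ratio is identically zero when rho = 0 (independence) and grows with |rho|, which matches the reading of r(ω) as a measure of how far the sequence is from independence.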

In [3], strong deviation theorems for discrete random variables were discussed by using the generating function method. Liu [4] later introduced the Laplace transform approach to the study of strong limit theorems. Yang [6] further studied the limit properties of Markov chains indexed by a homogeneous tree through the analytic technique. A comprehensive treatment may be found in Liu [7]. The purpose of this paper is to establish a class of strong deviation theorems, represented by inequalities with random bounds, for functions of arbitrary continuous random variables by combining the analytic technique with the method of Laplace transform, and to extend the strong deviation theorems to the differential entropy of arbitrarily dependent continuous information sources in more general settings.

Definition 1.2. Let {h_n(x_n), n ≥ 1} be a sequence of nonnegative Borel measurable functions defined on ℛ. The Laplace transform of the random variable h_n(X_n) on the probability space (Ω,ℱ,Q) is defined by

(1.5)

where E_Q denotes the expectation under Q.
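As a concrete instance of Definition 1.2, the following sketch approximates the Laplace transform of h(X) under Q by numerical integration. It assumes the usual convention E_Q[exp(−s h(X))] = ∫ exp(−s h(x)) g(x) dx; the sign convention of (1.5) is not visible in the text, so this is an assumption. The choices h(x) = x and g equal to the Exp(1) density are illustrative, and all function names are hypothetical.

```python
import math


def laplace_transform(h, g, s, lower, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(-s*h(x)) * g(x) over [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        total += math.exp(-s * h(x)) * g(x)
    return total * width


def exp_density(x):
    """Density of the Exp(1) distribution on [0, inf) (illustrative choice of g)."""
    return math.exp(-x)


def identity(x):
    """Illustrative nonnegative function h(x) = x on [0, inf)."""
    return x


if __name__ == "__main__":
    # For this choice the transform is 1 / (1 + s) whenever s > -1,
    # so it is finite on (-s0, s0) for any s0 in (0, 1).
    for s in (-0.5, 0.0, 0.5):
        approx = laplace_transform(identity, exp_density, s, 0.0, 60.0)
        print(f"s={s}: numeric={approx:.6f}, closed form={1.0 / (1.0 + s):.6f}")
```

For this example the transform is finite in a neighborhood of s = 0, which is the kind of condition that an assumption such as (1) typically requires.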

We have the following assumptions in this paper.

(1)
Assume that there exists s₀ ∈ (0, ∞) such that
(1.6)
(2)
Assume M > 0 is a constant, satisfying
(1.7)

In order to prove our main results, we first give a lemma, which plays a central role in the proofs.

Lemma 1.3. Let f_n(x₁, … ,x_n), g_n(x₁, … ,x_n) be two probability functions on (Ω,ℱ,P), let

(1.8)

then

(1.9)

Proof. By [8], {T_n,ℱ, n ≥ 1} is a nonnegative martingale with ET_n = 1. By the Doob martingale convergence theorem, there exists an integrable random variable T_∞(ω) such that T_n → T_∞ a.s., and (1.9) follows.
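The two steps used in this proof can be written out as follows. This is a sketch under the assumption that T_n is the ratio g_n/f_n evaluated along the sequence and that (1.9) is an upper bound on the growth rate of ln T_n; the displayed forms of (1.8) and (1.9) are not reproduced here, so both readings are assumptions.

```latex
% Assumption: T_n(\omega) = g_n(X_1,\dots,X_n) / f_n(X_1,\dots,X_n) with f_n > 0.
\begin{align*}
E_P[T_n]
  &= \int_{\mathbb{R}^n} \frac{g_n(x_1,\dots,x_n)}{f_n(x_1,\dots,x_n)}\,
     f_n(x_1,\dots,x_n)\, dx_1 \cdots dx_n
   = \int_{\mathbb{R}^n} g_n(x_1,\dots,x_n)\, dx_1 \cdots dx_n = 1, \\
T_n \to T_\infty < \infty \ \text{a.s.}
  &\;\Longrightarrow\;
  \limsup_{n\to\infty} \frac{1}{n} \ln T_n \le 0 \quad \text{a.s.}
\end{align*}
```

The implication in the second line holds because an a.s. convergent sequence with a finite limit is a.s. bounded above, so ln T_n is eventually bounded above by a finite random constant.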

2. Main Results

Theorem 2.1. Let {X_n, n ≥ 1}, Z_n(ω), r(ω), f_n(s) be defined as before, and under the assumptions of (1) and (2), let

(2.1)

Then

(2.2)

(2.3)

where

(2.4)

(2.5)

(2.6)

Remark 2.2. Let

(2.7)

then

(2.8)

Proof. Let s be an arbitrary real number in (−s₀,s₀), let

(2.9)

then

, and let

(2.10)

Therefore, q_n(s; x₁, … ,x_n) is an n-dimensional probability density function. Let

(2.11)

By Lemma 1.3, there exists a set A(s) such that P(A(s)) = 1, so we have

(2.12)

By (1.3), (2.9), (2.11), and (2.12), we have

(2.13)

Therefore,

(2.14)

By (2.13) and (1.4), the property of the superior limit

(2.15)

and the inequality ln x ≤ x − 1 (x > 0), we have

(2.16)

By the inequality 0 ≤ e^x − 1 − x ≤ (1/2)x²e^|x|, which can be found in [9], we have

(2.17)
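The elementary inequality quoted from [9] also follows from Taylor's theorem. A short derivation, given as a sketch in which θ denotes the intermediate point of the Lagrange remainder:

```latex
\begin{align*}
e^{x} &= 1 + x + \tfrac{1}{2}\, x^{2} e^{\theta x}
  \quad \text{for some } \theta = \theta(x) \in (0,1), \\
0 \;\le\; e^{x} - 1 - x &= \tfrac{1}{2}\, x^{2} e^{\theta x}
  \;\le\; \tfrac{1}{2}\, x^{2} e^{|x|},
  \quad \text{since } \theta x \le |x| .
\end{align*}
```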

By (2.5) and (2.17), we have

(2.18)

It is easy to see that φ(x) = t^x x² (t > 1) attains its largest value φ(−2/ln t) = 4e⁻²/(ln t)² on the interval (−∞,0], and that φ(x) = t^x x² (0 < t < 1) attains its largest value φ(−2/ln t) = 4e⁻²/(ln t)² on the interval [0, ∞). Hence we have

(2.19)

(2.20)
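The maximization of φ(x) = t^x x² used for (2.19) and (2.20) can be checked numerically. Setting φ′(x) = t^x x (x ln t + 2) = 0 gives the critical point x = −2/ln t, and the sketch below compares the closed-form value 4e⁻²/(ln t)² with a grid search. The function names and the grid ranges are illustrative choices, not part of the paper.

```python
import math


def phi(t, x):
    """phi(x) = t**x * x**2, the function maximized in the text."""
    return t ** x * x * x


def phi_max_closed_form(t):
    """Value 4 * e^{-2} / (ln t)^2, attained at x = -2 / ln t (t > 0, t != 1)."""
    return 4.0 * math.exp(-2.0) / math.log(t) ** 2


def phi_max_on_grid(t, lower, upper, steps=200_000):
    """Largest value of phi(t, .) on an evenly spaced grid over [lower, upper]."""
    width = (upper - lower) / steps
    return max(phi(t, lower + i * width) for i in range(steps + 1))


if __name__ == "__main__":
    # t > 1: maximum over (-inf, 0], approximated on [-40, 0].
    print(phi_max_on_grid(3.0, -40.0, 0.0), phi_max_closed_form(3.0))
    # 0 < t < 1: maximum over [0, inf), approximated on [0, 60].
    print(phi_max_on_grid(0.5, 0.0, 60.0), phi_max_closed_form(0.5))
```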

Let 0 < s < s₀ in (2.18), by (2.19) and (2.1), we obtain

(2.21)

Dividing the two sides of (2.21) by −s, we obtain

(2.22)

By (2.14) and 0 < s < s₀, obviously ϕ(s,r(ω)) ≤ 0, hence α(r(ω)) ≤ 0. Let Q⁺ be the set of rational numbers in the interval (0, s₀), and let

A^* = ⋂_{s∈Q⁺} A(s),

then P(A^*) = 1, since A^* is a countable intersection of sets of probability one. By (2.22), we have

(2.23)

It is easy to see that ϕ(s,x) is a continuous function with respect to s on the interval (0, s₀). For each ω ∈ A^*∩A(0) (0 ≤ r(ω) < ∞), take s_n(ω) ∈ Q⁺, n = 1, 2, …, such that

(2.24)

By (2.23), (2.24), and (2.8), we have

(2.25)

Since P(A^*∩A(0)) = 1, (2.2) follows from (2.25).

Let −s₀ < s < 0 in (2.18), by (2.20) and (2.1), we have

(2.26)

By (2.14) and −s₀ < s < 0, obviously ϕ(s,r(ω)) ≥ 0, hence β(r(ω)) ≥ 0. Let Q⁻ be the set of rational numbers in the interval (−s₀,0), and let

A_* = ⋂_{s∈Q⁻} A(s),

then P(A_*) = 1. By (2.26), we have

(2.27)

It is clear that ϕ(s,x) is a continuous function with respect to s on the interval (−s₀,0). For each ω ∈ A_*∩A(0) (0 ≤ r(ω) < ∞), take λ_n(ω) ∈ Q⁻, n = 1, 2, …, such that

(2.28)

By (2.27) and (2.28), we have

(2.29)

Since P(A_*∩A(0)) = 1, (2.3) follows from (2.29).

By (2.4), (2.5), and (2.14), if x > 0, we have

(2.30)

If x = 0, we have

(2.31)

Noticing that β(x) ≥ 0, (x ≥ 0), (2.6) follows from (2.30) and (2.31).

Corollary 2.3. If P = Q, or if {X_n, n ≥ 1} is a sequence of independent random variables, then under assumptions (1) and (2),

(2.32)

Proof. In this case, f_n(x₁, … ,x_n) = ∏_{k=1}^n g_k(x_k), and r(ω) = 0 a.s. Hence, (2.32) follows directly from (2.2) and (2.3).

3. An Extension of the Shannon-McMillan Theorem

To make the results of this section easier to follow, we first introduce some definitions from information theory.

Let {X_n, n ≥ 1} be a sequence produced by an arbitrary continuous information source on the probability space (Ω,ℱ,P) with the joint density function

(3.1)

For the sake of brevity, we assume f_n > 0 and write X_k for X_k(ω). Let

(3.2)

where ω is the sample point, p_n(ω) is called the sample entropy or the entropy density of {X_k, 1 ≤ k ≤ n}. Also let Q be another probability measure on ℱ with the density function

(3.3)

Let

(3.4)

L_n(ω), L(ω), and D(f_n∥q_n) are called the sample relative entropy, the sample relative entropy rate, and the relative entropy, respectively, relative to the reference density function q_n(x₁, … ,x_n). Indeed, they are all measures of the deviation between the true joint density function f_n(x₁, … ,x_n) and the reference density function q_n(x₁, … ,x_n) (cf. [10, pages 12, 18]).

A question of importance in information theory is the study of the limit properties of the entropy density p_n(ω). Since Shannon's initial work was published (cf. [11]), there has been a great deal of investigation of this question (e.g., cf. [12–20]).

In this paper, a class of small deviation theorems (i.e., strong limit theorems represented by inequalities) is established by using the analytic technique, and an extension of the Shannon-McMillan theorem to arbitrarily dependent continuous information sources is given. In particular, an approach that applies the Laplace transform to the study of strong deviation theorems on the differential entropy is proposed.
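To make the relative entropy concrete in the simplest case n = 1, the sketch below evaluates D(f∥q) = ∫ f(x) ln(f(x)/q(x)) dx numerically for two Gaussian densities and compares it with the standard closed form for Gaussians. The standard definition of relative entropy is assumed here, since the display (3.4) is not reproduced; the Gaussian choice and all function names are illustrative.

```python
import math


def log_normal_pdf(x, mean, std):
    """Natural log of the N(mean, std^2) density at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def relative_entropy_numeric(mean_f, std_f, mean_q, std_q, half_width=12.0, steps=200_000):
    """Midpoint-rule approximation of D(f || q) for two normal densities.

    The integral is truncated to mean_f +/- half_width * std_f, where f carries
    essentially all of its mass.
    """
    lower = mean_f - half_width * std_f
    width = 2.0 * half_width * std_f / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        log_f = log_normal_pdf(x, mean_f, std_f)
        log_q = log_normal_pdf(x, mean_q, std_q)
        total += math.exp(log_f) * (log_f - log_q)
    return total * width


def relative_entropy_closed_form(mean_f, std_f, mean_q, std_q):
    """Closed form of D(N(mean_f, std_f^2) || N(mean_q, std_q^2))."""
    return (math.log(std_q / std_f)
            + (std_f ** 2 + (mean_f - mean_q) ** 2) / (2.0 * std_q ** 2)
            - 0.5)


if __name__ == "__main__":
    print(relative_entropy_numeric(0.0, 1.0, 1.0, 2.0))
    print(relative_entropy_closed_form(0.0, 1.0, 1.0, 2.0))
```

The value is zero when the two densities coincide and positive otherwise, which is why these quantities serve as measures of deviation even though they are not metrics.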

Let h_k(x_k) = −ln g_k(x_k) (1 ≤ k ≤ n, n = 1, 2, …) in (1.5), then we give the following definitions.

Definition 3.1. The Laplace transform of −ln g_k(x_k) is defined by

(3.5)

Definition 3.2. The differential entropy for continuous random variables X_k is defined by

(3.6)
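As an illustration of Definition 3.2, the sketch below computes the differential entropy h = −∫ g(x) ln g(x) dx numerically for a normal density and compares it with the known value ½ ln(2πeσ²). The standard definition of differential entropy is assumed, since the display (3.6) is not reproduced; the normal density and the function names are illustrative choices.

```python
import math


def differential_entropy(log_density, lower, upper, steps=200_000):
    """Midpoint-rule approximation of h = -integral of g(x) * ln g(x) over [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        log_g = log_density(x)
        total -= math.exp(log_g) * log_g
    return total * width


def make_log_normal_density(std):
    """Return the map x -> ln g(x) for the N(0, std^2) density (illustrative choice)."""
    def log_density(x):
        return -0.5 * math.log(2.0 * math.pi * std * std) - x * x / (2.0 * std * std)
    return log_density


if __name__ == "__main__":
    std = 2.0
    numeric = differential_entropy(make_log_normal_density(std), -30.0, 30.0)
    closed_form = 0.5 * math.log(2.0 * math.pi * math.e * std * std)
    print(numeric, closed_form)
```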

In the following theorem, let {X_n, n ≥ 1} be independent random variables with respect to Q, so that the reference density function is q_n(x₁, … ,x_n) = ∏_{k=1}^n g_k(x_k), and let h_k(X_k) = −ln g_k(X_k) (1 ≤ k ≤ n) in Theorem 2.1.

Theorem 3.3. Let {X_n, n ≥ 1}, L_n(ω), L(ω), f_n(s) be given as above, and under the assumptions of (1) and (2), let

(3.7)

Then

(3.8)

where

(3.9)

(3.10)

(3.11)

Remark 3.4. Let

(3.12)

then

(3.13)

Corollary 3.5. Let p_n(ω) be defined by (3.2). Under the conditions of Theorem 3.3,

(3.14)

where h(X₁, … ,X_n) = E[−ln f_n(X₁, … ,X_n)] is the differential entropy for (X₁, … ,X_n), and

(3.15)

where α(L(ω)) and β(L(ω)) are defined by (3.9)–(3.13).

Corollary 3.6. If P = Q, or {X_n, n ≥ 1} are independent random variables, and there exists s₀ > 0, such that (2.1) holds, then

(3.16)
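The independent case can be illustrated by simulation. The sketch below assumes that the sample entropy in (3.2) has the usual form p_n(ω) = −(1/n) ln f_n(X₁, … ,X_n) and that, in the independent case, it is compared with the average of the marginal differential entropies; these readings are assumptions, since the displays are not reproduced here. The i.i.d. N(0, 1) source and the function names are illustrative.

```python
import math
import random


def sample_entropy_iid_normal(n, std=1.0, seed=0):
    """Return -(1/n) * ln f_n(X_1..X_n) for i.i.d. N(0, std^2) draws.

    Under independence f_n is the product of the marginal densities, so
    ln f_n is the sum of the marginal log densities.
    """
    rng = random.Random(seed)
    log_joint = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, std)
        log_joint += -0.5 * math.log(2.0 * math.pi * std * std) - x * x / (2.0 * std * std)
    return -log_joint / n


def normal_differential_entropy(std=1.0):
    """Differential entropy of N(0, std^2): 0.5 * ln(2 * pi * e * std^2)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * std * std)


if __name__ == "__main__":
    for n in (100, 10_000, 50_000):
        gap = sample_entropy_iid_normal(n) - normal_differential_entropy()
        print(f"n={n}: sample entropy minus differential entropy = {gap:+.5f}")
```

In this example the gap shrinks as n grows, in line with the classical asymptotic equipartition property for independent sources.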

Acknowledgments

This research is supported by the National Natural Science Foundation of China (Grants nos. 10671052 and 10571008), the Natural Science Foundation of Beijing (Grant no. 1072004), Funding Project for Academic Human Resources Development in Institutions of Higher Learning Under the Jurisdiction of Beijing Municipality, the Basic Research and Frontier Technology Foundation of Henan (Grant no. 072300410090), and the Natural Science Research Project of Henan (Grant no. 2008B110009). The authors would like to thank the editor and the referees for helpful comments, which helped to improve an earlier version of the paper.

References

1 Laha R. G. and Rohatgi V. K., Probability Theory, 1979, John Wiley & Sons, New York, NY, USA, Wiley Series in Probability and Mathematical Statistics, MR534143, ZBL0409.60001.
2 Billingsley P., Probability and Measure, 1986, 2nd edition, John Wiley & Sons, New York, NY, USA, Wiley Series in Probability and Mathematical Statistics, MR830424, ZBL0649.60001.
3 Liu W., Relative entropy densities and a class of limit theorems of the sequence of m-valued random variables, The Annals of Probability. (1990) 18, no. 2, 829–839, MR1055435, https://doi.org/10.1214/aop/1176990860, ZBL0711.60026.
4 Liu W., A class of strong deviation theorems and Laplace transform methods, Chinese Science Bulletin. (1998) 43, no. 10, 1036–1041, MR1667164.
5 Liu W. and Wang Y., A strong limit theorem expressed by inequalities for the sequences of absolutely continuous random variables, Hiroshima Mathematical Journal. (2002) 32, no. 3, 379–387, MR1953730, ZBL1016.60035, https://doi.org/10.32917/hmj/1151007488.
6 Yang W., Some limit properties for Markov chains indexed by a homogeneous tree, Statistics & Probability Letters. (2003) 65, no. 3, 241–250, MR2018036, ZBL1068.60045, https://doi.org/10.1016/j.spl.2003.04.001.
7 Liu W., Strong Deviation Theorems and Analytic Method, 2003, Science Press, Beijing, China.
8 Doob J. L., Stochastic Processes, 1953, John Wiley & Sons, New York, NY, USA, MR0058896, ZBL0053.26802.
9 Liu W. and Wang J., A strong limit theorem on gambling systems, Journal of Multivariate Analysis. (2003) 84, no. 2, 262–273, MR1965221, https://doi.org/10.1016/S0047-259X(02)00054-4, ZBL1016.60033.
10 Cover T. M. and Thomas J. A., Elements of Information Theory, 1991, John Wiley & Sons, New York, NY, USA, Wiley Series in Telecommunications, MR1122806, ZBL0762.94001, https://doi.org/10.1002/0471200611.
11 Shannon C. E., A mathematical theory of communication, The Bell System Technical Journal. (1948) 27, 379–423, 623–656, MR0026286, https://doi.org/10.1002/j.1538-7305.1948.tb01338.x.
12 Algoet P. H. and Cover T. M., A sandwich proof of the Shannon-McMillan-Breiman theorem, The Annals of Probability. (1988) 16, no. 2, 899–909, MR929085, https://doi.org/10.1214/aop/1176991794, ZBL0653.28013.
13 Barron A. R., The strong ergodic theorem for densities: generalized Shannon-McMillan-Breiman theorem, The Annals of Probability. (1985) 13, no. 4, 1292–1303, MR806226, https://doi.org/10.1214/aop/1176992813, ZBL0608.94001.
14 Chung K. L., A note on the ergodic theorem of information theory, The Annals of Mathematical Statistics. (1961) 32, no. 2, 612–614, MR0131782, https://doi.org/10.1214/aoms/1177705069, ZBL0115.35503.
15 Kieffer J. C., A simple proof of the Moy-Perez generalization of the Shannon-McMillan theorem, Pacific Journal of Mathematics. (1974) 51, 203–206, MR0347448, ZBL0281.94007, https://doi.org/10.2140/pjm.1974.51.203.
16 Kieffer J. C., A counterexample to Perez's generalization of the Shannon-McMillan theorem, The Annals of Probability. (1973) 1, no. 2, 362–364, MR0351626, ZBL0262.94017, https://doi.org/10.1214/aop/1176996994.
17 McMillan B., The basic theorems of information theory, The Annals of Mathematical Statistics. (1953) 24, no. 2, 196–219, MR0055621, https://doi.org/10.1214/aoms/1177729028, ZBL0050.35501.
18 Liu W. and Yang W., An extension of Shannon-McMillan theorem and some limit properties for nonhomogeneous Markov chains, Stochastic Processes and Their Applications. (1996) 61, no. 1, 129–145, MR1378852, https://doi.org/10.1016/0304-4149(95)00068-2, ZBL0861.60042.
19 Liu W. and Yang W., The Markov approximation of the sequences of N-valued random variables and a class of small deviation theorems, Stochastic Processes and Their Applications. (2000) 89, no. 1, 117–130, MR1775230, https://doi.org/10.1016/S0304-4149(00)00016-8, ZBL1051.94005.
20 Gray R. M., Entropy and Information Theory, 1990, Springer, New York, NY, USA, MR1070359, ZBL0722.94001, https://doi.org/10.1007/978-1-4757-3982-4.
