Volume 2015, Issue 1 287450
Research Article
Open Access

Large Deviation Analysis of a Droplet Model Having a Poisson Equilibrium Distribution

Richard S. Ellis

Corresponding Author

Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA 01003, USA umass.edu

Shlomo Ta’asan

Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA cmu.edu

First published: 07 October 2015
Academic Editor: Lukasz Stettner

Abstract

In this paper we use large deviation theory to determine the equilibrium distribution of a basic droplet model that underlies a number of important models in material science and statistical mechanics. Given a nonnegative integer $b$ and $c > b$, $K$ distinguishable particles are placed, each with equal probability $1/N$, onto the $N$ sites of a lattice, where $K/N$ equals $c$. We focus on configurations for which each site is occupied by a minimum of $b$ particles. The main result is the large deviation principle (LDP), in the limit $K \to \infty$ and $N \to \infty$ with $K/N = c$, for a sequence of random, number-density measures, which are the empirical measures of dependent random variables that count the droplet sizes. The rate function in the LDP is the relative entropy $R(\theta \mid \rho)$, where $\theta$ is a possible asymptotic configuration of the number-density measures and $\rho$ is a Poisson distribution with mean $c$, restricted to the set of positive integers $n$ satisfying $n \ge b$. This LDP implies that $\rho$ is the equilibrium distribution of the number-density measures, which in turn implies that $\rho$ is the equilibrium distribution of the random variables that count the droplet sizes.

1. Introduction

This paper is motivated by a natural question for a basic model of a droplet. Given a nonnegative integer $b$ and $c > b$, $K$ distinguishable particles are placed, each with equal probability $1/N$, onto the $N$ sites of a lattice $\Lambda_N = \{1, 2, \ldots, N\}$. Under the assumption that $K/N = c$ and that each site is occupied by a minimum of $b$ particles, what is the equilibrium distribution, as $N \to \infty$, of the number of particles per site? We prove in Corollary 3 that this equilibrium distribution is a Poisson distribution, with mean $c$, restricted to the set of positive integers $n$ satisfying $n \ge b$. As we explain near the end of the Introduction, this equilibrium distribution has important applications to technologies using sprays and powders.

As in many other models in statistical mechanics, we can identify the equilibrium distribution by exhibiting it as the unique minimum point of a rate function in a large deviation principle (LDP). Other models for which this procedure can be implemented are discussed at the end of the Introduction.

For the droplet model we prove the LDP for a sequence of random probability measures, called number-density measures, which are the empirical measures of a sequence of dependent random variables that count the droplet sizes. This LDP is stated in Theorem 1. Our proof is self-contained and starts from first principles, using techniques that are familiar in applied mathematics and statistical mechanics. For example, the proof of the local large deviation estimate in Theorem 5, a key step in the proof of the LDP for the number-density measures, is based on combinatorics, Stirling’s formula, and Laplace asymptotics.

Our use of combinatorial methods goes back to Boltzmann in his work on the discrete ideal gas. He calculated the Maxwell-Boltzmann equilibrium distribution for this system by analyzing the asymptotic behavior of a particular multinomial coefficient [1]. Starting with Boltzmann’s work, combinatorial methods have remained an important tool in both statistical mechanics and in the theory of large deviations, offering insights into a wide variety of physical and mathematical phenomena via techniques that are elegant, powerful, and often elementary. In applications to statistical mechanics, this state of affairs is explained by the observation that “many fundamental questions … are inherently combinatorial, … including the Ising model, the Potts model, monomer-dimer systems, self-avoiding walks and percolation theory” [2]. For the two-dimensional Ising model and other exactly soluble models, [3, 4] are recommended.

A similar situation holds in the theory of large deviations. For example, Section 2.1 of [5] discusses combinatorial techniques for finite alphabets and points out that because of the concreteness of these applications the LDPs are proved under much weaker conditions than the corresponding results in the general theory, into which the finite-alphabet results give considerable insight. The text [6] devotes several early sections to large deviation results for i.i.d. random variables having a finite state space and proved by combinatorial methods, including a sophisticated, level-3 result for the empirical pair measure.

In order to formulate the LDP for the number-density measures in our droplet model, a standard probabilistic model is introduced. The configuration space is the set $\Omega_N = \Lambda_N^K$ consisting of all $\omega = (\omega_1, \omega_2, \ldots, \omega_K)$, where $\omega_i$ denotes the site in $\Lambda_N$ occupied by the $i$th particle. The cardinality of $\Omega_N$ equals $N^K$. Denote by $P_N$ the uniform probability measure that assigns equal probability $1/N^K$ to each of the $N^K$ configurations $\omega \in \Omega_N$. The asymptotic analysis of the droplet model involves two random variables, which are functions of the configuration $\omega \in \Omega_N$: for $\ell \in \Lambda_N$, $K_\ell(\omega)$ denotes the number of particles occupying the site $\ell$ in the configuration $\omega$; for $j \in \mathbb{N} \cup \{0\}$, $N_j(\omega)$ denotes the number of sites $\ell \in \Lambda_N$ for which $K_\ell(\omega) = j$.

We focus on the subset of $\Omega_N$ consisting of all configurations $\omega$ for which every site of $\Lambda_N$ is occupied by at least $b$ particles. Because of this restriction $N_j(\omega)$ is indexed by $j \in \mathbb{N}_b = \{n \in \mathbb{Z} : n \ge b\}$. It is useful to think of each particle as having one unit of mass and of the set of particles at each site as defining a droplet. With this interpretation, for each configuration $\omega$, $K_\ell(\omega)$ denotes the mass or size of the droplet at site $\ell$. The $j$th droplet class has $N_j(\omega)$ droplets and mass $j N_j(\omega)$. Because the number of sites in $\Lambda_N$ equals $N$ and the sum of the masses of all the droplet classes equals $K$, the following conservation laws hold for such configurations:
$$
\sum_{j \in \mathbb{N}_b} N_j(\omega) = N, \qquad \sum_{j \in \mathbb{N}_b} j N_j(\omega) = K. \tag{1}
$$
In addition, since the total number of particles is $K$, it follows that $\sum_{\ell \in \Lambda_N} K_\ell(\omega) = K$. These equality constraints show that the random variables $N_j$ and $K_\ell$ are not independent.
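
The conservation laws can be checked by brute force on a very small lattice. The following is a minimal sketch, not part of the paper's argument: the function names `droplet_counts` and `check_conservation` are hypothetical, and the sketch enumerates all $N^K$ configurations, keeps those in which every site holds at least $b$ particles, and ignores the additional constraint involving $m$ that is introduced later.

```python
from itertools import product
from collections import Counter


def droplet_counts(omega, N):
    """Return ([K_l for l = 1..N], Counter mapping droplet size j to N_j)."""
    occupancy = Counter(omega)  # site -> number of particles at that site
    K_l = [occupancy.get(site, 0) for site in range(1, N + 1)]
    N_j = Counter(K_l)          # droplet size j -> number of sites of that size
    return K_l, N_j


def check_conservation(N, K, b):
    """Enumerate all N**K configurations; verify the conservation laws on
    those with at least b particles per site and return how many there are."""
    admissible = 0
    for omega in product(range(1, N + 1), repeat=K):
        K_l, N_j = droplet_counts(omega, N)
        if min(K_l) < b:
            continue
        admissible += 1
        assert sum(N_j.values()) == N                     # number of sites
        assert sum(j * n for j, n in N_j.items()) == K    # total mass
        assert sum(K_l) == K                              # total particles
    return admissible


if __name__ == "__main__":
    # N = 3 sites, K = 6 particles, so c = K/N = 2.
    print(check_conservation(3, 6, 1))
    print(check_conservation(3, 6, 2))
```

Exhaustive enumeration is only feasible for tiny $N$ and $K$; its purpose here is to make the definitions of $K_\ell(\omega)$ and $N_j(\omega)$ concrete.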

In order to carry out the asymptotic analysis of the droplet model, we introduce a quantity $m = m(N)$ that converges to $\infty$ sufficiently slowly with respect to $N$; specifically, we require that $m(N)^2/N \to 0$ as $N \to \infty$. In terms of $b$ and $m$ we define the subset $\Omega_{N,b,m}$ of $\Omega_N$ consisting of all configurations $\omega$ for which every site of $\Lambda_N$ is occupied by at least $b$ particles and at most $m$ of the quantities $N_j(\omega)$ are positive. This second condition is a useful technical device that allows us to control the errors in several estimates. In Appendix D of [7] we present evidence supporting the conjecture that this condition can be eliminated. The discussion in that appendix involves a number of interesting topics including Stirling numbers of the second kind (see [8, pp. 96-97] and [9, §5.4]) and their asymptotic behavior [10, Example 5.4].

The random quantities in the droplet model for which we formulate an LDP are the number-density measures $\Theta_{N,b}$. For $\omega \in \Omega_{N,b,m}$ these random probability measures assign to $j \in \mathbb{N}_b = \{n \in \mathbb{Z} : n \ge b\}$ the probability $N_j(\omega)/N$, which is the number density of the $j$th droplet class. Because of the two conservation laws in (1) and because $K/N = c$, for $\omega \in \Omega_{N,b,m}$, $\Theta_{N,b}(\omega)$ is a probability measure on $\mathbb{N}_b$ having mean $c$. Thus $\Theta_{N,b}$ takes values in $\mathcal{P}_{\mathbb{N}_b,c}$, which is defined to be the set of probability measures on $\mathbb{N}_b$ having mean $c$.

The probability measure $P_{N,b,m}$ defining the droplet model is obtained by restricting the uniform measure $P_N$ to the set of configurations $\Omega_{N,b,m}$. Thus $P_{N,b,m}$ equals the conditional probability $P_N(\cdot \mid \Omega_{N,b,m})$. In the language of statistical mechanics $P_{N,b,m}$ defines a microcanonical ensemble that incorporates the conservation laws for number and mass expressed in (1).

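A natural question is to determine two equilibrium distributions: the equilibrium distribution $\rho^{*}$ of the number-density measures and the equilibrium distribution $\rho^{**}$ of the droplet-size random variables $K_\ell$. These distributions are defined by the following two limits: for any $\varepsilon > 0$, any $\ell \in \Lambda_N$, and all $j \in \mathbb{N}_b$
$$
\lim_{N \to \infty} P_{N,b,m}\bigl(\Theta_{N,b} \in B(\rho^{*}, \varepsilon)\bigr) = 1, \qquad \lim_{N \to \infty} P_{N,b,m}(K_\ell = j) = \rho^{**}_j, \tag{2}
$$
where $B(\rho^{*}, \varepsilon)$ denotes the open ball with center $\rho^{*}$ and radius $\varepsilon$ defined with respect to an appropriate metric on $\mathcal{P}_{\mathbb{N}_b,c}$, the set of probability measures on $\mathbb{N}_b = \{n \in \mathbb{Z} : n \ge b\}$ having mean $c$. As we prove, the equilibrium distributions of $\Theta_{N,b}$ and $K_\ell$ coincide. As in many models in statistical mechanics, an efficient way to determine the equilibrium distribution of $\Theta_{N,b}$ is to prove an LDP for $\Theta_{N,b}$, which we carry out in Theorem 1. This theorem is the main result in the paper.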

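The content of Theorem 1 is the following: as $N \to \infty$, the sequence of number-density measures $\Theta_{N,b}$ satisfies the LDP on $\mathcal{P}_{\mathbb{N}_b,c}$ with respect to the measures $P_{N,b,m}$. The rate function is the relative entropy $R(\theta \mid \rho_{b,\alpha})$ of $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$ with respect to the Poisson distribution $\rho_{b,\alpha}$ on $\mathbb{N}_b$ having components $\rho_{b,\alpha;j} = [Z_b(\alpha)]^{-1} \cdot \alpha^j/j!$ for $j \in \mathbb{N}_b$. In this formula $Z_b(\alpha)$ is the normalization that makes $\rho_{b,\alpha}$ a probability measure, and $\alpha$ equals the unique value $\alpha_b(c)$ for which $\rho_{b,\alpha_b(c)}$ has mean $c$ [Theorem A.2]. Using the fact that $R(\theta \mid \rho_{b,\alpha_b(c)})$ equals 0 at the unique measure $\theta = \rho_{b,\alpha_b(c)}$, we apply the LDP for $\Theta_{N,b}$ to conclude in Theorem 2 that $\rho_{b,\alpha_b(c)}$ is the equilibrium distribution of $\Theta_{N,b}$. Corollary 3 then implies that $\rho_{b,\alpha_b(c)}$ is also the equilibrium distribution of $K_\ell$.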

The space $\mathcal{P}_{\mathbb{N}_b,c}$ is the most natural space on which to formulate the LDP for $\Theta_{N,b}$ in Theorem 1. Not only is $\mathcal{P}_{\mathbb{N}_b,c}$ the smallest convex set of probability measures containing the range of $\Theta_{N,b}$ for all $N \in \mathbb{N}$, but also the union over $N \in \mathbb{N}$ of the range of $\Theta_{N,b}$ is dense in $\mathcal{P}_{\mathbb{N}_b,c}$. As we explain in part (a) of Theorem 4, $\mathcal{P}_{\mathbb{N}_b,c}$ is not a complete, separable metric space, a situation that prevents us from directly applying general results in the theory of large deviations that require the setting of a complete, separable metric space.

The droplet model is defined in Section 2. Step 1 in the proof of the LDP for $\Theta_{N,b}$ is to derive the local large deviation estimate in part (b) of Theorem 5. This local estimate, one of the centerpieces of the paper, gives information not available in the LDP for $\Theta_{N,b}$, which involves global estimates. Step 2 is to lift the local large deviation estimate to the large deviation limit for $\Theta_{N,b}$ lying in open balls and certain other subsets of $\mathcal{P}_{\mathbb{N}_b,c}$, while Step 3 is to lift the large deviation limit for open balls and certain other subsets to the LDP for $\Theta_{N,b}$ stated in Theorem 1. Steps 2 and 3 are explained in Section 4.

Details of Steps 2 and 3 as well as other routine proofs are omitted from the present paper. They appear in the unpublished companion paper [7], which also contains additional background material. The paper [1] explores how our work on the droplet model was inspired by the work of Ludwig Boltzmann on a simple model of a discrete ideal gas. The main connection is via the local large deviation estimate in part (b) of Theorem 5. When $b = 0$, the LDP for a path version of $\Theta_{N,0}$ with $K = tN$ and $t > 0$ varying appears in [11, 12].

The main application of the results in this paper is to technologies using sprays and powders, which are ubiquitous in many fields, including agriculture, the chemical and pharmaceutical industries, consumer products, electronics, manufacturing, material science, medicine, mining, paper making, the steel industry, and waste treatment. In this paper we focus on sprays; our theory also applies to powders with only changes in terminology [13]. The behavior of sprays might be complex depending on various parameters including evaporation, temperature, and viscosity. Our goal here is to consider the simplest model where the only assumption is made on the average size of droplets in the spray. In many situations it is important to have good control over the sizes of the droplets, which can be translated into properties of probability distributions. The size distributions are important because they determine reliability and safety in each particular application.

Interestingly, there does not seem to be a rigorous theory that predicts the equilibrium distribution of droplet sizes, analogous to the Maxwell-Boltzmann distribution of energy levels in a discrete ideal gas [14, 15]. Our goal in the present paper is to provide such a theory. We do so by focusing on one aspect of the problem related to the relative entropy, an approach that characterizes the equilibrium distribution of droplet sizes as being a Poisson distribution restricted to . We expect that this distribution will be important in experimental observations. A full understanding of droplet behavior under dynamic conditions requires treating many other aspects and is beyond the scope of this paper. We plan to apply the ideas in this paper to understand the entropy of dislocation networks [16].

The importance of predicting droplet size can be seen from the wide range of applications utilizing sprays [17, 18]. Because of the importance of this problem, novel approaches for measuring the droplet-size distribution in sprays have been developed [19–23]. What makes the problem of predicting droplet size particularly interesting is the complexity of the droplet-size distribution, which is attributed to many factors such as temperature and viscosity. As [24] shows, even the nozzle plays a significant role in the outcome. The theoretical tools used to understand the distribution of droplet size in sprays include entropy [25], which also plays a key role in the present paper.

We end the Introduction by expanding on a comment made at the beginning of this section. This comment concerns one of the main applications of large deviation theory in statistical mechanics, which is to identify the equilibrium distribution or distributions of a model as the minimum point(s) of the rate function in an LDP for the model. This procedure is also useful to study phase transitions in the model, which concern how the structure of the set of equilibrium distributions changes as the parameters defining the model change. There are numerous other models for which this procedure has been used. They include the following three lattice spin models: the Curie-Weiss spin system, the Curie-Weiss-Potts model, and the mean-field Blume-Capel model, which is also known as the mean-field BEG model. As explained in the respective Sections 6.6.1, 6.6.2, and 6.6.3 of [26], the large deviation analysis shows that each of these three models has a different phase transition structure. Details of the analysis for the three models are given in the references [6, §IV.4], [27–29]. Section 9 of [30] outlines how large deviation theory can be applied to determine equilibrium structures in statistical models of two-dimensional turbulence. Details of this analysis are given in [31].

2. Definition of Droplet Model and Main Theorem

After defining the droplet model, we state the main theorem in the paper, Theorem 1. The content of this theorem is the LDP for the sequence of random, number-density measures, which are the empirical measures of a sequence of dependent random variables that count the droplet sizes in the model. As we show in Theorem 2 and in Corollary 3, the LDP enables us to identify a Poisson distribution as the equilibrium distribution both of the number-density measures and of the droplet-size random variables. In Theorem 4 we prove a number of properties of two spaces of probability measures in terms of which the LDP for the number-density measures is formulated.

We start by fixing parameters $b \in \mathbb{N} \cup \{0\}$ and $c \in (b, \infty)$. The droplet model is defined by a probability measure $P_{N,b}$ parameterized by $N \in \mathbb{N}$ and the nonnegative integer $b$. The measure depends on two other positive integers, $K$ and $m$, where $2 \le m \le N < K$. Both $K$ and $m$ are functions of $N$ in the large deviation limit $N \to \infty$. In this limit we take $K \to \infty$ and $N \to \infty$, where $K/N$, the average number of particles per site, equals $c$. Thus $K = Nc$. In addition, we take $m \to \infty$ sufficiently slowly by choosing $m$ to be a function $m(N)$ satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$; for example, $m(N) = N^{\delta}$ for some $\delta \in (0, 1/2)$. Throughout this paper we fix such a function $m(N)$. The parameter $b$ and the function $m = m(N)$ first appear in the definition of the set of configurations $\Omega_{N,b,m}$ in (3), where these quantities will be explained.

Because $K$ and $N$ are integers, $c$ must be a rational number. This in turn imposes a restriction on the values of $N$ and $K$. If $c$ is a positive integer, then $N \to \infty$ along the positive integers and $K \to \infty$ along the subsequence $K = cN$. If $c = x/y$, where $x$ and $y$ are relatively prime, positive integers with $y \ge 2$, then $N \to \infty$ along the subsequence $N = yn$ for $n \in \mathbb{N}$ and $K \to \infty$ along the subsequence $K = cN = xn$. Throughout this paper, when we write $N \in \mathbb{N}$ or $N \to \infty$, it is understood that $N$ and $K$ satisfy the restrictions discussed here.
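
The admissible subsequence can be listed mechanically. The sketch below is illustrative only: the helper name `admissible_pairs` is hypothetical, and it does not check the further requirement $2 \le m \le N < K$ stated earlier in this section.

```python
from fractions import Fraction


def admissible_pairs(c, count):
    """Return the first `count` pairs (N, K) with K/N = c, where c = x/y in
    lowest terms, so that N = y*n and K = x*n for n = 1, 2, ..."""
    c = Fraction(c)                      # reduces x/y to lowest terms
    x, y = c.numerator, c.denominator
    return [(y * n, x * n) for n in range(1, count + 1)]


if __name__ == "__main__":
    print(admissible_pairs(Fraction(5, 2), 3))   # c = 5/2
    print(admissible_pairs(3, 2))                # integer c
```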

In the droplet model $K$ distinguishable particles are placed, each with equal probability $1/N$, onto the sites of the lattice $\Lambda_N = \{1, 2, \ldots, N\}$. This simple description corresponds to a simple probabilistic model. The configuration space is the set $\Omega_N = \Lambda_N^K$ consisting of all sequences $\omega = (\omega_1, \omega_2, \ldots, \omega_K)$, where $\omega_i \in \Lambda_N$ denotes the site in $\Lambda_N$ occupied by the $i$th particle. Let $\rho^{(N)}$ be the measure on $\Lambda_N$ that assigns equal probability $1/N$ to each site in $\Lambda_N$, and let $P_N = (\rho^{(N)})^K$ be the product measure on $\Omega_N$ with equal one-dimensional marginals $\rho^{(N)}$. Thus $P_N$ is the uniform probability measure that assigns equal probability $1/N^K$ to each of the $N^K$ configurations $\omega \in \Omega_N$; for subsets $A$ of $\Omega_N$ we have $P_N(A) = \mathrm{card}(A)/N^K$, where card denotes cardinality.

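The asymptotic analysis of the droplet model involves two random variables. For $\ell \in \Lambda_N$ and $\omega \in \Omega_N$, $K_\ell(\omega)$ denotes the number of particles occupying site $\ell$ in the configuration $\omega$. For $j \in \mathbb{N} \cup \{0\}$ and $\omega \in \Omega_N$, $N_j(\omega)$ denotes the number of sites $\ell \in \Lambda_N$ for which $K_\ell(\omega) = j$. The dependence of $K_\ell(\omega)$ and $N_j(\omega)$ on $N$ is not indicated in the notation. Because the distributions of both random variables depend on $N$, both $K_\ell$ and $N_j$ form triangular arrays.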

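We now specify the role played by the nonnegative integer $b$, first focusing on the case where $b$ is a positive integer. The case where $b = 0$ is discussed later. For $\omega \in \Omega_N$, in general there exist sites $\ell \in \Lambda_N$ for which $K_\ell(\omega) = 0$; that is, sites that are occupied by 0 particles. The next step in the definition of the droplet model is to restrict to a subset $\Omega_{N,b,m}$ of configurations $\omega \in \Omega_N$ for which every site is occupied by at least $b$ particles and the following constraint holds: for any configuration $\omega \in \Omega_{N,b,m}$ at most $m$ of the components $N_j(\omega)$ are positive, where $m = m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. Because for $\omega \in \Omega_{N,b,m}$ every site is occupied by at least $b$ particles, we have $K_\ell(\omega) \ge b$ and $N_j(\omega)$ is indexed by $j \in \mathbb{N}_b = \{n \in \mathbb{Z} : n \ge b\}$. We denote by $N(\omega)$ the sequence $\{N_j(\omega), j \in \mathbb{N}_b\}$ and define $|N(\omega)|_+ = \mathrm{card}\{j \in \mathbb{N}_b : N_j(\omega) \ge 1\}$. In terms of this notation
$$
\Omega_{N,b,m} = \bigl\{\omega \in \Omega_N : K_\ell(\omega) \ge b \ \text{for all } \ell \in \Lambda_N \ \text{and} \ |N(\omega)|_+ \le m = m(N)\bigr\}. \tag{3}
$$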

The constraint restricting the number of positive components of N(ω) is a useful technical device that allows us to control the errors in several estimates. In Appendix D of [7] we give evidence supporting the conjecture that this restriction can be eliminated.

When $b$ is a positive integer, for each $\omega \in \Omega_{N,b,m}$, each site in $\Lambda_N$ is occupied by at least $b$ particles. In this case it is useful to think of each particle as having one unit of mass and of the set of particles at each site as defining a droplet. With this interpretation, for each configuration $\omega$, $K_\ell(\omega)$ denotes the mass or the size of the droplet at site $\ell$. The $j$th droplet class has $N_j(\omega)$ droplets and mass $j N_j(\omega)$. Because the number of sites in $\Lambda_N$ equals $N$ and the sum of the masses of all the droplet classes equals $K$, it follows that the quantities $N_j(\omega)$ satisfy the two conservation laws in (1) for all $\omega \in \Omega_{N,b,m}$.

We now consider the modifications that must be made in these definitions when $b = 0$. In this case the first constraint in the definition of $\Omega_{N,b,m}$ disappears because we allow sites to be occupied by 0 particles, and therefore $N_j(\omega)$ is indexed by $j \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$. On the other hand, we retain the second constraint in the definition of $\Omega_{N,0,m}$, which requires that for any configuration $\omega \in \Omega_{N,0,m}$ at most $m$ of the components $N_j(\omega)$ for $j \in \mathbb{N}_0$ are positive. When $b = 0$, the definition of $\Omega_{N,0,m}$ becomes $\Omega_{N,0,m} = \{\omega \in \Omega_N : |N(\omega)|_+ \le m = m(N)\}$. Because the choice $b = 0$ allows sites to be empty, we lose the interpretation of the set of particles at each site as being a droplet. However, for $\omega \in \Omega_{N,0,m}$ the two conservation laws in (1) continue to hold.

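For the remainder of this paper we work with any fixed nonnegative integer $b$. The probability measure $P_{N,b,m}$ defining the droplet model is obtained by restricting the uniform measure $P_N$ to the set $\Omega_{N,b,m}$. Thus $P_{N,b,m}$ equals the conditional probability $P_N(\cdot \mid \Omega_{N,b,m})$. For subsets $A$ of $\Omega_{N,b,m}$, $P_{N,b,m}(A)$ takes the form
$$
P_{N,b,m}(A) = \frac{P_N(A)}{P_N(\Omega_{N,b,m})} = \frac{\mathrm{card}(A)}{\mathrm{card}(\Omega_{N,b,m})}. \tag{4}
$$
Having defined the droplet model, we introduce the random probability measures whose large deviations we will study. For $\omega \in \Omega_{N,b,m}$ these measures are the number-density measures $\Theta_{N,b}$ that assign to $j \in \mathbb{N}_b = \{n \in \mathbb{Z} : n \ge b\}$ the probability $N_j(\omega)/N$. This ratio represents the number density of droplet class $j$. Thus for any subset $A$ of $\mathbb{N}_b$
$$
\Theta_{N,b}(\omega)(A) = \sum_{j \in A} \Theta_{N,b;j}(\omega), \qquad \text{where } \Theta_{N,b;j}(\omega) = \frac{N_j(\omega)}{N}. \tag{5}
$$
By the two formulas in (1), $\sum_{j \in \mathbb{N}_b} \Theta_{N,b;j}(\omega) = 1$ and $\sum_{j \in \mathbb{N}_b} j\,\Theta_{N,b;j}(\omega) = K/N = c$. Thus $\Theta_{N,b}(\omega)$ is a probability measure on $\mathbb{N}_b$ having mean $c$.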

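We next introduce several spaces of probability measures that arise in the large deviation analysis of the droplet model. $\mathcal{P}_{\mathbb{N}_b}$ denotes the set of probability measures on $\mathbb{N}_b = \{n \in \mathbb{Z} : n \ge b\}$. Thus $\theta \in \mathcal{P}_{\mathbb{N}_b}$ has the form $\sum_{j \in \mathbb{N}_b} \theta_j \delta_j$, where the components $\theta_j$ satisfy $\theta_j \ge 0$ and $\sum_{j \in \mathbb{N}_b} \theta_j = 1$. We say that a sequence of measures $\{\theta^{(n)}, n \in \mathbb{N}\}$ in $\mathcal{P}_{\mathbb{N}_b}$ converges weakly to $\theta \in \mathcal{P}_{\mathbb{N}_b}$, and write $\theta^{(n)} \Rightarrow \theta$, if, for any bounded function $f$ mapping $\mathbb{N}_b$ into $\mathbb{R}$, $\int f \, d\theta^{(n)} \to \int f \, d\theta$ as $n \to \infty$. $\mathcal{P}_{\mathbb{N}_b}$ is topologized by the topology of weak convergence. There is a standard technique for introducing a metric structure on $\mathcal{P}_{\mathbb{N}_b}$, for which we quote the main facts. Because $\mathbb{N}_b$ is a complete, separable metric space with metric $d(x, y) = |x - y|$, there exists a metric $\pi$ on $\mathcal{P}_{\mathbb{N}_b}$ called the Prohorov metric with the following two properties: (1) convergence with respect to the Prohorov metric is equivalent to weak convergence [32, Thm. 3.3.1]; (2) with respect to the Prohorov metric, $\mathcal{P}_{\mathbb{N}_b}$ is a complete, separable metric space [32, Thm. 3.1.7].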

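We denote by $\mathcal{P}_{\mathbb{N}_b,c}$ the set of measures in $\mathcal{P}_{\mathbb{N}_b}$ having mean $c$. Thus $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$ has the form $\sum_{j \in \mathbb{N}_b} \theta_j \delta_j$, where the components $\theta_j$ satisfy $\theta_j \ge 0$, $\sum_{j \in \mathbb{N}_b} \theta_j = 1$, and $\sum_{j \in \mathbb{N}_b} j\theta_j = c$. The number-density measures $\Theta_{N,b}$ defined in (5) take values in $\mathcal{P}_{\mathbb{N}_b,c}$.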

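According to part (a) of Theorem 4, $\mathcal{P}_{\mathbb{N}_b,c}$ is not a closed subset of $\mathcal{P}_{\mathbb{N}_b}$. Hence it is natural to introduce the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$. As we prove in part (b) of Theorem 4, the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$ equals $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, which is the set of measures in $\mathcal{P}_{\mathbb{N}_b}$ having mean lying in the closed interval $[b, c]$. Being the closure of the relatively compact, separable metric space $\mathcal{P}_{\mathbb{N}_b,c}$, $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is a compact, separable metric space with respect to the Prohorov metric. This space appears in the formulation of the large deviation upper bound in part (c) of Theorem 1.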

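We next state Theorem 1, which is the LDP for the sequence of distributions $P_{N,b,m}(\Theta_{N,b} \in d\theta)$ on $\mathcal{P}_{\mathbb{N}_b,c}$ as $N \to \infty$. The rate function in the LDP is the relative entropy of $\theta$ with respect to the Poisson distribution $\rho_{b,\alpha_b(c)}$ defined in (7), where each $\rho_{b,\alpha_b(c);j} > 0$. Thus any $\theta \in \mathcal{P}_{\mathbb{N}_b,[b,c]}$ is absolutely continuous with respect to $\rho_{b,\alpha_b(c)}$. For $\theta \in \mathcal{P}_{\mathbb{N}_b,[b,c]}$ the relative entropy of $\theta$ with respect to $\rho_{b,\alpha_b(c)}$ is defined by
$$
R(\theta \mid \rho_{b,\alpha_b(c)}) = \sum_{j \in \mathbb{N}_b} \theta_j \log\frac{\theta_j}{\rho_{b,\alpha_b(c);j}}. \tag{6}
$$
If $\theta_j = 0$, then $\theta_j \log(\theta_j/\rho_{b,\alpha_b(c);j}) = 0$. For $j \in \mathbb{N}_b$ the components of the measure $\rho_{b,\alpha_b(c)}$ appearing in the LDP have the form
$$
\rho_{b,\alpha_b(c);j} = \frac{1}{Z_b(\alpha_b(c))} \cdot \frac{[\alpha_b(c)]^j}{j!}, \tag{7}
$$
where $\alpha_b(c) \in (0, \infty)$ is chosen so that $\rho_{b,\alpha_b(c)}$ has mean $c$ and $Z_b(\alpha_b(c))$ is the normalization making $\rho_{b,\alpha_b(c)}$ a probability measure; thus $Z_0(\alpha) = e^{\alpha}$ and, for $b \in \mathbb{N}$, $Z_b(\alpha) = e^{\alpha} - \sum_{j=0}^{b-1} \alpha^j/j!$. As we show in Theorem A.2, there exists a unique value of $\alpha_b(c)$.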
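
For concreteness, $\alpha_b(c)$ and the relative entropy in (6) can be evaluated numerically. The sketch below is not the method of the paper: it finds $\alpha_b(c)$ by bisection on the mean equation $\alpha Z_{b-1}(\alpha)/Z_b(\alpha) = c$ for $b \ge 1$ (stated in Theorem 5), using the fact that the mean of $\rho_{b,\alpha}$ is increasing in $\alpha$ and at least $\alpha$. The function names `Z`, `mean`, `solve_alpha`, and `relative_entropy` are hypothetical, and $Z_b(\alpha)$ is summed as the series $\sum_{j \ge b} \alpha^j/j!$ to avoid cancellation for small $\alpha$.

```python
import math


def Z(b, alpha, tol=1e-17):
    """Z_b(alpha) = sum over j >= b of alpha**j / j!, summed term by term."""
    term = alpha ** b / math.factorial(b)
    total, j = 0.0, b
    while term > tol * total:
        total += term
        j += 1
        term *= alpha / j
    return total


def mean(b, alpha):
    """Mean of rho_{b,alpha}: alpha if b = 0, alpha*Z_{b-1}/Z_b if b >= 1."""
    if b == 0:
        return alpha
    return alpha * Z(b - 1, alpha) / Z(b, alpha)


def solve_alpha(b, c, iterations=200):
    """Approximate alpha_b(c) by bisection on (0, c]; assumes c > b."""
    if not c > b:
        raise ValueError("need c > b")
    if b == 0:
        return float(c)
    lo, hi = 1e-9, float(c)   # mean(b, lo) is close to b < c; mean(b, c) >= c
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean(b, mid) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def relative_entropy(theta, b, alpha):
    """R(theta | rho_{b,alpha}) for theta given as {j: theta_j}, finite support."""
    z = Z(b, alpha)
    total = 0.0
    for j, t in theta.items():
        if t > 0:                      # terms with theta_j = 0 contribute 0
            rho_j = alpha ** j / math.factorial(j) / z
            total += t * math.log(t / rho_j)
    return total


if __name__ == "__main__":
    alpha = solve_alpha(1, 2.0)        # b = 1, c = 2
    print(alpha, mean(1, alpha))
    print(relative_entropy({2: 1.0}, 1, alpha))   # point mass at 2, mean c = 2
```

The point mass at $2$ has mean $c = 2$ but is not the Poisson distribution, so its relative entropy is strictly positive, in line with the role of (6) as a rate function.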

As a consequence of the fact that $\mathcal{P}_{\mathbb{N}_b,c}$ is not closed in $\mathcal{P}_{\mathbb{N}_b}$, the large deviation upper bound takes two forms depending on whether the subset $F$ of $\mathcal{P}_{\mathbb{N}_b,c}$ is compact or whether $F$ is closed. When $F$ is compact, in part (b) we obtain the standard large deviation upper bound for $F$. When $F$ is closed, in part (c) we obtain a variation of the standard large deviation upper bound, which, when $F$ is compact, coincides with the upper bound in part (b). The refinement in part (c) is important. It is applied in the proof of Theorem 2 to show that $\rho_{b,\alpha_b(c)}$ is the equilibrium distribution of the number-density measures $\Theta_{N,b}$. In turn, Theorem 2 is applied in the proof of Corollary 3 to show that $\rho_{b,\alpha_b(c)}$ is the equilibrium distribution of the droplet-size random variables $K_\ell$.

In the next theorem we assume that $m$ is the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (3) and satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. The assumption that $m(N)^2/N \to 0$ is used to control error terms in Lemmas 6 and 7 in the present paper and in Lemma B.3 in [7]. This assumption on $m(N)$ is optimal in the sense that it is a minimal assumption guaranteeing that error terms in parts (a) and (b) of Lemma B.3 in [7] converge to 0. In the next theorem, for $A$ a subset of $\mathcal{P}_{\mathbb{N}_b,c}$ or $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ we denote by $R(A \mid \rho_{b,\alpha_b(c)})$ the infimum of $R(\theta \mid \rho_{b,\alpha_b(c)})$ over $\theta \in A$.

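Theorem 1. Fix a nonnegative integer $b$ and a rational number $c \in (b, \infty)$. Let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (3) and satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. Let $\rho_{b,\alpha_b(c)}$ be the distribution having the components defined in (7). Then as $N \to \infty$, with respect to the measures $P_{N,b,m}$, the sequence $\Theta_{N,b}$ satisfies the LDP on $\mathcal{P}_{\mathbb{N}_b,c}$ with rate function $R(\theta \mid \rho_{b,\alpha_b(c)})$ in the following sense.

  • (a)

    $R(\theta \mid \rho_{b,\alpha_b(c)})$ maps $\mathcal{P}_{\mathbb{N}_b,c}$ into $[0, \infty]$ and has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$; that is, for any $M < \infty$ the set $\{\theta \in \mathcal{P}_{\mathbb{N}_b,c} : R(\theta \mid \rho_{b,\alpha_b(c)}) \le M\}$ is compact.

  • (b)

    For any compact subset $F$ of $\mathcal{P}_{\mathbb{N}_b,c}$ we have the large deviation upper bound

    $$
    \limsup_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in F) \le -R(F \mid \rho_{b,\alpha_b(c)}). \tag{8}
    $$

  • (c)

    For any closed subset $F$ of $\mathcal{P}_{\mathbb{N}_b,c}$, let $\overline{F}$ denote the closure of $F$ in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$. We have the large deviation upper bound

    $$
    \limsup_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in F) \le -R(\overline{F} \mid \rho_{b,\alpha_b(c)}). \tag{9}
    $$

  • (d)

    For any open subset $G$ of $\mathcal{P}_{\mathbb{N}_b,c}$ we have the large deviation lower bound

    $$
    \liminf_{N \to \infty} \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} \in G) \ge -R(G \mid \rho_{b,\alpha_b(c)}). \tag{10}
    $$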

The properties of $R(\theta \mid \rho_{b,\alpha_b(c)})$ in part (a) are proved in [33, Lem. 1.4.1] and part (a) of Theorem A.1. The basic step in proving the large deviation bounds in parts (b)–(d) is the local large deviation estimate in part (b) of Theorem 5. As explained in Section 4, this local estimate is lifted to large deviation limits involving open balls stated in Theorem 8, which in turn are used to derive the bounds in parts (b)–(d) of Theorem 1.

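In the next theorem we use the large deviation upper bound in part (c) of Theorem 1 to prove that the Poisson distribution $\rho_{b,\alpha_b(c)}$ is the equilibrium distribution of the number-density measures $\Theta_{N,b}$. In this theorem $[B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c$ denotes the complement in $\mathcal{P}_{\mathbb{N}_b,c}$ of the open ball $B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon) = \{\theta \in \mathcal{P}_{\mathbb{N}_b,c} : \pi(\theta, \rho_{b,\alpha_b(c)}) < \varepsilon\}$. $[\widehat{B}_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c$ denotes the complement in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ of the open ball $\widehat{B}_\pi(\rho_{b,\alpha_b(c)}, \varepsilon) = \{\theta \in \mathcal{P}_{\mathbb{N}_b,[b,c]} : \pi(\theta, \rho_{b,\alpha_b(c)}) < \varepsilon\}$.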

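Theorem 2. One assumes the hypotheses of Theorem 1. The following results hold for any $\varepsilon > 0$.

  • (a)

    The quantity $x = \inf\{R(\theta \mid \rho_{b,\alpha_b(c)}) : \theta \in [\widehat{B}_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c\}$ is strictly positive, where $[\widehat{B}_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c$ is the complement in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ of the open ball of radius $\varepsilon$ centered at $\rho_{b,\alpha_b(c)}$.

  • (b)

    For any number $y$ in the interval $(0, x)$ and all sufficiently large $N$

    $$
    P_{N,b,m}\bigl(\Theta_{N,b} \in [B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c\bigr) \le \exp[-Ny] \to 0 \quad \text{as } N \to \infty, \tag{11}
    $$

    where $[B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c$ is the complement of the corresponding open ball in $\mathcal{P}_{\mathbb{N}_b,c}$.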

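This upper bound implies that, as $N \to \infty$, $P_{N,b,m}(\Theta_{N,b} \in B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)) \to 1$ and for any bounded, continuous function $g$ mapping $\mathcal{P}_{\mathbb{N}_b,c}$ into $\mathbb{R}$
$$
\lim_{N \to \infty} \int_{\Omega_{N,b,m}} g(\Theta_{N,b}) \, dP_{N,b,m} = g(\rho_{b,\alpha_b(c)}). \tag{12}
$$
These two limits allow us to interpret the Poisson distribution $\rho_{b,\alpha_b(c)}$ as the equilibrium distribution of the number-density measures $\Theta_{N,b}$ with respect to $P_{N,b,m}$.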

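Proof. The starting point is the large deviation upper bound in part (c) of Theorem 1 applied to the closed set $[B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c$, which is a subset of $\mathcal{P}_{\mathbb{N}_b,c}$. We denote the closure of $[B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c$ in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ by $\overline{[B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c}$. Since $\overline{[B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c} \subset [\widehat{B}_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c$, the large deviation upper bound in part (c) of Theorem 1 takes the form

$$
\limsup_{N \to \infty} \frac{1}{N} \log P_{N,b,m}\bigl(\Theta_{N,b} \in [B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c\bigr) \le -R\bigl(\overline{[B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c} \bigm| \rho_{b,\alpha_b(c)}\bigr) \le -R\bigl([\widehat{B}_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c \bigm| \rho_{b,\alpha_b(c)}\bigr) = -x. \tag{13}
$$
We now prove part (a) of Theorem 2. Since $R(\cdot \mid \rho_{b,\alpha_b(c)})$ is lower semicontinuous on $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ and has compact level sets in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ [33, Lem. 1.4.3(b)–(c)], it attains its infimum $x$ on the closed set $[\widehat{B}_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c$. If $x = 0$, then there would exist $\theta \in [\widehat{B}_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c$ such that $R(\theta \mid \rho_{b,\alpha_b(c)}) = 0$. But on $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, $R(\cdot \mid \rho_{b,\alpha_b(c)})$ attains its infimum of 0 at the unique measure $\rho_{b,\alpha_b(c)}$ [33, Lem. 1.4.1]. This contradicts the fact that $\rho_{b,\alpha_b(c)} \notin [\widehat{B}_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)]^c$, completing the proof of part (a). The inequality in part (b) is an immediate consequence of part (a) and the large deviation upper bound (13). This inequality yields the limit $P_{N,b,m}(\Theta_{N,b} \in B_\pi(\rho_{b,\alpha_b(c)}, \varepsilon)) \to 1$ as $N \to \infty$, which in turn implies (12). The proof of Theorem 2 is complete.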

We now apply Theorem 2 to prove that $\rho_{b,\alpha_b(c)}$ is also the equilibrium distribution of the random variables $K_\ell$, which count the droplet sizes at the sites of $\Lambda_N$. This is the content of the next corollary. A fact needed in the proof is that $\Theta_{N,b}$ is the empirical measure of these random variables; that is, for $\omega \in \Omega_{N,b,m}$, $\Theta_{N,b}(\omega)$ assigns to subsets $A$ of $\mathbb{N}_b$ the probability $N^{-1} \sum_{\ell=1}^{N} \delta_{K_\ell(\omega)}(A)$. This representation is valid because both $\Theta_{N,b}(\omega)$ and the empirical measure assign to each droplet size $j \in \mathbb{N}_b$ the probability $N_j(\omega)/N$.

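Corollary 3. One assumes the hypotheses of Theorem 1. Then for any site $\ell \in \Lambda_N$ and any $j \in \mathbb{N}_b$

$$
\lim_{N \to \infty} P_{N,b,m}(K_\ell = j) = \rho_{b,\alpha_b(c);j} = \frac{1}{Z_b(\alpha_b(c))} \cdot \frac{[\alpha_b(c)]^j}{j!}. \tag{14}
$$

Proof. Since the random variables $K_\ell$ are identically distributed, it suffices to prove the corollary for $\ell = 1$. For fixed $j \in \mathbb{N}_b$, the limit (12) with $g(\theta) = \theta_j$ yields

$$
\lim_{N \to \infty} P_{N,b,m}(K_1 = j) = \lim_{N \to \infty} \frac{1}{N} \sum_{\ell=1}^{N} P_{N,b,m}(K_\ell = j) = \lim_{N \to \infty} \int_{\Omega_{N,b,m}} \Theta_{N,b;j} \, dP_{N,b,m} = \rho_{b,\alpha_b(c);j}. \tag{15}
$$
This completes the proof.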

The last theorem in this section proves several properties of $\mathcal{P}_{\mathbb{N}_b,c}$ and $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ with respect to the Prohorov metric that are needed in the paper.

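Theorem 4. Fix a nonnegative integer $b$ and a real number $c \in (b, \infty)$. The metric spaces $\mathcal{P}_{\mathbb{N}_b,c}$ and $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ have the following properties.

  • (a)

    $\mathcal{P}_{\mathbb{N}_b,c}$, the set of probability measures on $\mathbb{N}_b$ having mean $c$, is a relatively compact, separable subset of $\mathcal{P}_{\mathbb{N}_b}$. However, $\mathcal{P}_{\mathbb{N}_b,c}$ is not a closed subset of $\mathcal{P}_{\mathbb{N}_b}$ and thus is not a compact subset or a complete metric space.

  • (b)

    $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, the set of probability measures on $\mathbb{N}_b$ having mean lying in the closed interval $[b, c]$, is the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$. $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is a compact, separable subset of $\mathcal{P}_{\mathbb{N}_b}$.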

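Proof. (a) For $\xi \in \mathbb{N}$ satisfying $\xi \ge b$ let $\Psi_\xi$ denote the compact subset $\{b, b + 1, \ldots, \xi\}$ of $\mathbb{N}_b$, and let $[\Psi_\xi]^c$ denote its complement. For any $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$

$$
\theta([\Psi_\xi]^c) = \sum_{j \ge \xi + 1} \theta_j \le \frac{1}{\xi} \sum_{j \ge \xi + 1} j\theta_j \le \frac{1}{\xi} \sum_{j \in \mathbb{N}_b} j\theta_j = \frac{c}{\xi}. \tag{16}
$$
It follows that $\mathcal{P}_{\mathbb{N}_b,c}$ is tight; that is, for any $\varepsilon > 0$ there exists $\xi \in \mathbb{N}$ such that $\theta([\Psi_\xi]^c) < \varepsilon$ for all $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$. Prohorov’s theorem implies that $\mathcal{P}_{\mathbb{N}_b,c}$ is relatively compact [32, Thm. 3.2.2]. The separability of $\mathcal{P}_{\mathbb{N}_b,c}$ is proved in Corollary B.2 in [7].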

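We now prove that $\mathcal{P}_{\mathbb{N}_b,c}$ is not a closed subset of $\mathcal{P}_{\mathbb{N}_b}$ by exhibiting a sequence $\theta^{(n)} \in \mathcal{P}_{\mathbb{N}_b,c}$ having a weak limit that does not lie in $\mathcal{P}_{\mathbb{N}_b,c}$. Let $\theta$ be any measure in $\mathcal{P}_{\mathbb{N}_b}$ with mean $\beta \in [b, c)$; thus $\theta \notin \mathcal{P}_{\mathbb{N}_b,c}$. The sequence

$$
\theta^{(n)} = \frac{n - c}{n - \beta}\,\theta + \frac{c - \beta}{n - \beta}\,\delta_n \quad \text{for } n \in \mathbb{N},\ n > c, \tag{17}
$$
has the property that $\theta^{(n)} \in \mathcal{P}_{\mathbb{N}_b,c}$ and that $\theta^{(n)} \Rightarrow \theta \notin \mathcal{P}_{\mathbb{N}_b,c}$. This completes the proof of part (a).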
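
The mechanism in this construction, a vanishing amount of mass placed at a distant point $n$ so that the mean stays at $c$ while the measure converges to one with smaller mean $\beta$, can be checked with exact rational arithmetic. The sketch below is illustrative only; the names `mean_of` and `theta_n` are hypothetical, and measures are represented as dictionaries from points of $\mathbb{N}_b$ to rational weights.

```python
from fractions import Fraction


def mean_of(theta):
    """Mean of a finitely supported measure given as {j: theta_j}."""
    return sum(j * p for j, p in theta.items())


def theta_n(theta, c, n):
    """Mix theta (mean beta < c) with a point mass at n > c so the mean is c.

    The weight on the point mass is t = (c - beta) / (n - beta), which tends
    to 0 as n grows, so each fixed component converges to that of theta.
    """
    beta = mean_of(theta)
    t = Fraction(c - beta) / (n - beta)
    mixed = {j: (1 - t) * p for j, p in theta.items()}
    mixed[n] = mixed.get(n, 0) + t
    return mixed


if __name__ == "__main__":
    # b = 1, theta uniform on {1, 2} with mean beta = 3/2, target mean c = 3.
    theta = {1: Fraction(1, 2), 2: Fraction(1, 2)}
    for n in (10, 100, 1000):
        mixed = theta_n(theta, Fraction(3), n)
        print(n, mean_of(mixed), float(mixed[1]))
```

Every measure in the sequence has mean exactly $c$, while its components at the fixed points $1$ and $2$ approach those of $\theta$, whose mean is $\beta < c$.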

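(b) Since $\mathcal{P}_{\mathbb{N}_b,c}$ is a separable subset of $\mathcal{P}_{\mathbb{N}_b}$ and $\mathcal{P}_{\mathbb{N}_b,c}$ is dense in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, it follows that $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is separable. We prove that $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$. Let $\theta^{(n)}$ be a sequence in $\mathcal{P}_{\mathbb{N}_b,c}$ converging weakly to $\theta \in \mathcal{P}_{\mathbb{N}_b}$. Since $\theta^{(n)} \Rightarrow \theta$ implies that $\theta^{(n)}_j \to \theta_j$ for each $j \in \mathbb{N}_b$, Fatou’s lemma implies that $c = \liminf_{n \to \infty} \langle\theta^{(n)}\rangle \ge \langle\theta\rangle$, where $\langle\theta^{(n)}\rangle$ and $\langle\theta\rangle$ denote the means of $\theta^{(n)}$ and $\theta$. Since for any $\theta \in \mathcal{P}_{\mathbb{N}_b}$ we have $\langle\theta\rangle \ge b$, it follows that $c \ge \langle\theta\rangle \ge b$. This shows that the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$ is a subset of $\mathcal{P}_{\mathbb{N}_b,[b,c]}$. We next prove that $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is a subset of the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$ by showing that for any $\theta \in \mathcal{P}_{\mathbb{N}_b,[b,c]}$ there exists a sequence $\theta^{(n)} \in \mathcal{P}_{\mathbb{N}_b,c}$ such that $\theta^{(n)} \Rightarrow \theta$. If $\langle\theta\rangle = c$, then we choose $\theta^{(n)} = \theta$ for all $n \in \mathbb{N}$. If $\langle\theta\rangle = \beta \in [b, c)$, then we use the sequence $\theta^{(n)}$ in (17), which converges weakly to $\theta$. We conclude that $\theta$ lies in the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ and thus that $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is a subset of the closure of $\mathcal{P}_{\mathbb{N}_b,c}$ in $\mathcal{P}_{\mathbb{N}_b}$. This completes the proof of part (b). The proof of Theorem 4 is done.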

In the next section we present the local large deviation estimate that will be used in Section 4 to prove the LDP for $\Theta_{N,b}$ in Theorem 1.

3. Local Large Deviation Estimate Yielding Theorem 1

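The main result needed to prove the LDP in Theorem 1 is the local large deviation estimate stated in part (b) of Theorem 5. The first step is to introduce a set $A_{N,b,m}$ that plays a central role in this paper. Fix a nonnegative integer $b$ and a rational number $c \in (b, \infty)$. Given $N \in \mathbb{N}$ define $K = Nc$ and let $m$ be the function appearing in the definition of $\Omega_{N,b,m}$ in (3) and satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. Define $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$; thus $\mathbb{N}_0$ is the set of nonnegative integers. Let $\nu$ be a sequence $\{\nu_j, j \in \mathbb{N}_b\}$ for which each $\nu_j \in \mathbb{N}_0$; thus $\nu \in \mathbb{N}_0^{\mathbb{N}_b}$. We define $A_{N,b,m}$ to be the set of $\nu \in \mathbb{N}_0^{\mathbb{N}_b}$ satisfying
$$
\sum_{j \in \mathbb{N}_b} \nu_j = N, \qquad \sum_{j \in \mathbb{N}_b} j\nu_j = K, \qquad |\nu|_+ \le m = m(N), \tag{18}
$$
where $|\nu|_+ = \mathrm{card}\{j \in \mathbb{N}_b : \nu_j \ge 1\}$. Because $|\nu|_+ \le m$, the two sums involve only finitely many terms.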

For $\omega \in \Omega_{N,b,m}$ the components $\Theta_{N,b;j}(\omega)$ of the number-density measure defined in (5) are $N_j(\omega)/N$ for $j \in \mathbb{N}_b$, where $N_j(\omega)$ denotes the number of sites in $\Lambda_N$ containing $j$ particles in the configuration $\omega$. We denote by $N(\omega)$ the sequence $\{N_j(\omega), j \in \mathbb{N}_b\}$. By definition, for every $\omega \in \Omega_{N,b,m}$ each site is occupied by at least $b$ particles, and $|N(\omega)|_+ \le m = m(N)$. It follows that $A_{N,b,m}$ is the range of $N(\omega)$ for $\omega \in \Omega_{N,b,m}$; the two sums involving $\nu_j$ in (18) correspond to the two sums involving $N_j(\omega)$ in (1).

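Since the range of $N(\omega)$ is $A_{N,b,m}$, for $\omega \in \Omega_{N,b,m}$ the range of $\Theta_{N,b}(\omega)$ is the set of probability measures $\theta_{N,b,\nu}$ whose components for $j \in \mathbb{N}_b$ have the form $\theta_{N,b,\nu;j} = \nu_j/N$ for $\nu \in A_{N,b,m}$. By (18) $\theta_{N,b,\nu}$ takes values in $\mathcal{P}_{\mathbb{N}_b,c}$, the set of probability measures on $\mathbb{N}_b$ having mean $c$. It follows that the set
$$
B_{N,b,m} = \bigl\{\theta \in \mathcal{P}_{\mathbb{N}_b,c} : \theta_j = \nu_j/N \ \text{for } j \in \mathbb{N}_b \ \text{for some } \nu \in A_{N,b,m}\bigr\} \tag{19}
$$
is the range of $\Theta_{N,b}(\omega)$ for $\omega \in \Omega_{N,b,m}$.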
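
For small $N$ the set of sequences $\nu$ satisfying the three conditions in (18), and the corresponding measures with components $\nu_j/N$, can be enumerated directly. The following is a minimal sketch with hypothetical names `enumerate_A` and `to_measure`; each $\nu$ is stored as a dictionary containing only its positive components, and the recursion is intended for small illustrative values of $N$ and $K$ only.

```python
from fractions import Fraction


def enumerate_A(N, K, b, m):
    """List all nu (as {j: nu_j} with nu_j >= 1) such that the nu_j sum to N,
    the j*nu_j sum to K, all j >= b, and at most m of the nu_j are positive."""
    results = []

    def extend(j, sites_left, mass_left, classes_left, current):
        if sites_left == 0:
            if mass_left == 0:
                results.append(dict(current))
            return
        # Remaining sites each need size >= j, so they need mass >= j * sites_left.
        if classes_left == 0 or j * sites_left > mass_left:
            return
        # Option 1: no droplets of size j.
        extend(j + 1, sites_left, mass_left, classes_left, current)
        # Option 2: nu_j = count >= 1 droplets of size j.
        max_count = sites_left if j == 0 else min(sites_left, mass_left // j)
        for count in range(1, max_count + 1):
            current[j] = count
            extend(j + 1, sites_left - count, mass_left - j * count,
                   classes_left - 1, current)
            del current[j]

    extend(b, N, K, m, {})
    return results


def to_measure(nu, N):
    """Number-density measure with components nu_j / N."""
    return {j: Fraction(count, N) for j, count in nu.items()}


if __name__ == "__main__":
    for nu in enumerate_A(3, 6, 1, 3):     # N = 3, K = 6, b = 1, m = 3
        print(nu, to_measure(nu, 3))
```

With $N = 3$, $K = 6$, $b = 1$ the admissible $\nu$ correspond to the partitions $4+1+1$, $3+2+1$, and $2+2+2$ of $6$ into three parts; lowering $m$ to $2$ removes $3+2+1$, which uses three distinct droplet sizes.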

In part (b) of the next theorem we state the local large deviation estimate for the event $\{\Theta_{N,b} = \theta_{N,b,\nu}\}$. In part (a) we introduce the Poisson distribution $\rho_{b,\alpha_b(c)}$ that appears in the local estimate; $\rho_{b,\alpha_b(c)}$ is defined in terms of a parameter $\alpha_b(c)$ guaranteeing that it has mean $c$.

In part (a) of Theorem C.2 in [7] we give the straightforward proof of the existence of $\alpha_b(c)$ for $b = 1$. The proof of the existence of $\alpha_b(c)$ for general $b \in \mathbb{N}$ is much more subtle than the proof for $b = 1$. The proof for general $b \in \mathbb{N}$ is given in Theorem A.2 in the present paper.

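Theorem 5. (a) Fix a nonnegative integer $b$ and a real number $c \in (b, \infty)$. For $\alpha \in (0, \infty)$ let $\rho_{b,\alpha}$ be the measure on $\mathbb{N}_b$ having components $\rho_{b,\alpha;j} = [Z_b(\alpha)]^{-1} \cdot \alpha^j/j!$ for $j \in \mathbb{N}_b$, where $Z_0(\alpha) = e^\alpha$, and, for $b \in \mathbb{N}$, $Z_b(\alpha) = e^\alpha - \sum_{j=0}^{b-1} \alpha^j/j!$. Then there exists a unique value $\alpha_b(c) \in (0, \infty)$ such that $\rho_{b,\alpha_b(c)}$ lies in the set $\mathcal{P}_{\mathbb{N}_b,c}$ of probability measures on $\mathbb{N}_b$ having mean $c$. If $b = 0$, then $\alpha_0(c) = c$. If $b \in \mathbb{N}$, then $\alpha_b(c)$ is the unique solution in $(0, \infty)$ of $\alpha Z_{b-1}(\alpha)/Z_b(\alpha) = c$.

(b) Fix a nonnegative integer $b$ and a rational number $c \in (b, \infty)$. Let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (3) and satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. For any $\nu \in A_{N,b,m}$ we define $\theta_{N,b,\nu} \in \mathcal{P}_{\mathbb{N}_b,c}$ to have the components $\theta_{N,b,\nu;j} = \nu_j/N$ for $j \in \mathbb{N}_b$. Then

\[ \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) = -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha_b(c)}) + \varepsilon_N(\nu). \tag{20} \]
The relative entropy $R(\theta_{N,b,\nu} \mid \rho_{b,\alpha_b(c)})$ is finite because it involves only finitely many components of $\theta_{N,b,\nu}$, and $\varepsilon_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.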

We now prove the local large deviation estimate in part (b) of Theorem 5. This proof is based on a combinatorial argument that is reminiscent of and is as natural as the combinatorial argument used to prove Sanov’s theorem for empirical measures defined in terms of i.i.d. random variables having a finite state space [1, §3]. Part (b) of Theorem 5 is proved by analyzing the asymptotic behavior of the product of two multinomial coefficients that we now introduce.

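Given $\nu \in A_{N,b,m}$, our goal is to estimate the probability $P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu})$, where $\theta_{N,b,\nu}$ has the components $\theta_{N,b,\nu;j} = \nu_j/N$ for $j \in \mathbb{N}_b$. A basic observation is that $\{\omega \in \Omega_{N,b,m} : \Theta_{N,b}(\omega) = \theta_{N,b,\nu}\}$ coincides with
\[ \Delta_{N,b,m;\nu} = \{\omega \in \Omega_{N,b,m} : N_j(\omega) = \nu_j \text{ for } j \in \mathbb{N}_b\}. \tag{21} \]
It follows that
\[ P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) = P_{N,b,m}(\Delta_{N,b,m;\nu}) = \frac{\operatorname{card}(\Delta_{N,b,m;\nu})}{\operatorname{card}(\Omega_{N,b,m})}. \tag{22} \]
Our first task is to determine the asymptotic behavior of $\operatorname{card}(\Delta_{N,b,m;\nu})$. In determining the asymptotic behavior of $\operatorname{card}(\Omega_{N,b,m})$, we will use the fact that $\Omega_{N,b,m}$ can be written as the disjoint union
\[ \Omega_{N,b,m} = \bigcup_{\nu \in A_{N,b,m}} \Delta_{N,b,m;\nu}. \tag{23} \]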
Let $\nu \in A_{N,b,m}$ be given. We start by expressing the cardinality of $\Delta_{N,b,m;\nu}$ as a product of two multinomial coefficients. For each configuration $\omega \in \Delta_{N,b,m;\nu}$, $K$ particles are distributed onto the $N$ sites of the lattice $\Lambda_N$ with $j$ particles going onto $\nu_j$ sites for $j \in \mathbb{N}_b$. We carry this out in two stages. In stage one $K$ particles are placed into $N$ bins, $\nu_j$ of which have $j$ particles for $j \in \mathbb{N}_b$. The number of ways of making this placement equals the multinomial coefficient $K!/\prod_{j \in \mathbb{N}_b} (j!)^{\nu_j}$. This multinomial coefficient is well-defined since $\sum_{j \in \mathbb{N}_b} j\nu_j = K$. Given this placement of $K$ particles into $N$ bins, the number of ways of moving the particles from the bins onto the sites $1, 2, \ldots, N$ of the lattice $\Lambda_N$ equals the multinomial coefficient $N!/\prod_{j \in \mathbb{N}_b} \nu_j!$. This second multinomial coefficient is well-defined since $\sum_{j \in \mathbb{N}_b} \nu_j = N$. We conclude that the cardinality of $\Delta_{N,b,m;\nu}$ is given by the product of these two multinomial coefficients:
\[ \operatorname{card}(\Delta_{N,b,m;\nu}) = \frac{N!}{\prod_{j \in \mathbb{N}_b} \nu_j!} \cdot \frac{K!}{\prod_{j \in \mathbb{N}_b} (j!)^{\nu_j}}. \tag{24} \]
Since $|\nu|_+ \le m$, at most $m$ of the components $\nu_j$ are positive. Such a product of multinomial coefficients is well known in combinatorial analysis [8, Thm. 2.10]. A related version of this formula is derived in Example III.23 of [34]. See also [35, p. 115] and formula (2) in [36, p. 36].
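The product formula for the cardinality of $\Delta_{N,b,m;\nu}$, namely $N!/\prod_j \nu_j!$ times $K!/\prod_j (j!)^{\nu_j}$, can be checked by brute force for small parameters. The sketch below lists all $N^K$ placements of $K$ labeled particles, groups the admissible ones by their occupancy profile $\nu$, and compares the counts with the formula. The function names are illustrative, and the cutoff $m$ is not imposed; we assume it is large enough (for example $m \ge N$) to be vacuous.

```python
from collections import Counter
from itertools import product
from math import factorial, prod


def card_delta_formula(nu, N, K):
    """Product of the two multinomial coefficients; nu is a dict {j: nu_j}."""
    assert sum(nu.values()) == N and sum(j * v for j, v in nu.items()) == K
    m1 = factorial(N) // prod(factorial(v) for v in nu.values())
    m2 = factorial(K) // prod(factorial(j) ** v for j, v in nu.items())
    return m1 * m2


def card_delta_bruteforce(N, K, b):
    """Count configurations per occupancy profile by listing all N**K placements."""
    counts = Counter()
    for omega in product(range(N), repeat=K):    # omega[i] = site of particle i
        occupancy = Counter(omega)
        sizes = [occupancy.get(site, 0) for site in range(N)]
        if min(sizes) < b:
            continue                              # some site holds fewer than b particles
        nu = Counter(sizes)                       # nu[j] = number of sites holding j particles
        counts[frozenset(nu.items())] += 1
    return counts


if __name__ == "__main__":
    N, K, b = 3, 6, 1
    for key, count in card_delta_bruteforce(N, K, b).items():
        nu = dict(key)
        print(sorted(nu.items()), count, card_delta_formula(nu, N, K))
```

Summing the counts over all profiles gives the number of admissible configurations, which for $b = 1$ is the number of surjections from the $K$ particles onto the $N$ sites.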

The next two steps in the proof of the local estimate given in part (b) of Theorem 5 are to prove the asymptotic formula for $\operatorname{card}(\Delta_{N,b,m;\nu})$ in Lemma 6 and the asymptotic formula for $\operatorname{card}(\Omega_{N,b,m})$ in part (b) of Lemma 7. The proof of Lemma 6 is greatly simplified by a substitution in line 4 of (34). This substitution involves a parameter $\alpha \in (0, \infty)$, which, we emphasize, is arbitrary in this lemma. The substitution in line 4 of (34) allows us to express the asymptotic behavior of both $\operatorname{card}(\Delta_{N,b,m;\nu})$ in Lemma 6 and $\operatorname{card}(\Omega_{N,b,m})$ in Lemma 7 directly in terms of the relative entropy $R(\theta_{N,b,\nu} \mid \rho_{b,\alpha})$, where $\rho_{b,\alpha}$ is the probability measure on $\mathbb{N}_b$ having the components defined in part (a) of Theorem 5. One of the major issues in the proof of part (b) of Theorem 5 is to show that the arbitrary parameter $\alpha$ appearing in Lemmas 6 and 7 must take the value $\alpha_b(c)$, which is the unique value of $\alpha$ guaranteeing that $\rho_{b,\alpha} \in \mathcal{P}_{\mathbb{N}_b,c}$ [Theorem 5(a)]. We show that $\alpha$ must equal $\alpha_b(c)$ after the statement of Lemma 7.

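Lemma 6. Fix a nonnegative integer $b$ and a rational number $c \in (b, \infty)$. Let $\alpha$ be any real number in $(0, \infty)$, and let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (3) and satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. We define

\[ f(\alpha, b, c, K) = \log Z_b(\alpha) - c\log\alpha + c\log K - c. \tag{25} \]
For any $\nu \in A_{N,b,m}$, we define $\theta_{N,b,\nu} \in \mathcal{P}_{\mathbb{N}_b,c}$ to have the components $\theta_{N,b,\nu;j} = \nu_j/N$ for $j \in \mathbb{N}_b$. Then
\[ \frac{1}{N} \log \operatorname{card}(\Delta_{N,b,m;\nu}) = -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) + f(\alpha, b, c, K) + \zeta_N(\nu). \tag{26} \]
The quantity $\zeta_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.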

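Proof. The proof is based on a weak form of Stirling's approximation, which states that, for all $N \in \mathbb{N}$ satisfying $N \ge 2$ and for all $n \in \mathbb{N}$ satisfying $1 \le n \le N$, $1 \le \log(n!) - (n\log n - n) \le 2\log N$. We summarize the last formula by writing

\[ \log(n!) = n\log n - n + O(\log N) \quad \text{for all } N \ge 2 \text{ and } 1 \le n \le N. \tag{27} \]
The term denoted by $O(\log N)$ satisfies $1 \le O(\log N) \le 2\log N$.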
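The two-sided bound $1 \le \log(n!) - (n\log n - n) \le 2\log N$ is easy to check numerically. The following sketch does so with `math.lgamma`, using $\log(n!) = \operatorname{lgamma}(n+1)$; the function names and the small floating-point tolerance are illustrative choices.

```python
from math import lgamma, log


def stirling_gap(n):
    """Return log(n!) - (n log n - n), using lgamma(n + 1) = log(n!)."""
    return lgamma(n + 1) - (n * log(n) - n)


def weak_stirling_holds(N, tol=1e-9):
    """Check 1 <= log(n!) - (n log n - n) <= 2 log N for every 1 <= n <= N."""
    upper = 2 * log(N)
    return all(1 - tol <= stirling_gap(n) <= upper + tol for n in range(1, N + 1))


if __name__ == "__main__":
    for N in (2, 3, 10, 100, 5000):
        print(N, weak_stirling_holds(N))
```

The lower bound is attained at $n = 1$, where the gap equals exactly 1, and the upper bound fails for $N = 1$ because $2\log 1 = 0$, which is why the statement requires $N \ge 2$.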

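To simplify the notation, we rewrite (24) in the form $\operatorname{card}(\Delta_{N,b,m;\nu}) = M_1(N, \nu) \cdot M_2(K, \nu)$, where $M_1(N, \nu)$ denotes the first multinomial coefficient on the right side of (24), and $M_2(K, \nu)$ denotes the second multinomial coefficient on the right side of (24). We have

\[ \frac{1}{N} \log \operatorname{card}(\Delta_{N,b,m;\nu}) = \frac{1}{N} \log M_1(N, \nu) + \frac{1}{N} \log M_2(K, \nu). \tag{28} \]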

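The asymptotic behavior of the first term on the right side of the last display is easily calculated. Since $\nu \in A_{N,b,m}$, there are $|\nu|_+ \in \{1, 2, \ldots, m\}$ positive components $\nu_j$. Because of this restriction on the number $|\nu|_+$ of positive components of $\nu$, we are able to control the error in line 3 of (29). We define $\Psi_N(\nu) = \{j \in \mathbb{N}_b : \nu_j \ge 1\}$. For each $j \in \Psi_N(\nu)$, since the components $\nu_j$ satisfy $1 \le \nu_j \le N$, we have $\log(\nu_j!) = \nu_j \log \nu_j - \nu_j + O(\log N)$ for all $N \ge 2$. Using the fact that $\sum_{j \in \Psi_N(\nu)} \nu_j = N$, we obtain

\[ \begin{aligned}
\frac{1}{N} \log M_1(N, \nu) &= \frac{1}{N} \log(N!) - \frac{1}{N} \sum_{j \in \Psi_N(\nu)} \log(\nu_j!) \\
&= \frac{1}{N} \bigl(N\log N - N + O(\log N)\bigr) - \frac{1}{N} \sum_{j \in \Psi_N(\nu)} \bigl(\nu_j \log \nu_j - \nu_j + O(\log N)\bigr) \\
&= -\sum_{j \in \Psi_N(\nu)} \frac{\nu_j}{N} \log \frac{\nu_j}{N} + \frac{O(\log N)}{N} - \frac{1}{N} \sum_{j \in \Psi_N(\nu)} O(\log N) \\
&= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log \theta_{N,b,\nu;j} + \zeta_N^{(1)} - \zeta_N^{(2)}(\nu),
\end{aligned} \tag{29} \]
where $\zeta_N^{(1)} = [O(\log N)]/N \to 0$ as $N \to \infty$ and $\zeta_N^{(2)}(\nu) = N^{-1} \sum_{j \in \Psi_N(\nu)} O(\log N)$. By the inequality noted after (27) and the fact that $|\nu|_+ \le m$
\[ 0 \le \max_{\nu \in A_{N,b,m}} \zeta_N^{(2)}(\nu) \le \frac{2m\log N}{N}. \tag{30} \]
Since $(m\log N)/N \to 0$ as $N \to \infty$, we conclude that $\zeta_N^{(2)}(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.

We now study the asymptotic behavior of the second term on the right side of (28). Since $K = Nc$, we obtain for all $K \ge 2$

\[ \begin{aligned}
\frac{1}{N} \log M_2(K, \nu) &= \frac{1}{N} \log(K!) - \frac{1}{N} \sum_{j \in \mathbb{N}_b} \nu_j \log(j!) \\
&= \frac{1}{N} \bigl(K\log K - K + O(\log K)\bigr) - \sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(j!) \\
&= c\log K - c - \sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(j!) + \zeta_N^{(3)},
\end{aligned} \tag{31} \]
where $\zeta_N^{(3)} = [O(\log K)]/N \to 0$ as $N \to \infty$. The weak form of Stirling's formula is used to rewrite the term $\log(K!)$ in the last display, but not to rewrite the terms $\log(j!)$, which we leave untouched.

Substituting (29) and (31) into (28), we obtain

\[ \begin{aligned}
\frac{1}{N} \log \operatorname{card}(\Delta_{N,b,m;\nu}) &= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log \theta_{N,b,\nu;j} - \sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(j!) + c\log K - c + \zeta_N^{(1)} - \zeta_N^{(2)}(\nu) + \zeta_N^{(3)} \\
&= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(\theta_{N,b,\nu;j}\, j!) \\
&\quad + c\log K - c + \zeta_N(\nu).
\end{aligned} \tag{32} \]
In this formula $\zeta_N(\nu) = \zeta_N^{(1)} - \zeta_N^{(2)}(\nu) + \zeta_N^{(3)}$. As $N \to \infty$,
\[ \max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| \le \zeta_N^{(1)} + \max_{\nu \in A_{N,b,m}} \zeta_N^{(2)}(\nu) + \zeta_N^{(3)} \to 0. \tag{33} \]
We conclude that $\zeta_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.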

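Now comes the key step, the purpose of which is to express the sum in the next-to-last line of (32) as the relative entropy $R(\theta_{N,b,\nu} \mid \rho_{b,\alpha})$, where $\alpha \in (0, \infty)$ is arbitrary. To do so, we rewrite the sum as shown in line 4 of the next display:

\[ \begin{aligned}
&\frac{1}{N} \log \operatorname{card}(\Delta_{N,b,m;\nu}) \\
&= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log(\theta_{N,b,\nu;j}\, j!) \\
&\quad + c\log K - c + \zeta_N(\nu) \\
&= -\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} \log\left(\frac{\theta_{N,b,\nu;j}}{\alpha^j/[Z_b(\alpha)\, j!]}\right) \\
&\quad - (\log\alpha) \sum_{j \in \mathbb{N}_b} j\theta_{N,b,\nu;j} + (\log Z_b(\alpha)) \sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} + c\log K - c + \zeta_N(\nu) \\
&= -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) + \log Z_b(\alpha) - c\log\alpha + c\log K - c + \zeta_N(\nu) \\
&= -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) + f(\alpha, b, c, K) + \zeta_N(\nu).
\end{aligned} \tag{34} \]
The facts that $\sum_{j \in \mathbb{N}_b} \theta_{N,b,\nu;j} = 1$ and $\sum_{j \in \mathbb{N}_b} j\theta_{N,b,\nu;j} = c$ are used to derive the next-to-last equality. The proof of Lemma 6 is complete.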
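The statement of Lemma 6 can be illustrated numerically. The sketch below evaluates the exact value of $N^{-1}\log\operatorname{card}(\Delta_{N,b,m;\nu})$ from the product of the two multinomial coefficients and subtracts $-R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) + f(\alpha, b, c, K)$ with $f(\alpha, b, c, K) = \log Z_b(\alpha) - c\log\alpha + c\log K - c$. The difference is the error term $\zeta_N(\nu)$, which should be small for large $N$ and should not depend on the arbitrary parameter $\alpha$. All function names and the test profile $\nu$ are assumptions made for illustration.

```python
from math import exp, factorial, lgamma, log


def Z(b, alpha):
    """Normalizing constant Z_b(alpha) = e^alpha - sum_{j<b} alpha^j / j!."""
    return exp(alpha) - sum(alpha ** j / factorial(j) for j in range(b))


def log_card_delta(nu, N, K):
    """log of N!/prod(nu_j!) * K!/prod((j!)^nu_j), computed with lgamma."""
    return (lgamma(N + 1) - sum(lgamma(v + 1) for v in nu.values())
            + lgamma(K + 1) - sum(v * lgamma(j + 1) for j, v in nu.items()))


def relative_entropy(theta, b, alpha):
    """R(theta | rho_{b,alpha}) for finitely supported theta = {j: theta_j}, theta_j > 0."""
    log_z = log(Z(b, alpha))
    total = 0.0
    for j, t in theta.items():
        log_rho_j = j * log(alpha) - lgamma(j + 1) - log_z
        total += t * (log(t) - log_rho_j)
    return total


def zeta(nu, b, alpha):
    """Error term: exact (1/N) log card(Delta) minus (-R + f); nu = {j: nu_j}, nu_j >= 1."""
    N = sum(nu.values())
    K = sum(j * v for j, v in nu.items())
    c = K / N
    theta = {j: v / N for j, v in nu.items()}
    f = log(Z(b, alpha)) - c * log(alpha) + c * log(K) - c
    return log_card_delta(nu, N, K) / N - (-relative_entropy(theta, b, alpha) + f)


if __name__ == "__main__":
    nu = {1: 1000, 2: 1000, 3: 1000}      # N = 3000, K = 6000, c = 2, b = 1
    for alpha in (0.7, 1.6, 3.0):
        print(alpha, zeta(nu, 1, alpha))
```

The printed values agree across the three choices of $\alpha$ up to rounding, reflecting the fact that $\alpha$ enters the relative entropy and $f$ in exactly cancelling ways.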

The next step in the proof of the local large deviation estimate in part (b) of Theorem 5 is to prove the asymptotic formula for card⁡(ΩN,b,m) stated in part (b) of the next lemma. The proof of this lemma uses Lemma 6 in a fundamental way. After the statement of this lemma we show how to apply it and Lemma 6 to prove part (b) of Theorem 5.

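Lemma 7. Fix a nonnegative integer $b$ and a rational number $c \in (b, \infty)$. The following conclusions hold:

  • (a)

    $\lim_{N \to \infty} N^{-1} \log \operatorname{card}(A_{N,b,m}) = 0$.

  • (b)

    Let $\alpha$ be the positive real number in Lemma 6, and let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (3) and satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. We define $f(\alpha, b, c, K) = \log Z_b(\alpha) - c\log\alpha + c\log K - c$. Then $R(\theta \mid \rho_{b,\alpha})$ attains its infimum over $\mathcal{P}_{\mathbb{N}_b,c}$, and

    \[ \frac{1}{N} \log \operatorname{card}(\Omega_{N,b,m}) = f(\alpha, b, c, K) - \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha}) + \eta_N. \tag{35} \]

    The quantity $\eta_N \to 0$ as $N \to \infty$.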

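Before proving Lemma 7, we derive the local large deviation estimate in part (b) of Theorem 5 by applying Lemmas 6 and 7. An integral part of the proof is to show how the arbitrary value of $\alpha \in (0, \infty)$ appearing in these lemmas is replaced by the specific value $\alpha_b(c)$ appearing in Theorem 5. As in the statement of part (b) of Theorem 5, let $\nu$ be any vector in $A_{N,b,m}$ and define $\theta_{N,b,\nu} \in \mathcal{P}_{\mathbb{N}_b,c}$ to have the components $\theta_{N,b,\nu;j} = \nu_j/N$ for $j \in \mathbb{N}_b$. By (22)
\[ \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) = \frac{1}{N} \log \operatorname{card}(\Delta_{N,b,m;\nu}) - \frac{1}{N} \log \operatorname{card}(\Omega_{N,b,m}). \tag{36} \]
Substituting the asymptotic formula for $\log \operatorname{card}(\Delta_{N,b,m;\nu})$ derived in Lemma 6 and the asymptotic formula for $\log \operatorname{card}(\Omega_{N,b,m})$ given in part (b) of Lemma 7 yields
\[ \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) = -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) + \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha}) + \varepsilon_N(\nu). \tag{37} \]
The error term $\varepsilon_N(\nu)$ equals $\zeta_N(\nu) - \eta_N$; $\zeta_N(\nu)$ is the error term in Lemma 6, and $\eta_N$ is the error term in Lemma 7. As $N \to \infty$, $\zeta_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$, and $\eta_N \to 0$. It follows that $\varepsilon_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$.
We now consider the first two terms on the right side of (37). By part (b) of Theorem A.1 applied to $\theta = \theta_{N,b,\nu}$, for any $\alpha \in (0, \infty)$
\[ -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) + \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha}) = -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha_b(c)}). \tag{38} \]
With this step we have succeeded in replacing the relative entropy $R(\theta_{N,b,\nu} \mid \rho_{b,\alpha})$ with respect to $\rho_{b,\alpha}$, which appears in Lemma 6, by the relative entropy $R(\theta_{N,b,\nu} \mid \rho_{b,\alpha_b(c)})$ with respect to $\rho_{b,\alpha_b(c)}$, which appears in Theorem 5. Substituting the last equation into (37) gives
\[ \frac{1}{N} \log P_{N,b,m}(\Theta_{N,b} = \theta_{N,b,\nu}) = -R(\theta_{N,b,\nu} \mid \rho_{b,\alpha_b(c)}) + \varepsilon_N(\nu), \tag{39} \]
where $\varepsilon_N(\nu) \to 0$ uniformly for $\nu \in A_{N,b,m}$ as $N \to \infty$. This is the conclusion of part (b) of Theorem 5.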

We now complete the proof of part (b) of Theorem 5 by proving Lemma 7.

Proof of Lemma 7. (a) We write . By [8, Cor. 2.5] the number of elements in the set indexed by k equals the binomial coefficient C(N − 1,   k − 1). Since by assumption m/N → 0 as N, for all sufficiently large N, the quantities C(N − 1, k − 1) are increasing and are maximal when k = m. Since C(N − 1, k − 1) ≤ C(N, k), it follows that

()
An application of the weak form of Stirling's formula yields for all $m \ge 2$ and all $N \ge m + 2$
()
Since $m/N \to 0$ as $N \to \infty$, we conclude that $0 \le N^{-1} \log \operatorname{card}(A_{N,b,m}) \to 0$ as $N \to \infty$. This completes the proof of part (a).

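(b) The starting point is (23), which states that $\Omega_{N,b,m} = \bigcup_{\nu \in A_{N,b,m}} \Delta_{N,b,m;\nu}$. For distinct $\nu \in A_{N,b,m}$ the sets $\Delta_{N,b,m;\nu}$ are disjoint. Hence

\[ \max_{\nu \in A_{N,b,m}} \frac{1}{N} \log \operatorname{card}(\Delta_{N,b,m;\nu}) \le \frac{1}{N} \log \operatorname{card}(\Omega_{N,b,m}) \le \max_{\nu \in A_{N,b,m}} \frac{1}{N} \log \operatorname{card}(\Delta_{N,b,m;\nu}) + \delta_N, \tag{42} \]
where
\[ \delta_N = \frac{1}{N} \log \operatorname{card}(A_{N,b,m}). \tag{43} \]
It follows from part (a) that $\delta_N \to 0$ as $N \to \infty$.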

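We continue with the estimation of $\operatorname{card}(\Omega_{N,b,m})$. By Lemma 6

\[ \left| \max_{\nu \in A_{N,b,m}} \frac{1}{N} \log \operatorname{card}(\Delta_{N,b,m;\nu}) - f(\alpha, b, c, K) + \min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) \right| \le \max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)|. \tag{44} \]
As proved in Lemma 6, $\max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| \to 0$ as $N \to \infty$. Hence by (42)
\[ \begin{aligned}
f(\alpha, b, c, K) - \min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) - \max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| &\le \frac{1}{N} \log \operatorname{card}(\Omega_{N,b,m}) \\
&\le f(\alpha, b, c, K) - \min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) + \max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| + \delta_N.
\end{aligned} \tag{45} \]
Under the assumption that $R(\cdot \mid \rho_{b,\alpha})$ attains its infimum over $\mathcal{P}_{\mathbb{N}_b,c}$, we define
\[ \eta_N = \frac{1}{N} \log \operatorname{card}(\Omega_{N,b,m}) - f(\alpha, b, c, K) + \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha}). \tag{46} \]
In the last two paragraphs of this proof, we show that $\eta_N \to 0$ as $N \to \infty$. Given this fact, the last equation yields the asymptotic formula (35) in part (b).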

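We now prove that $\eta_N \to 0$ as $N \to \infty$. To do this, we use (45) to write

\[ |\eta_N| \le \left[ \min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) - \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha}) \right] + \max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| + \delta_N. \tag{47} \]
Like the second and third terms on the right side, the first term on the right side is nonnegative because every measure $\theta_{N,b,\nu}$ with $\nu \in A_{N,b,m}$ lies in $\mathcal{P}_{\mathbb{N}_b,c}$. Since $\max_{\nu \in A_{N,b,m}} |\zeta_N(\nu)| \to 0$ and $\delta_N \to 0$ as $N \to \infty$, it will follow that $\eta_N \to 0$ if we can show that $R(\cdot \mid \rho_{b,\alpha})$ attains its infimum over $\mathcal{P}_{\mathbb{N}_b,c}$ and that
\[ \lim_{N \to \infty} \min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) = \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha}). \tag{48} \]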

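We now prove (48). $R(\cdot \mid \rho_{b,\alpha})$ is lower semicontinuous on $\mathcal{P}_{\mathbb{N}_b}$ [33, Lem. 1.4.3(b)] and thus on $\mathcal{P}_{\mathbb{N}_b,c}$. Since $R(\cdot \mid \rho_{b,\alpha})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$ [Theorem A.1(a)], it attains its infimum over $\mathcal{P}_{\mathbb{N}_b,c}$ at some measure $\theta^\star$. We apply Theorem B.1 in [7] to $\theta = \theta^\star$, obtaining a sequence $\theta^{(N)}$ with the following properties: (1) for $N \in \mathbb{N}$, $\theta^{(N)} \in B_{N,b,m}$ has components $\theta^{(N)}_j = \nu^{(N)}_j/N$ for $j \in \mathbb{N}_b$, where $\nu^{(N)}$ is an appropriate sequence in $A_{N,b,m}$; (2) $\theta^{(N)} \Rightarrow \theta^\star$ as $N \to \infty$; (3) $R(\theta^{(N)} \mid \rho_{b,\alpha}) \to R(\theta^\star \mid \rho_{b,\alpha})$ as $N \to \infty$. The limit in (48) follows from the inequalities

\[ \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha}) \le \min_{\nu \in A_{N,b,m}} R(\theta_{N,b,\nu} \mid \rho_{b,\alpha}) \le R(\theta^{(N)} \mid \rho_{b,\alpha}) \tag{49} \]
and the limit $R(\theta^{(N)} \mid \rho_{b,\alpha}) \to R(\theta^\star \mid \rho_{b,\alpha}) = \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha})$ as $N \to \infty$. This completes the proof of Lemma 7 and thus the proof of the local estimate in part (b) of Theorem 5.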

In the next section we explain how the local large deviation estimate in part (b) of Theorem 5 yields the LDP in Theorem 1.

4. Proof of Theorem 1 from Part (b) of Theorem 5

In Theorem 1 we state the LDP for the sequence $\Theta_{N,b}$ of number-density measures. This sequence takes values in $\mathcal{P}_{\mathbb{N}_b,c}$, which is the set of probability measures on $\mathbb{N}_b$ having mean $c \in (b, \infty)$. The purpose of the present section is to explain how the local large deviation estimate in part (b) of Theorem 5 yields the LDP for $\Theta_{N,b}$. All details appear in Section 4 of [7]. The basic idea is first to prove the large deviation limit for $\Theta_{N,b}$ lying in open balls in $\mathcal{P}_{\mathbb{N}_b,c}$ and in other subsets defined in terms of open balls and then to use this large deviation limit to prove the LDP in Theorem 1.

In Theorem 8 we state the large deviation limit for open balls and other subsets defined in terms of open balls. Two types of open balls are considered. Let θ be a measure in , and take r > 0. Part (a) states the large deviation limit for open balls , where π denotes the Prohorov metric on . This limit is used to prove the large deviation upper bound for compact subsets of in part (b) of Theorem 1 and the large deviation lower bound for open subsets of in part (d) of Theorem 1. Now let θ be a measure in . Part (b) states the large deviation limit for sets of the form , where . This limit is used to prove the large deviation upper bound for closed subsets in part (c) of Theorem 1. If , then , and the conclusions of parts (a) and (b) of the next theorem coincide.

Theorem 8. Fix a nonnegative integer $b$ and a rational number $c \in (b, \infty)$. Let $m$ be the function $m(N)$ appearing in the definition of $\Omega_{N,b,m}$ in (3) and satisfying $m(N) \to \infty$ and $m(N)^2/N \to 0$ as $N \to \infty$. The following conclusions hold:

  • (a)

    Let θ be a measure in and take r > 0. Then for any open ball Bπ(θ, r) in , is finite, and one has the large deviation limit

    ()

  • (b)

    Let θ be a measure in and take r > 0. Then the set is nonempty, is finite, and one has the large deviation limit

    ()

We prove Theorem 8 by applying the local large deviation estimate in part (b) of Theorem 5. A key step is to approximate probability measures in Bπ(θ, ε) and in by appropriate sequences of probability measures in the range of ΘN,b. This procedure allows one to show in part (a) that the infimum can be approximated by the infimum of over θ lying in the intersection of Bπ(θ, ε) and the range of ΘN,b; a similar statement holds for the infimum in part (b). A set of hypotheses that allow one to carry out this approximation procedure is given in Theorem  4.2 in [7], a general formulation that yields Theorem 8 as a special case.

Theorem 1 states the LDP for the number-density measures $\Theta_{N,b}$. In order to complete the proof of Theorem 1, we must lift the large deviation limits in Theorem 8 to the large deviation upper bound for compact sets and for closed sets and the large deviation lower bound for open sets. The large deviation lower bound for open sets is immediate from the limit in part (a). To prove the large deviation upper bound for compact sets, we cover the compact set by open balls and use the limit in part (a); the large deviation upper bound for closed sets follows by a similar procedure involving part (b). The details of this procedure are carried out as an application of the general formulation in Theorem 4.3 in [7].

In the Appendix we prove two properties of the relative entropy and prove the existence of the quantity αb(c) appearing in part (a) of Theorem 5.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The research of Shlomo Ta’asan is supported in part by a grant from the National Science Foundation (NSF-DMS-1216433). Richard S. Ellis thanks Jonathan Machta for sharing his insights into statistical mechanics and Michael Sullivan for his generous help with a number of topological issues arising in this paper. Both authors thank the referee for a careful reading of the paper and for suggesting a number of references.

    Appendix

    Properties of Relative Entropy and Existence of αb(c)

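    We fix a nonnegative integer $b$ and a real number $c \in (b, \infty)$. Given $\theta$ a probability measure on $\mathbb{N}_b$, the mean of $\theta$ is denoted by $\langle\theta\rangle$. In Theorem A.1 we present two properties of the relative entropies $R(\theta \mid \rho_{b,\alpha})$ and $R(\theta \mid \rho_{b,\alpha_b(c)})$ for $\theta$ in each of the following three spaces, which are introduced in Section 2: $\mathcal{P}_{\mathbb{N}_b}$, the set of probability measures on $\mathbb{N}_b$; $\mathcal{P}_{\mathbb{N}_b,c}$, the set of $\theta \in \mathcal{P}_{\mathbb{N}_b}$ satisfying $\langle\theta\rangle = c$; and $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, the set of $\theta \in \mathcal{P}_{\mathbb{N}_b}$ satisfying $\langle\theta\rangle \in [b, c]$.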

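    We recall that, for $\alpha \in (0, \infty)$, $\rho_{b,\alpha}$ denotes the Poisson distribution on $\mathbb{N}_b$ having components $\rho_{b,\alpha;j} = [Z_b(\alpha)]^{-1} \cdot \alpha^j/j!$ for $j \in \mathbb{N}_b$, where $Z_0(\alpha) = e^\alpha$, and, for $b \in \mathbb{N}$, $Z_b(\alpha) = e^\alpha - \sum_{j=0}^{b-1} \alpha^j/j!$. According to part (a) of Theorem 5 there exists a unique value $\alpha = \alpha_b(c)$ for which $\langle\rho_{b,\alpha}\rangle = c$; thus $\rho_{b,\alpha_b(c)}$ lies in $\mathcal{P}_{\mathbb{N}_b,c}$. In Theorem A.2 we prove the existence of $\alpha_b(c)$. In part (a) of the next theorem we show that $R(\theta \mid \rho_{b,\alpha})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b}$, $\mathcal{P}_{\mathbb{N}_b,c}$, and $\mathcal{P}_{\mathbb{N}_b,[b,c]}$. After the statement of Lemma 7 we use part (b) of the next theorem to show that the arbitrary parameter $\alpha$ in Lemmas 6 and 7 must have the value $\alpha_b(c)$.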

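    Theorem A.1. Fix a nonnegative integer $b$ and a real number $c \in (b, \infty)$. For any $\alpha \in (0, \infty)$ the relative entropy $R(\cdot \mid \rho_{b,\alpha})$ has the following properties:

    • (a)

      $R(\cdot \mid \rho_{b,\alpha})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b}$, $\mathcal{P}_{\mathbb{N}_b,c}$, and $\mathcal{P}_{\mathbb{N}_b,[b,c]}$.

    • (b)

      For any $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$, $R(\theta \mid \rho_{b,\alpha_b(c)}) = R(\theta \mid \rho_{b,\alpha}) - \min_{\mu \in \mathcal{P}_{\mathbb{N}_b,c}} R(\mu \mid \rho_{b,\alpha})$.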

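    Proof. (a) The fact that $R(\cdot \mid \rho_{b,\alpha})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b}$ is proved in part (c) of Lemma 1.4.3 in [33]. Since $\mathcal{P}_{\mathbb{N}_b,[b,c]}$ is a compact subset of $\mathcal{P}_{\mathbb{N}_b}$ [Theorem 4(d)], $R(\cdot \mid \rho_{b,\alpha})$ also has compact level sets in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$. Because $\mathcal{P}_{\mathbb{N}_b,c}$ is not a closed subset of $\mathcal{P}_{\mathbb{N}_b}$ [Theorem 4(a)], the proof that $R(\cdot \mid \rho_{b,\alpha})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$ is more subtle. If $\theta^{(n)}$ is any sequence in $\mathcal{P}_{\mathbb{N}_b,c}$ satisfying $R(\theta^{(n)} \mid \rho_{b,\alpha}) \le M < \infty$, then since $\mathcal{P}_{\mathbb{N}_b,c} \subset \mathcal{P}_{\mathbb{N}_b,[b,c]}$ and $R(\cdot \mid \rho_{b,\alpha})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,[b,c]}$, there exist $\theta \in \mathcal{P}_{\mathbb{N}_b,[b,c]}$ and a subsequence $\theta^{(n')}$ such that $\theta^{(n')} \Rightarrow \theta$ and $R(\theta \mid \rho_{b,\alpha}) \le M$. To complete the proof that $R(\cdot \mid \rho_{b,\alpha})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$, we must show that $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$; that is, $\langle\theta\rangle = c$. By Fatou's lemma $\langle\theta\rangle \le \liminf_{n' \to \infty} \langle\theta^{(n')}\rangle = c$. In addition, for any $w \in (0, \infty)$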

    ()
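    Lemma 5.1 in [37] shows that the sequence $\theta^{(n')}$ is uniformly integrable, implying that $\langle\theta\rangle = \lim_{n' \to \infty} \langle\theta^{(n')}\rangle = c$ [32, Appendix, Prop. 2.3]. This completes the proof that $R(\cdot \mid \rho_{b,\alpha})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$. The proof of part (a) is finished.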

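    (b) We define $g(\alpha, b, c) = \log Z_b(\alpha) - c\log\alpha - (\log Z_b(\alpha_b(c)) - c\log\alpha_b(c))$. Step 1 is to prove that for any $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$

    \[ R(\theta \mid \rho_{b,\alpha}) = R(\theta \mid \rho_{b,\alpha_b(c)}) + g(\alpha, b, c). \tag{A.2} \]
    For any $\theta \in \mathcal{P}_{\mathbb{N}_b,c}$ we have $\sum_{j \in \mathbb{N}_b} \theta_j = 1$ and $\sum_{j \in \mathbb{N}_b} j\theta_j = c$. Hence
    \[ \begin{aligned}
    R(\theta \mid \rho_{b,\alpha}) &= \sum_{j \in \mathbb{N}_b} \theta_j \log\frac{\theta_j}{\rho_{b,\alpha_b(c);j}} + \sum_{j \in \mathbb{N}_b} \theta_j \log\frac{\rho_{b,\alpha_b(c);j}}{\rho_{b,\alpha;j}} \\
    &= R(\theta \mid \rho_{b,\alpha_b(c)}) + \sum_{j \in \mathbb{N}_b} \theta_j \bigl[ j\log\alpha_b(c) - \log Z_b(\alpha_b(c)) - j\log\alpha + \log Z_b(\alpha) \bigr] \\
    &= R(\theta \mid \rho_{b,\alpha_b(c)}) + \log Z_b(\alpha) - c\log\alpha \\
    &\quad - (\log Z_b(\alpha_b(c)) - c\log\alpha_b(c)).
    \end{aligned} \tag{A.3} \]
    Since the last two lines equal $R(\theta \mid \rho_{b,\alpha_b(c)}) + g(\alpha, b, c)$, the proof of (A.2) is complete. Step 2 is to prove that $R(\theta \mid \rho_{b,\alpha})$ attains its infimum over $\mathcal{P}_{\mathbb{N}_b,c}$ at the measure $\rho_{b,\alpha_b(c)}$, and
    \[ \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha}) = g(\alpha, b, c). \tag{A.4} \]
    Given these two assertions part (b) of the theorem follows by substituting (A.4) into (A.2).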

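    We now prove the two assertions in Step 2. $R(\cdot \mid \rho_{b,\alpha})$ is lower semicontinuous on $\mathcal{P}_{\mathbb{N}_b}$ [33, Lem. 1.4.3(b)] and thus on $\mathcal{P}_{\mathbb{N}_b,c}$. Since $R(\cdot \mid \rho_{b,\alpha})$ has compact level sets in $\mathcal{P}_{\mathbb{N}_b,c}$, it attains its infimum over $\mathcal{P}_{\mathbb{N}_b,c}$. The relative entropy $R(\cdot \mid \rho_{b,\alpha_b(c)})$ attains its minimum value of 0 over $\mathcal{P}_{\mathbb{N}_b,c}$ at the unique measure $\rho_{b,\alpha_b(c)}$ [33, Lem. 1.4.1]. Hence (A.2) implies that the minimum value of $R(\cdot \mid \rho_{b,\alpha})$ over $\mathcal{P}_{\mathbb{N}_b,c}$ equals

    \[ \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha}) = \min_{\theta \in \mathcal{P}_{\mathbb{N}_b,c}} R(\theta \mid \rho_{b,\alpha_b(c)}) + g(\alpha, b, c) = g(\alpha, b, c) = R(\rho_{b,\alpha_b(c)} \mid \rho_{b,\alpha}). \tag{A.5} \]
    The last equality follows by applying (A.2) with $\theta = \rho_{b,\alpha_b(c)}$. This display shows that $R(\cdot \mid \rho_{b,\alpha})$ attains its infimum over $\mathcal{P}_{\mathbb{N}_b,c}$ at $\rho_{b,\alpha_b(c)}$ and yields (A.4). The proof of part (b) is finished, completing the proof of the theorem.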
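For $b = 0$ the identity behind part (b) can be checked in closed form, because $\alpha_0(c) = c$ and $Z_0(\alpha) = e^\alpha$, so that $g(\alpha, 0, c) = \alpha - c\log\alpha - c + c\log c$. This is the familiar relative entropy of a Poisson distribution with mean $c$ with respect to a Poisson distribution with mean $\alpha$. The sketch below verifies, for a few finitely supported measures $\theta$ having mean $c$, that $R(\theta \mid \rho_{0,\alpha}) - R(\theta \mid \rho_{0,c})$ equals this quantity regardless of $\theta$. The function names and the test measures are illustrative assumptions.

```python
from math import lgamma, log


def relative_entropy_poisson(theta, alpha):
    """R(theta | Poisson(alpha)) for finitely supported theta = {j: theta_j}, theta_j > 0."""
    total = 0.0
    for j, t in theta.items():
        log_rho_j = j * log(alpha) - alpha - lgamma(j + 1)   # log of e^{-alpha} alpha^j / j!
        total += t * (log(t) - log_rho_j)
    return total


def g0(alpha, c):
    """g(alpha, 0, c) = alpha - c log(alpha) - (c - c log(c))."""
    return alpha - c * log(alpha) - c + c * log(c)


if __name__ == "__main__":
    c = 2.0
    thetas = [{0: 0.25, 2: 0.5, 4: 0.25}, {2: 1.0}, {1: 0.5, 3: 0.5}]   # each has mean 2
    for alpha in (0.5, 3.0):
        for theta in thetas:
            diff = relative_entropy_poisson(theta, alpha) - relative_entropy_poisson(theta, c)
            print(alpha, theta, diff, g0(alpha, c))
```

Because the difference does not depend on $\theta$, minimizing $R(\cdot \mid \rho_{0,\alpha})$ over measures with mean $c$ is the same as minimizing $R(\cdot \mid \rho_{0,c})$, which is the mechanism used in Step 2.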

    We now prove that there exists a unique value of $\alpha_b(c)$ for which $\rho_{b,\alpha_b(c)} \in \mathcal{P}_{\mathbb{N}_b,c}$. The conclusion of the next theorem is part (a) of Theorem C.1 in [7]. In part (b) of that theorem we derive two sets of bounds on $\alpha_b(c)$ and use these bounds to show that $\alpha_b(c)$ is asymptotic to $c$ as $c \to \infty$. In part (d) of Theorem C.1 in [7] we make precise the relationship between $\rho_{b,\alpha_b(c)}$ and a Poisson random variable having parameter $\alpha_b(c)$.

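    Theorem A.2. Fix a nonnegative integer $b$ and a real number $c \in (b, \infty)$. There exists a unique value $\alpha_b(c) \in (0, \infty)$ such that $\rho_{b,\alpha_b(c)}$ lies in the set $\mathcal{P}_{\mathbb{N}_b,c}$ of probability measures on $\mathbb{N}_b$ having mean $c$. If $b = 0$, then $\alpha_0(c) = c$. If $b \in \mathbb{N}$, then $\alpha_b(c)$ is the unique solution in $(0, \infty)$ of $\alpha Z_{b-1}(\alpha)/Z_b(\alpha) = c$.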

    According to this theorem, for $b \in \mathbb{N}$, $\alpha_b(c)$ is the unique solution of $\alpha Z_{b-1}(\alpha)/Z_b(\alpha) = c$. The heart of the proof of Theorem A.2, and its most subtle step, is to prove that the function $\gamma_b(\alpha) = \alpha Z_{b-1}(\alpha)/Z_b(\alpha)$ satisfies $\gamma_b'(\alpha) > 0$ for $\alpha \in (0, \infty)$ and thus is monotonically increasing on this interval. This fact is proved in the next lemma.

    Lemma A.3. Fix a positive integer $b$ and a real number $c \in (b, \infty)$. For $\alpha \in (0, \infty)$ the function $\gamma_b(\alpha) = \alpha Z_{b-1}(\alpha)/Z_b(\alpha)$ satisfies $\gamma_b'(\alpha) > 0$.

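    Proof. For $b \in \mathbb{N}$ and for $\alpha \in (0, \infty)$, we have $Z_b'(\alpha) = Z_{b-1}(\alpha)$. Thus $\gamma_b(\alpha) = \alpha (\log Z_b(\alpha))'$. The key to proving that $\gamma_b'(\alpha) > 0$ is to represent $\log Z_b(\alpha)$ in terms of the moment generating function of a probability measure. We do this by first expressing $Z_b(\alpha)$ in terms of the upper incomplete gamma function $\Gamma(b, \alpha)$ via the formula $Z_b(\alpha) = e^\alpha [1 - \Gamma(b, \alpha)/\Gamma(b)] = [e^\alpha/(b-1)!] \int_0^\alpha x^{b-1} e^{-x}\, dx$. As suggested in [38], we now make the change of variables $x = -y\alpha$, obtaining the representation

    \[ Z_b(\alpha) = \frac{\alpha^b e^\alpha}{b!}\, g_b(\alpha), \quad \text{where } g_b(\alpha) = \int_{-1}^{0} e^{\alpha y}\, b(-y)^{b-1}\, dy. \tag{A.6} \]
    The function $g_b$ is the moment generating function of the probability measure on $\mathbb{R}$ having the density $h_b(y) = b(-y)^{b-1}$ on $[-1, 0]$. For $\alpha \in (0, \infty)$ let $\sigma_{b,\alpha}$ be the probability measure on $\mathbb{R}$ having the density $e^{\alpha y} h_b(y)/g_b(\alpha)$ on $[-1, 0]$. A straightforward calculation shows that
    \[ (\log g_b)''(\alpha) = \int_{[-1,0]} \left( y - \int_{[-1,0]} x\, \sigma_{b,\alpha}(dx) \right)^2 \sigma_{b,\alpha}(dy). \tag{A.7} \]
    It follows that $(\log g_b)''(\alpha) > 0$ for all $\alpha \in (0, \infty)$.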

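    Using (A.6) and the formulas $\gamma_b(\alpha) = \alpha (\log Z_b(\alpha))'$ and $(\log g_b)'(\alpha) = \int_{[-1,0]} y\, \sigma_{b,\alpha}(dy)$, we calculate

    \[ \begin{aligned}
    \gamma_b'(\alpha) &= \bigl( b + \alpha + \alpha (\log g_b)'(\alpha) \bigr)' \\
    &= 1 + (\log g_b)'(\alpha) + \alpha (\log g_b)''(\alpha) \\
    &= \int_{[-1,0]} (1 + y)\, \sigma_{b,\alpha}(dy) + \alpha (\log g_b)''(\alpha) > 0.
    \end{aligned} \tag{A.8} \]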
    This completes the proof of the lemma.
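The moment generating function representation used in this proof can be checked numerically. The sketch below assumes, as we read the proof, that the representation takes the form $Z_b(\alpha) = \alpha^b e^\alpha g_b(\alpha)/b!$ with $g_b(\alpha) = \int_{-1}^0 e^{\alpha y} h_b(y)\,dy$ and $h_b(y) = b(-y)^{b-1}$. It compares the series $Z_b(\alpha) = \sum_{j \ge b} \alpha^j/j!$ with a composite Simpson approximation of the integral. The function names, the number of series terms, and the number of quadrature nodes are illustrative choices.

```python
from math import exp, factorial


def Z(b, alpha, terms=200):
    """Z_b(alpha) = sum_{j >= b} alpha^j / j!, summed as a truncated series."""
    term = alpha ** b / factorial(b)
    total = 0.0
    for k in range(terms):
        total += term
        term *= alpha / (b + k + 1)
    return total


def g(b, alpha, n=2000):
    """Composite Simpson approximation of int_{-1}^{0} exp(alpha*y) * b*(-y)**(b-1) dy."""
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = 1.0 / n

    def integrand(y):
        return exp(alpha * y) * b * (-y) ** (b - 1)

    total = integrand(-1.0) + integrand(0.0)
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * integrand(-1.0 + i * h)
    return total * h / 3


if __name__ == "__main__":
    for b in (1, 2, 5):
        for alpha in (0.3, 1.5, 4.0):
            lhs = Z(b, alpha)
            rhs = alpha ** b * exp(alpha) * g(b, alpha) / factorial(b)
            print(b, alpha, lhs, rhs)
```

For $b = 1$ the density $h_b$ is uniform on $[-1, 0]$ and the check reduces to $e^\alpha - 1 = \alpha e^\alpha (1 - e^{-\alpha})/\alpha$.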

    We are now ready to prove Theorem A.2.

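    Proof of Theorem A.2. We first consider $b = 0$. In this case $\rho_{0,\alpha}$ is a standard Poisson distribution on $\mathbb{N}_0$ having mean $\alpha$. It follows that $\alpha_0(c) = c$ is the unique value for which $\rho_{0,\alpha_0(c)}$ has mean $c$ and thus lies in $\mathcal{P}_{\mathbb{N}_0,c}$. This completes the proof for $b = 0$.

    We now consider $b \in \mathbb{N}$. In this case $\rho_{b,\alpha}$ is a probability measure on $\mathbb{N}_b$ having mean

    \[ \langle\rho_{b,\alpha}\rangle = \sum_{j \in \mathbb{N}_b} j\rho_{b,\alpha;j} = \frac{1}{Z_b(\alpha)} \sum_{j \in \mathbb{N}_b} \frac{\alpha^j}{(j-1)!} = \frac{\alpha Z_{b-1}(\alpha)}{Z_b(\alpha)}. \tag{A.9} \]
    Thus $\rho_{b,\alpha}$ has mean $c$ if and only if $\alpha$ satisfies $\gamma_b(\alpha) = c$, where $\gamma_b(\alpha) = \alpha Z_{b-1}(\alpha)/Z_b(\alpha)$. We prove the theorem by showing that $\gamma_b(\alpha) = c$ has a unique solution $\alpha_b(c) \in (0, \infty)$ for all $b \in \mathbb{N}$ and any $c > b$. This assertion is a consequence of the following three steps: (1) $\lim_{\alpha \to 0^+} \gamma_b(\alpha) = b$; (2) $\lim_{\alpha \to \infty} \gamma_b(\alpha) = \infty$; (3) for all $\alpha \in (0, \infty)$, $\gamma_b'(\alpha) > 0$. Steps 1 and 2 follow immediately from the definition of $\gamma_b(\alpha)$, and Step 3 is proved in Lemma A.3.

    We have proved the theorem for all $b \in \mathbb{N}$. Since we also validated the conclusion of the theorem for $b = 0$, the proof for all nonnegative integers $b$ is done.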
