A Natural Diffusion Distance and Equivalence of Local Convergence and Local Equicontinuity for a General Symmetric Diffusion Semigroup
Abstract
In this paper, we consider a general symmetric diffusion semigroup on a topological space $X$ with a positive σ-finite measure, given, for $t > 0$, by an integral kernel operator: $T_t f(x) \triangleq \int_X \rho_t(x, y) f(y)\,dy$. As one of the contributions of our paper, we define a diffusion distance whose specification follows naturally from imposing a reasonable Lipschitz condition on diffused versions of arbitrary bounded functions. We next show that the mild assumption we make, that balls of positive radius have positive measure, is equivalent to a similar, and even milder-looking, geometric demand. In the main part of the paper, we establish that local convergence of $T_t f$ to $f$ is equivalent to local equicontinuity (in $t$) of the family $\{T_t f\}$. As a corollary of our main result, we show that, for $t_0 > 0$, $T_{t+t_0} f$ converges locally to $T_{t_0} f$, as $t$ converges to $0^+$. In the Appendix, we show that for very general metrics $D$ on $X$, not necessarily arising from diffusion, $\int_X \rho_t(x, y) D(x, y)\,dy \to 0$ a.e., as $t \to 0^+$. R. Coifman and W. Leeb have assumed a quantitative version of this convergence, uniformly in $x$, in their recent work introducing a family of multiscale diffusion distances and establishing quantitative results about the equivalence of a bounded function $f$ being Lipschitz, and the rate of convergence of $T_t f$ to $f$, as $t \to 0^+$. We do not make such an assumption in the present work.
1. Introduction
Diffusion semigroups play an important role in analysis, both theoretical and applied. Diffusion semigroups include the heat semigroup and, more generally, as discussed in, e.g., [1], arise from considering large classes of elliptic second-order (partial) differential operators on domains in Euclidean space or on manifolds. For examples of theoretical results involving diffusion semigroups, the interested reader may refer to Sturm [2] and Wu [3]. Some recent applications of diffusion semigroups to dimensionality reduction, data representation, multiscale analysis of complex structures, and the definition and efficient computation of natural diffusion distances can be found in, e.g., [4–11].
A particularly important issue in harmonic analysis is to connect the smoothness of a function with the speed of convergence of its diffused version to itself, in the limit as time goes to zero. For the Euclidean setting, see, for example, [12, 13]. In order to consider the smoothness of diffusing functions in more general settings, a distance defined in terms of the diffusion itself seems particularly appropriate.
Defining diffusion distances is of interest in applications as well. As discussed in [5], dimensionality reduction of data and the concomitant issue of finding structures in data are highly important objectives in the fields of information theory, statistics, machine learning, sampling theory, etc. It is often useful to organize the given data as nodes in a weighted graph, where the weights reflect local interaction between data points. Random walks, or diffusion, on graphs may then help understand the interactions among the data points at increasing distance scales. To even consider different distance scales, it is necessary to define an appropriate diffusion distance on the constructed data graph.
In the present paper, we introduce a new family of diffusion distances generated by the diffusion semigroup $\{T_t\}_{t \ge 0}$. We provide several reasons as to why we think our definition is natural; in particular, we show that, for a convolution diffusion kernel on $\mathbb{R}^n$, we achieve $\alpha = 1$ in the discussion just above; i.e., we can recover (local) Euclidean distance to the “full” power 1.
The implication established in [7, 11] that smoothness of f implies control of the speed of convergence of Ttf to f seems to us to be a more notable result than the converse (which the authors establish without assuming the decay of (1)). However, if f is Lipschitz for the multiscale diffusion distance introduced in [7, 11], then, as the authors themselves point out, their assumed estimate (2) almost tautologically leads to the desired estimate for the speed of convergence of Ttf to f.
The main reason for our current work is that we wish to avoid making any assumptions about the decay of (1) and still establish a correspondence between some version of smoothness of a function f and convergence of Ttf to f, as t → 0+. Our main contribution is to establish, under almost no assumptions, that local equicontinuity (in t) is equivalent to local convergence; i.e., local control of the differences Ttf(x) − Ttf(y) for all t small is equivalent to local control of the differences Ttf(x) − f(x) for all small t. Here “local” is defined relative to a representative of our family of proposed diffusion distances.
In Section 4, we make the assumption that balls of positive radius with respect to the distance Dg have positive measure. We show there is an equivalent topology, which does not depend on the function g, for which a corresponding statement about positive measure is equivalent to our assumption. The latter requirement, in turn, seems to be a mild and reasonable one.
In the main section, Section 5, we define our version of local convergence of $T_t f$ to $f$, as well as local equicontinuity of the family $\{T_t f\}$. Both definitions use our distance $D_g$. We then establish that local convergence is equivalent to local equicontinuity. We next prove a corollary which extends an a.e. convergence result of Stein in [1]: for $t_0 > 0$, $T_{t+t_0} f$ converges locally to $T_{t_0} f$, as $t$ converges to $0^+$.
2. Notation and Assumptions
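We define $T_0$ to be the identity map. Note that, for all $t$, $\|T_t f\|_1 \le \|f\|_1$ by Fubini’s theorem, that clearly $\|T_t f\|_\infty \le \|f\|_\infty$, and hence $\|T_t f\|_p \le \|f\|_p$, for $1 \le p \le \infty$, by interpolation.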
To avoid degeneracy, e.g., each Tt being the averaging operator on a space of finite mass, we make an additional assumption: Tt(f) → f in L2, as t → 0+.
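In summary, the operators $T_t$ satisfy the following properties:
- (i) $T_0$ is the identity
- (ii) $T_{t+s} = T_t \circ T_s$, for all $s, t \ge 0$
- (iii) $\|T_t f\|_p \le \|f\|_p$, for $1 \le p \le \infty$
- (iv) $T_t$ is a self-adjoint operator on $L^2(X)$
- (v) $T_t(f) \to f$ in $L^2$, as $t \to 0^+$
- (vi) $T_t(f) \ge 0$ if $f \ge 0$
- (vii) $T_t(1) = 1$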
See Stein’s book [1], in which the author derives various harmonic analysis results for symmetric diffusion semigroups without explicitly using kernels.
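As a sanity check on properties (i)–(vii), the following minimal sketch uses a toy example that is not taken from the paper: the heat semigroup on a two-point space with counting measure, whose kernel is exp(tL) for the hypothetical generator L = [[-1, 1], [1, -1]]. The names `kernel` and `diffuse` are ours, and the example only illustrates the properties; it proves nothing about the general setting.

```python
import math

def kernel(t):
    """Kernel rho_t(x, y) on the two-point space {0, 1} (assumed toy model).

    rho_t = exp(t * L) with L = [[-1, 1], [1, -1]], which has eigenvalues 0 and -2.
    """
    a = math.exp(-2.0 * t)
    return [[(1 + a) / 2, (1 - a) / 2],
            [(1 - a) / 2, (1 + a) / 2]]

def diffuse(t, f):
    """T_t f(x) = sum over y of rho_t(x, y) f(y), with counting measure."""
    rho = kernel(t)
    return [sum(rho[x][y] * f[y] for y in range(2)) for x in range(2)]

f = [3.0, -1.0]
s, t = 0.3, 0.5

# (i) T_0 is the identity.
identity_at_zero = diffuse(0.0, f) == f
# (ii) Semigroup property: T_s(T_t f) agrees with T_{s+t} f up to rounding.
semigroup_gap = max(abs(u - v)
                    for u, v in zip(diffuse(s, diffuse(t, f)), diffuse(s + t, f)))
# (iii) with p = infinity: the sup norm does not increase.
sup_contraction = max(abs(v) for v in diffuse(t, f)) <= max(abs(v) for v in f)
# (iv) Symmetry of the kernel.
symmetric = kernel(t)[0][1] == kernel(t)[1][0]
# (v) T_t f is close to f for small t.
small_time_gap = max(abs(u - v) for u, v in zip(diffuse(1e-9, f), f))
# (vi) Positivity and (vii) preservation of constants.
positivity = all(v >= 0 for v in diffuse(t, [2.0, 0.0]))
constants_preserved = all(abs(c - 1.0) < 1e-12 for c in diffuse(t, [1.0, 1.0]))
```

On a finite space every property reduces to a statement about a symmetric stochastic matrix, which is why the checks are so short.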
3. A Natural Diffusion Distance
We now define our diffusion distance.
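Definition 1. For a bounded, nonnegative, increasing function $g$ on $(0,1]$, with $g(t) \to 0$ as $t \to 0^+$, and $g$ strictly positive on the interval $(0,1]$, define the distance $D_g$ by
$$D_g(x, y) \triangleq \sup_{0 < t \le 1} g(t)\,\|\rho_t(x, \cdot) - \rho_t(y, \cdot)\|_{L^1(X)}.$$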
It is clear that the distance Dg satisfies the triangle inequality. Note that the restriction that g is bounded in the above supremum has the effect of making all “large” distances comparable to a constant, but this is not a drawback for smoothness considerations.
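The link between this distance and a Lipschitz condition on diffused functions can be sketched in a few lines. The sketch assumes that $D_g(x, y)$ is the supremum over $0 < t \le 1$ of $g(t)$ times the $L^1$ distance between the kernels $\rho_t(x, \cdot)$ and $\rho_t(y, \cdot)$, which is the form the proofs in Sections 3 and 4 rely on; the norm notation is ours.

```latex
% Assumed form of the distance:
%   D_g(x,y) = \sup_{0<t\le 1} g(t)\,\|\rho_t(x,\cdot)-\rho_t(y,\cdot)\|_{L^1(X)} .
% Lipschitz bound for the diffused version of a bounded function f:
\begin{align*}
|T_t f(x) - T_t f(y)|
  &= \Big| \int_X \big(\rho_t(x,z) - \rho_t(y,z)\big) f(z)\,dz \Big| \\
  &\le \|f\|_\infty \, \|\rho_t(x,\cdot) - \rho_t(y,\cdot)\|_{L^1(X)} \\
  &\le \frac{\|f\|_\infty}{g(t)} \, D_g(x,y), \qquad 0 < t \le 1 .
\end{align*}
% Triangle inequality: for each fixed t in (0,1],
\begin{align*}
g(t)\,\|\rho_t(x,\cdot) - \rho_t(z,\cdot)\|_{L^1(X)}
  &\le g(t)\,\|\rho_t(x,\cdot) - \rho_t(y,\cdot)\|_{L^1(X)}
     + g(t)\,\|\rho_t(y,\cdot) - \rho_t(z,\cdot)\|_{L^1(X)} \\
  &\le D_g(x,y) + D_g(y,z),
\end{align*}
% and taking the supremum over 0 < t \le 1 gives D_g(x,z) \le D_g(x,y) + D_g(y,z).
```

In this reading, $T_t f$ is Lipschitz with respect to $D_g$ with constant at most $\|f\|_\infty / g(t)$, so the weight $g$ controls how fast the Lipschitz constant may grow as $t \to 0^+$.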
Proposition 2. For $|x - y| \le 1$, $D_g(x, y) \sim |x - y|^{\alpha/\beta}$ if $\alpha \le \beta$, and $D_g(x, y) \sim |x - y|$ if $\alpha \ge \beta$. For $|x - y| > 1$, $D_g(x, y) \sim 1$.
Proof. Using the notation for the special case above, we need to estimate $\sup_{0 < t \le 1} t^{\alpha} h(t^{-\beta}(y - x))$.
Let us first consider the situation when $|x - y| > 1$. Then, for $0 < t \le 1$, $t^{-\beta}|y - x| \ge 1$, so $\sup_{0 < t \le 1} t^{\alpha} h(t^{-\beta}(y - x)) \sim 1$ using the estimate for $h$ mentioned before the proposition.
Next, consider the situation when $|x - y| \le 1$. Let $t_0 = |x - y|^{1/\beta}$. Note that $0 < t_0 \le 1$.
When $t_0 \le t \le 1$, we have that $t^{-\beta}|x - y| \le 1$, so
When $0 < t \le t_0$, we have that $t^{-\beta}|x - y| \ge 1$, so
Combining the above discussions for the two ranges of values of t, the result follows.
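The two suprema in the case $|x - y| \le 1$ can be worked out as follows. This is a sketch under the assumption that the estimate for $h$ has the form $h(z) \sim \min(|z|, 1)$, which is what the two regimes of the proof use; it also uses $\alpha > 0$ and $t_0 = |x - y|^{1/\beta}$.

```latex
% Assumption: h(z) \sim \min(|z|, 1), \alpha > 0, and t_0 = |x-y|^{1/\beta} \le 1.
\begin{align*}
\sup_{t_0 \le t \le 1} t^{\alpha} h\big(t^{-\beta}(y-x)\big)
  &\sim \sup_{t_0 \le t \le 1} t^{\alpha-\beta}\,|x-y|
   = \begin{cases}
       |x-y|, & \alpha \ge \beta \quad (\text{attained at } t = 1), \\
       t_0^{\alpha-\beta}\,|x-y| = |x-y|^{\alpha/\beta}, & \alpha \le \beta \quad (\text{attained at } t = t_0),
     \end{cases} \\
\sup_{0 < t \le t_0} t^{\alpha} h\big(t^{-\beta}(y-x)\big)
  &\sim \sup_{0 < t \le t_0} t^{\alpha} = t_0^{\alpha} = |x-y|^{\alpha/\beta}.
\end{align*}
% Since |x-y| \le 1, we have |x-y|^{\alpha/\beta} \le |x-y| exactly when \alpha \ge \beta.
% Hence the larger of the two suprema is |x-y| if \alpha \ge \beta
% and |x-y|^{\alpha/\beta} if \alpha \le \beta.
```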
Thus, for this special case of $X = \mathbb{R}^n$, $g(t) = t^{\alpha}$, and $\rho_t(x, y) = t^{-n\beta}\phi(t^{-\beta}(x - y))$, which includes both the heat kernel and the Poisson kernel, our definition of diffusion distance gives (local) Euclidean or sub-Euclidean distance (depending on the relative sizes of $\alpha$ and $\beta$). This result seems appropriate.
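The scaling in Proposition 2 can be checked numerically in one dimension. The sketch below assumes the heat kernel normalization rho_t(x, y) = (4πt)^(-1/2) exp(-(x - y)^2 / (4t)), so that β = 1/2, and assumes that the distance is the supremum over 0 < t ≤ 1 of t^α times the L1 distance between kernels. For two Gaussians with equal variance that L1 distance has a closed form in terms of the error function. The function names and the grid parameters are ours.

```python
import math

def l1_gap(d, t):
    """L1 distance between the 1-D heat kernels rho_t(x, .) and rho_t(y, .), with |x - y| = d.

    Assumes rho_t(x, y) = (4*pi*t)**-0.5 * exp(-(x - y)**2 / (4*t)), i.e. beta = 1/2.
    Both kernels are Gaussians with variance 2*t, which gives 2*erf(d / (4*sqrt(t))).
    """
    return 2.0 * math.erf(d / (4.0 * math.sqrt(t)))

def diffusion_distance(d, alpha, n=4000, t_min=1e-12):
    """Approximate the sup over 0 < t <= 1 of t**alpha * l1_gap(d, t) on a log-spaced grid."""
    log_min = math.log(t_min)
    best = 0.0
    for i in range(n):
        t = math.exp(log_min * (1.0 - i / (n - 1)))  # runs from t_min up to exactly 1
        best = max(best, t ** alpha * l1_gap(d, t))
    return best

# alpha = beta = 1/2: the distance should be comparable to |x - y|.
ratio_linear = [diffusion_distance(d, 0.5) / d for d in (1e-1, 1e-2, 1e-3)]
# alpha = 1/4 < beta = 1/2: the distance should be comparable to |x - y|**(alpha/beta) = |x - y|**0.5.
ratio_sqrt = [diffusion_distance(d, 0.25) / math.sqrt(d) for d in (1e-1, 1e-2, 1e-3)]
```

Under these assumptions the entries of `ratio_linear` should sit near 1/√π ≈ 0.564, and the entries of `ratio_sqrt` should be nearly equal to one another, which is the behavior the proposition predicts for small |x − y|.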
4. A Geometric Assumption about the Measure on X
We then have the following equivalence of topologies induced by the sets B(x0, ϵ) and N(x0, t, ϵ):
Proposition 3. For any x0 ∈ X and any ϵ > 0, there exist t > 0 and δ > 0 such that N(x0, t, δ)⊆B(x0, ϵ). Conversely, for any x0 ∈ X, t > 0, and ϵ > 0, there exists a δ > 0 such that B(x0, δ)⊆N(x0, t, ϵ).
Proof. Fix an x0 ∈ X and an ϵ > 0. We first show that there exist t > 0 and δ > 0 such that N(x0, t, δ)⊆B(x0, ϵ).
Since we assumed that $g(t) \to 0$ as $t \to 0^+$ for the function $g$ used in defining the distance $D_g$, there exists a $t$ with $0 < t < 1$ and $g(t) < \epsilon/4$. Let $\delta = \epsilon/(2M)$, where $M = \sup_{0 < s \le 1} g(s) = g(1)$. Now, pick an arbitrary $x \in N(x_0, t, \delta)$.
For $0 < s \le t$, since $g$ is increasing, we see that
Now consider the case when $t \le s \le 1$. Note that, by definition of $N(x_0, t, \delta)$, we have that $\|\rho_t(x, \cdot) - \rho_t(x_0, \cdot)\|_{L^1(X)} < \delta$. Then, for this range of $s$, we observe that
We conclude (see (8)) that
For the converse, fix x0 ∈ X, t > 0 and ϵ > 0. We will show that there exists a δ > 0 such that B(x0, δ)⊆N(x0, t, ϵ).
Since, for any $x$, $\|\rho_s(x, \cdot) - \rho_s(x_0, \cdot)\|_{L^1(X)}$ is decreasing in $s$ (see (11)), we clearly have that $N(x_0, s_1, \epsilon) \subseteq N(x_0, s_2, \epsilon)$ for any $0 < s_1 < s_2$. Thus, we may assume $0 < t < 1$. Let $\delta = \epsilon g(t)$. Then, for any $x \in B(x_0, \delta)$, we have that $D_g(x, x_0) < \epsilon g(t)$. Hence, using Definition 1 of the distance $D_g$, we obtain
Returning to our assumption that, for any x0 ∈ X and any ϵ > 0, B(x0, ϵ) has positive measure, Proposition 3 shows that it is equivalent to require the following: for any x0 ∈ X, t > 0, and ϵ > 0, the set N(x0, t, ϵ) has positive measure. Note that the definition of the sets N(x0, t, ϵ) is more “universal” than that of the balls B(x0, ϵ), since the former do not involve the function g.
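The two inclusions of Proposition 3 can be sketched as follows. The sketch assumes that $N(x_0, t, \epsilon)$ is the set of $x$ whose kernel $\rho_t(x, \cdot)$ lies within $\epsilon$ of $\rho_t(x_0, \cdot)$ in $L^1$, that $B(x_0, \epsilon)$ is the $D_g$-ball, and that the $L^1$ gap between kernels is decreasing in time; the constants are those chosen in the proof.

```latex
% Assumptions:
%   N(x_0,t,\epsilon) = \{x : \|\rho_t(x,\cdot)-\rho_t(x_0,\cdot)\|_{L^1(X)} < \epsilon\},
%   B(x_0,\epsilon)   = \{x : D_g(x,x_0) < \epsilon\},
%   s \mapsto \|\rho_s(x,\cdot)-\rho_s(x_0,\cdot)\|_{L^1(X)} is decreasing and bounded by 2.
% First inclusion: x \in N(x_0,t,\delta), with g(t) < \epsilon/4 and \delta = \epsilon/(2M).
\begin{align*}
0 < s \le t:&\quad g(s)\,\|\rho_s(x,\cdot)-\rho_s(x_0,\cdot)\|_{L^1(X)}
      \le 2\,g(t) < \tfrac{\epsilon}{2}, \\
t \le s \le 1:&\quad g(s)\,\|\rho_s(x,\cdot)-\rho_s(x_0,\cdot)\|_{L^1(X)}
      \le M\,\|\rho_t(x,\cdot)-\rho_t(x_0,\cdot)\|_{L^1(X)} < M\delta = \tfrac{\epsilon}{2},
\end{align*}
% hence D_g(x,x_0) \le \epsilon/2 < \epsilon, i.e. x \in B(x_0,\epsilon).
% Converse: x \in B(x_0,\delta), with \delta = \epsilon\,g(t) and 0 < t < 1.
\begin{equation*}
g(t)\,\|\rho_t(x,\cdot)-\rho_t(x_0,\cdot)\|_{L^1(X)} \le D_g(x,x_0) < \epsilon\,g(t)
\;\Longrightarrow\;
\|\rho_t(x,\cdot)-\rho_t(x_0,\cdot)\|_{L^1(X)} < \epsilon .
\end{equation*}
```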
The assumption that, for any x0 ∈ X, t > 0, and ϵ > 0, the set N(x0, t, ϵ) has positive measure appears to us to be a very natural, and mild, one. In words, this requirement is saying that, for any time t > 0 and any ϵ > 0, the set of points in our space X which have not diffused more than ϵ away (in the L1 sense) from the diffused point x0, at time t, is not “thin” with respect to the underlying measure on X. This assumption seems reasonable in both the discrete case (each point has positive mass, and x = x0 is “enough”) and the continuous case (every point x0 has “many” arbitrarily close points in the sense of diffusion).
5. Local Convergence Is Equivalent to Local Equicontinuity
In this section, we define local convergence and local equicontinuity for our situation and show that the two concepts are equivalent under our assumptions.
In what follows, Tt is a symmetric diffusion operator as defined in Section 2.
Definition 4. Let f ∈ Lp, 1 ≤ p ≤ ∞. Note that f is actually an equivalence class of functions on the space X. Suppose there exists a particular representative of this equivalence class, which we will also call f, such that this representative f is defined at every point of X, and for every ϵ > 0, there exist t0 > 0 and δ > 0 so that |Ttf(x) − f(x)| < ϵ, for all t with 0 < t ≤ t0 and all x ∈ B(x0, δ). We then say Ttf converges to f locally at x0.
We also make the following.
Definition 5. Let f ∈ Lp, 1 ≤ p ≤ ∞. Suppose there exists a particular representative of the equivalence class specified by f, which we will also call f, such that this representative f is defined at every point of X, and for every ϵ > 0, there exist t0 > 0 and δ > 0 with the property that, for all x ∈ B(x0, δ), we have |f(x) − f(x0)| < ϵ and, for all t with 0 < t ≤ t0, |Ttf(x) − Ttf(x0)| < ϵ. We then say the family {Ttf} is locally equicontinuous (in t) at x0.
Our main result is the following.
Proposition 6. For f ∈ L2∩L∞ and any x0 ∈ X, the following are equivalent:
- (i) Ttf converges to (the representative) f locally at x0
- (ii) The family {Ttf} is locally equicontinuous at x0
Moreover, if a representative f satisfies one of these statements, the same representative satisfies the other statement.
Proof. We first show that local convergence at x0 implies local equicontinuity at x0. We thus begin by assuming that Ttf converges to a representative f locally at x0.
First, we establish continuity of this representative f at x0. Fix ϵ > 0. By the assumption, there exist 1 ≥ t0 > 0 and δ > 0 such that |Ttf(x) − f(x)| < ϵ/3, for all t with 0 < t ≤ t0 and all x ∈ B(x0, δ). Then, for any x ∈ B(x0, δ), using the definition of the distance Dg, we see that
Next, note that
Conversely, we now show that local equicontinuity at x0 implies local convergence at x0. We thus begin by assuming that the family {Ttf} is locally equicontinuous at x0.
Fix ϵ > 0. By the assumption, there exist 1 ≥ t0 > 0 and δ > 0 such that, for the representative f, |f(x) − f(x0)| < ϵ/5 and |Ttf(x) − Ttf(x0)| < ϵ/5, for all x ∈ B(x0, δ) and all t with 0 < t ≤ t0. In Section 4, we made the assumption that all balls of positive radius have positive measure. By Stein’s Maximal Theorem (see Chapter III, §3 in [1]), Ttf → f a.e., as t → 0+. So there is a y0 ∈ B(x0, δ) such that Ttf(y0) → f(y0), as t → 0+. Now, for x ∈ B(x0, δ),
We estimate the first term on the right hand side of the above inequality as follows:
Thus, for all t with 0 < t ≤ min(t0, t1), and for any x ∈ B(x0, δ), we obtain that |Ttf(x) − f(x)| < ϵ, which concludes the proof of the converse.
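The chains of inequalities behind both directions of the proof can be sketched as follows. The sketch assumes the Lipschitz-type bound $|T_t f(x) - T_t f(x_0)| \le \|f\|_\infty D_g(x, x_0)/g(t)$ for $0 < t \le 1$, which follows if $D_g$ is the supremum of $g(t)$ times the $L^1$ distance between kernels; $y_0$ denotes a point of $B(x_0, \delta)$ at which $T_t f(y_0) \to f(y_0)$, and $t_1 > 0$ is a hypothetical threshold for that convergence.

```latex
% (i) => (ii). Continuity of f at x_0, using |T_t f - f| < \epsilon/3 on B(x_0,\delta), 0 < t \le t_0 \le 1:
\begin{align*}
|f(x)-f(x_0)|
 &\le |f(x)-T_{t_0}f(x)| + |T_{t_0}f(x)-T_{t_0}f(x_0)| + |T_{t_0}f(x_0)-f(x_0)| \\
 &< \tfrac{2\epsilon}{3} + \frac{\|f\|_\infty}{g(t_0)}\,D_g(x,x_0),
\end{align*}
% which is at most \epsilon once D_g(x,x_0) is small enough relative to \epsilon\,g(t_0)/\|f\|_\infty.
% Equicontinuity of the family then follows from
\begin{equation*}
|T_tf(x)-T_tf(x_0)| \le |T_tf(x)-f(x)| + |f(x)-f(x_0)| + |f(x_0)-T_tf(x_0)| .
\end{equation*}
% (ii) => (i). With each equicontinuity bound equal to \epsilon/5 on B(x_0,\delta), 0 < t \le t_0:
\begin{align*}
|T_tf(x)-f(x)|
 &\le |T_tf(x)-T_tf(y_0)| + |T_tf(y_0)-f(y_0)| + |f(y_0)-f(x)|, \\
|T_tf(x)-T_tf(y_0)|
 &\le |T_tf(x)-T_tf(x_0)| + |T_tf(x_0)-T_tf(y_0)| < \tfrac{2\epsilon}{5}, \\
|f(y_0)-f(x)|
 &\le |f(y_0)-f(x_0)| + |f(x_0)-f(x)| < \tfrac{2\epsilon}{5},
\end{align*}
% and |T_tf(y_0)-f(y_0)| < \epsilon/5 for 0 < t \le t_1, so the total is < \epsilon for 0 < t \le \min(t_0,t_1).
```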
In the proof above, we used Stein’s Maximal Theorem (see Chapter III, §3 in [1]) to state that Ttf → f a.e., as t → 0+. Stein’s a.e. convergence result, for f ∈ L2 say, is the main place in our paper where the symmetry of the operators Tt is needed: Stein requires symmetry to prove his Maximal Theorem.
We immediately have the following.
Corollary 7. Let $f \in L^2 \cap L^\infty$. Fix $t_0 > 0$. Then, for any $x_0 \in X$, $T_{t+t_0} f$ converges locally to $T_{t_0} f$ at $x_0$, as $t \to 0^+$.
Proof. By Proposition 6, it suffices to show that the family $\{T_{t+t_0} f\}$ is locally equicontinuous at $x_0$. Fix $\epsilon > 0$. Let $G(t) = g(t)$ for $0 < t \le 1$ and $G(t) = g(1)$ for $t > 1$. For any $t \ge 0$, we have that
Using our notation, Stein in [1] mentions that $T_{t+t_0} f(x) \to T_{t_0} f(x)$, as $t \to 0^+$, for almost all $x$, since he proves that $T_t f$ is a real-analytic function of $t > 0$ for almost all $x$. Corollary 7 extends Stein’s result (under our assumption discussed in Section 4) to show local convergence with respect to the distance $D_g$.
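The equicontinuity estimate used for Corollary 7 can be sketched as follows, assuming that the $L^1$ gap between kernels is decreasing in time and that $D_g$ is the supremum of $g(t)$ times that gap over $0 < t \le 1$. The function $G$ is the extension of $g$ defined in the proof.

```latex
% Assumptions: s \mapsto \|\rho_s(x,\cdot)-\rho_s(x_0,\cdot)\|_{L^1(X)} is decreasing,
% t_0 > 0 is fixed, and G(t_0) = g(\min(t_0, 1)).
\begin{align*}
|T_{t+t_0}f(x) - T_{t+t_0}f(x_0)|
 &\le \|f\|_\infty\,\|\rho_{t+t_0}(x,\cdot)-\rho_{t+t_0}(x_0,\cdot)\|_{L^1(X)} \\
 &\le \|f\|_\infty\,\|\rho_{\min(t_0,1)}(x,\cdot)-\rho_{\min(t_0,1)}(x_0,\cdot)\|_{L^1(X)} \\
 &\le \frac{\|f\|_\infty}{G(t_0)}\,D_g(x,x_0), \qquad t \ge 0 .
\end{align*}
% The bound is uniform in t \ge 0, so choosing D_g(x,x_0) small relative to
% \epsilon\,G(t_0)/\|f\|_\infty makes the left-hand side smaller than \epsilon.
```

The case $t = 0$ gives continuity of $T_{t_0} f$ at $x_0$, which is the first half of the definition of local equicontinuity.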
6. Conclusions and Future Work
In this paper, we have defined a diffusion distance which is natural if one imposes a reasonable Lipschitz condition on diffused versions of arbitrary bounded functions. We have next shown that the mild assumption that balls of positive radius have positive measure is equivalent to a similar, and even milder-looking, geometric demand. In the main part of the paper, we have established that local convergence of Ttf to (a representative) f at a point is equivalent to local equicontinuity of the family {Ttf} at that point.
We plan to continue exploring for which (diffusion) distances the convergence in (32) holds and an estimate can be obtained.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
We are both grateful and indebted to Raphy Coifman for his continued willingness to discuss mathematics with us. The first author was partially supported by Faculty Development Funding from Ramapo College of New Jersey.
Appendix
Proposition 8. Let be a metric on X with the following properties:
- (1)
- (2) X is separable with respect to this metric, i.e., it contains a countable dense subset
- (3) There exists a δ > 0 so that m[B(x, δ)] < ∞ for every x ∈ X (the bound need not be uniform in x). Here, m[B(x, δ)] denotes the measure of the ball B(x, δ)
Then,
To prove the proposition, we first establish the following.
Lemma 9. For any x0 ∈ X, if r > 0 is such that m[B(x0, r)] < ∞, then
Proof. Let , where χ(y) is the characteristic function of the ball B(x0, r). Since m[B(x0, r)] < ∞, we see that f ∈ L2(X). Using Stein’s Maximal Theorem (see Chapter III, §3 in [1]), we conclude that
To this end, we apply Stein’s Maximal Theorem to the L2 function χ(y) to see that there is a set D⊆B(x0, r), with m[B(x0, r)∖D] = 0, so that
Since we assumed that , we obtain that, for every x ∈ D,
Combining (A.4) and (A.8), we conclude that, for x ∈ C∩D⊆B(x0, r),
Open Research
Data Availability
No data were used to support this study.