Stability Analysis of Learning Algorithms for Ontology Similarity Computation
Abstract
Ontology, as a useful tool, is widely applied in many areas such as social science, computer science, and medical science. Ontology concept similarity calculation is a key ingredient of the algorithms in these applications. A recent approach, instead of relying on pairwise computations, is based on a function that maps the vertex set of an ontology graph to the real numbers: the vertices are ranked through their scores, and similarity is read off from score differences. Learning such a ranking function from a training sample, which contains a subset of the vertices of the ontology graph, is therefore an important and essential problem, and the k-partite ranking algorithm in particular is well suited to several ontology problems. A good ranking function should make few ranking mistakes and be stable. For ranking algorithms that are stable, we establish generalization bounds via notions of algorithmic stability. We also show that kernel-based ranking algorithms stated as regularization schemes in reproducing kernel Hilbert spaces satisfy these stability conditions and hence have good generalization ability.
1. Introduction and Motivations
The study of ontology deals with questions concerning what entities exist and how such entities can be grouped, related within a hierarchy, and subdivided according to similarities and differences. The developed tools have been widely applied in medicine, biology, and social science. In computer science, ontology is defined as a model for sharing formal concepts and has been applied in intelligent information integration, cooperative information systems, information retrieval, electronic commerce, and knowledge management. After a decade’s development, ontology technology has matured as an effective model of hierarchical structure and semantics for concepts, supported by systematic and comprehensive engineering theory, representation, and construction tools.
Ontology similarity computation is an essential part of many practical applications. In information retrieval, it has been used to compute semantic similarity and to search for concepts. We take a graph-theoretic approach and represent an ontology by a weighted graph G = (V, E, w). In this setting, V = {v1, …, vn} is the (finite) set of vertices corresponding to concepts or objects of the ontology, E ⊂ V × V is a set of edges, and w : E → ℝ+ is a weight function. For two vertices vi and vj representing two concepts, the weight w(vi, vj) measures their similarity in the ontology.
Example 1. In some applications of ontology similarity computation, the weight function w takes values in [0, 1]. The case w(vi, vj) = 1 then means that vi and vj represent the same concept, while w(vi, vj) = 0 means that the two concepts have no similarity. In information retrieval, with a threshold parameter 0 < ϵ < 1, when one tries to find information related to the concept vi, all concepts vj satisfying w(vi, vj) > ϵ are returned; these are the concepts with high similarity to vi.
Traditional methods for ontology similarity computation are based on pairwise similarity calculation. Their computational complexity is high, and they require the selection of many parameters that are not very intuitive. In this paper, we use a learning theory approach. The idea is to learn a scoring function f : V → ℝ and then to determine the similarity between vertices (concepts) vi and vj by the difference |f(vi) − f(vj)|: the smaller the difference, the higher the similarity. Formally, vi and vj are regarded as similar if and only if |f(vi) − f(vj)| falls below a prescribed threshold. This inspiring approach was introduced from the viewpoint of ranking in [1], where a ranking algorithm is used to learn from samples a scoring function f with small ranking error. The method was employed in the ontology setting in [2], which demonstrated its accuracy and efficiency. Another possible way to learn such a function f is via the graph Laplacian, taking an eigenvector associated with its second smallest eigenvalue; see [3–6] for details. That method requires a positive definiteness condition on a similarity matrix which is hard to check in our setting, and when the graph is large its computational complexity is high.
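To make the two viewpoints concrete, the following sketch contrasts pairwise threshold retrieval (Example 1) with score-based retrieval using a learned function f. The graph, the scores, and the thresholds are illustrative assumptions of ours, not data or code from this paper.

```python
# Illustrative sketch (not the paper's implementation): similarity queries on a
# small ontology graph, first via pairwise weights w, then via a scoring function f.

# Hypothetical weights w(vi, vj) in [0, 1]; only listed pairs carry any similarity.
w = {("v1", "v2"): 0.9, ("v1", "v3"): 0.4, ("v2", "v3"): 0.5, ("v3", "v4"): 0.8}

def pairwise_retrieve(v, eps):
    """Return all vertices vj with w(v, vj) > eps (Example 1)."""
    return sorted({b for (a, b), s in w.items() if a == v and s > eps} |
                  {a for (a, b), s in w.items() if b == v and s > eps})

# Hypothetical learned scores f : V -> R; smaller |f(vi) - f(vj)| means higher similarity.
f = {"v1": 0.10, "v2": 0.15, "v3": 0.60, "v4": 0.70}

def score_retrieve(v, delta):
    """Return all vertices vj != v with |f(v) - f(vj)| < delta."""
    return sorted(u for u in f if u != v and abs(f[v] - f[u]) < delta)

print(pairwise_retrieve("v1", eps=0.5))   # ['v2']
print(score_retrieve("v1", delta=0.2))    # ['v2']
```

The score-based query needs only the n values f(v1), …, f(vn) rather than all pairwise weights, which reflects the complexity advantage mentioned above.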
In this paper, we explore the learning theory approach to ontology similarity computation in the setting where the ontology graph is a tree, that is, a connected graph without cycles, so that there is a unique path between any two vertices. The tree structure imposes restrictions on the similarity of vertices (concepts). For example, assign a top vertex vtop, let it be the root, and denote by k the degree (the number of edges incident to a vertex) of the top vertex. Let NG(vtop) = {v1, v2, …, vk} be the neighbor set of vtop. If the path from a vertex to vtop passes through vi, then that vertex belongs to branch i. Thus, the tree has k branches, and no edge joins two vertices belonging to different branches. Concepts in the same branch of the tree should have higher similarity than concepts in different branches. This observation motivates us to apply the k-partite ranking algorithm [7], in which the k parts correspond to k classes of vertices with k rates. The rate values of all classes are decided by experts. Intuitively, a vertex of a higher rate b is ranked higher than any vertex of rate a whenever 1 ≤ a < b ≤ k. Thus, the k-partite ranking algorithm is a reasonable choice for learning a similarity function on an ontology graph with a tree structure.
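The branch decomposition just described is straightforward to compute. The sketch below assigns every vertex of a small, made-up tree ontology to the branch determined by the neighbor of vtop through which its unique path to the root passes; the resulting k parts are the candidate classes for the k-partite ranking setting. The tree itself is a hypothetical example.

```python
# Illustrative sketch: split a tree ontology into the k branches rooted at the
# neighbors of the top vertex; vertices in different branches share no edge.
from collections import deque

# Hypothetical adjacency list of a tree ontology (undirected).
tree = {
    "top": ["a", "b"],
    "a": ["top", "a1", "a2"], "a1": ["a"], "a2": ["a"],
    "b": ["top", "b1"], "b1": ["b", "b2"], "b2": ["b1"],
}

def branches(adj, root="top"):
    """Map each neighbor of the root to the set of vertices in its branch (BFS)."""
    parts = {}
    for child in adj[root]:
        seen, queue, part = {root, child}, deque([child]), {child}
        while queue:
            u = queue.popleft()
            for nxt in adj[u]:
                if nxt not in seen:
                    seen.add(nxt)
                    part.add(nxt)
                    queue.append(nxt)
        parts[child] = part
    return parts

print(branches(tree))
# e.g. {'a': {'a', 'a1', 'a2'}, 'b': {'b', 'b1', 'b2'}}  -> k = 2 branches
```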
The main contribution of this paper is to formulate certain ontology computations as a k-partite ranking problem and to conduct a stability analysis of the algorithms under mild conditions, which leads to useful error bounds for ontology applications.
The rest of this paper is organized as follows. The setting and main results are given in the next section. Notions of stability for k-partite ranking algorithms are introduced in Section 3. Generalization bounds for stable learning algorithms are established in Section 4. The stability and generalization bounds for learning algorithms stated as regularization schemes in reproducing kernel Hilbert spaces are discussed in Section 5.
2. Formal Setting and Main Results
Now, we state our learning algorithm for ontology similarity computation.
Let V = {v1, …, vn} be the finite set of vertices of an ontology graph. It is divided into k disjoint subsets V1, …, Vk corresponding to the k rates. Let 𝒟 be a probability measure on V.
The performance of a ranking function f : V → ℝ can be measured by the following concept.
Definition 2. A ranking loss function is a function l : ℝ^V × V × V → ℝ+ ∪ {0} that assigns, to each f : V → ℝ and v, v′ ∈ V, a nonnegative real number l(f, v, v′) interpreted as the loss of f in its relative ranking of v and v′. The expected k-partite ranking error on the ontology graph of a ranking function f : V → ℝ associated with the ranking loss function l is defined as
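The displayed formula of Definition 2 is not reproduced in this version of the text. A natural reading, consistent with the bipartite case in [1, 8] and with the k-partite setting of [7], is sketched below; here 𝒟a denotes the conditional distribution of 𝒟 on the rate-a part Va, and the exact normalization is our assumption rather than a quotation.

```latex
\operatorname{er}_{\ell}(f) \;=\; \sum_{1 \le a < b \le k}
  \mathbb{E}_{\,v \sim \mathcal{D}_b,\; v' \sim \mathcal{D}_a}
  \bigl[\ell(f, v, v')\bigr],
```

that is, f is penalized on a pair drawn from two different rates whenever it misranks the higher-rate vertex v against the lower-rate vertex v′.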
Example 3. One commonly used ranking loss function is the hinge ranking loss defined as
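The display for the hinge ranking loss is likewise elided. One common form, consistent with the margin function τ(vi, vj) referred to in Section 4 (taking τ ≡ 1 recovers the unit-margin hinge loss), is the following sketch of our reading rather than a quotation:

```latex
\ell_{\mathrm{hinge}}(f, v, v') \;=\;
  \Bigl(\tau(v, v') - \bigl(f(v) - f(v')\bigr)\Bigr)_{+},
\qquad (x)_{+} := \max\{x, 0\},
```

where v is the vertex expected to be ranked higher and τ(v, v′) ≥ 0 is a prescribed margin.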
Learning algorithms are implemented with a sample 𝒯 of size M, called a preference graph, which is assumed here to be drawn independently according to 𝒟. It is divided into k parts {𝒯1, …, 𝒯k}, where 𝒯a = {ti : ti ∈ Va} consists of the sample points of rate a.
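The corresponding empirical l-error of f on the preference graph 𝒯, whose display is also not reproduced, can be read as the average loss over pairs of sample points taken from parts of different rates; the per-pair normalization below is again our assumption (na = |𝒯a|, and ti^a denotes the i-th point of 𝒯a):

```latex
\widehat{\operatorname{er}}_{\ell}(f; \mathcal{T}) \;=\;
  \sum_{1 \le a < b \le k} \frac{1}{n_a n_b}
  \sum_{i=1}^{n_b} \sum_{j=1}^{n_a}
  \ell\bigl(f,\, t^{b}_{i},\, t^{a}_{j}\bigr).
```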
In [8], Agarwal and Niyogi studied algorithmic stability in a general setting where the training examples take labels y ∈ [0, M1] for some M1 > 0, and the goal is a ranking function that ranks future instances with larger labels higher than those with smaller labels. Our setting here is more specific. The learner is given a preference graph 𝒯 consisting of k disjoint parts corresponding to the k classes of vertices, and every part has a rate value. The target ranking function ranks future instances in higher-rate parts above those in lower-rate parts.
One point we need to emphasize is that we slightly abuse terminology for the sake of readability. If the ranking function f is not associated with an RKHS (for instance, in Lemma 9 and Theorems 10 and 12), then the second term on the right-hand side of (5) vanishes.
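Scheme (5) itself is not displayed in this version of the text. Given the remark above that its second term vanishes when f is not associated with an RKHS, a natural reading is the regularized empirical risk minimization

```latex
f_{\mathcal{T}} \;=\; \arg\min_{f \in \mathcal{H}_K}
  \Bigl\{\, \widehat{\operatorname{er}}_{\ell}(f; \mathcal{T})
  \;+\; \lambda \|f\|_{K}^{2} \,\Bigr\},
\qquad \lambda > 0,
```

which is the form analyzed in Section 5.2; for a general hypothesis space ℱ, the penalty λ‖f‖K2 is replaced by λN(f) as in Lemma 13.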
Our error analysis provides a learning rate for algorithm (5) when the ranking loss is σ-admissible.
Definition 4. Let l be a ranking loss, σ > 0, and ℱ a class of real-valued functions on V. We say that l is σ-admissible with respect to ℱ if for any f1, f2 ∈ ℱ and v, v′ ∈ V,
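The admissibility condition itself is elided here. In the ranking-stability literature [8, 17] it is the Lipschitz-type requirement

```latex
\bigl|\ell(f_1, v, v') - \ell(f_2, v, v')\bigr|
\;\le\; \sigma \Bigl( \bigl|f_1(v) - f_2(v)\bigr|
  + \bigl|f_1(v') - f_2(v')\bigr| \Bigr),
```

which we take to be the intended reading.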
Let us state the estimate of learning rates which will be proved in Section 5.
Theorem 5. Let ℋK be an RKHS such that K(v, v) ≤ κ2 < ∞ for all v ∈ V. Let l be a ranking loss, σ-admissible with respect to ℋK, bounded by some B > 0, and such that l(f, v, v′) is convex with respect to f. Let be a fixed function in ℋK satisfying
From Theorem 5, we see that if λ → 0 and (e.g., λ = M−1/4), then erl(f𝒯) converges with confidence to . The quantity is well understood in the learning theory literature (e.g., [11–14]).
3. Stability Analysis
An algorithm is stable if changing a single point in the training set yields only a small change in its output. It is natural to regard a good ranking algorithm as one with good stability; that is, a mild change in the sample should not lead to a large change in the ranking function. Analyses of the stability of ranking algorithms are given in [1, 8, 15].
Let na = |𝒯a| and let be the ia-th element in 𝒯a for 1 ≤ a ≤ k, 1 ≤ ia ≤ na. Let be the sequence obtained by replacing in 𝒯 by a new sampling point ta of rate a. We define some notions of stability for k-partite ranking algorithms.
Definition 6 (uniform loss stability for a k-partite ranking algorithm on an ontology graph). Let A be a k-partite ranking algorithm for ontology whose output on a preference graph 𝒯 = (𝒯1, …, 𝒯k) is denoted by f𝒯. Let l be a ranking loss function and let αa : ℕk → ℝ for 1 ≤ a ≤ k. We say that A has uniform loss stability (α1, …, αk) with respect to l if for all 1 ≤ a ≤ k, na ∈ ℕ, and ia ∈ {1, …, na}, and for all ta ∈ Va and v, v′ ∈ V belonging to different rates,
Definition 7 (uniform score stability for a k-partite ranking algorithm on an ontology graph). Let A be a k-partite ranking algorithm for ontology whose output on a preference graph 𝒯 = (𝒯1, …, 𝒯k) is denoted by f𝒯. Let μa : ℕk → ℝ for 1 ≤ a ≤ k. We say that A has uniform score stability (μ1, …, μk) if for all 1 ≤ a ≤ k, na ∈ ℕ, ia ∈ {1, …, na}, and ta ∈ Va, and for all v ∈ V,
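The defining inequalities of Definitions 6 and 7 are not reproduced above. Following the analogous notions in [8], they can be read as below, where we write 𝒯^{ia,a} for the sample obtained from 𝒯 by replacing its ia-th point of rate a with ta (the symbol is ours, since the original notation is elided):

```latex
\bigl|\ell(f_{\mathcal{T}}, v, v') - \ell(f_{\mathcal{T}^{i_a,a}}, v, v')\bigr|
  \;\le\; \alpha_a(n_1, \ldots, n_k)
  \quad\text{(uniform loss stability)},
\\[4pt]
\bigl|f_{\mathcal{T}}(v) - f_{\mathcal{T}^{i_a,a}}(v)\bigr|
  \;\le\; \mu_a(n_1, \ldots, n_k)
  \quad\text{(uniform score stability)}.
```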
The main tool used here is McDiarmid's inequality, which bounds the deviation from its mean of any function of independent random variables that changes only slightly when a single variable is altered.
Theorem 8 (see [16]). Let X1, …, XN be independent random variables, each taking values in a set A. Let ϕ : A^N → ℝ be such that for each i ∈ {1, …, N}, there exists a constant ci > 0 such that
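The displayed condition and conclusion of Theorem 8 are elided above; in its standard form, McDiarmid's inequality reads:

```latex
\text{If}\quad
\sup_{x_1,\ldots,x_N,\,x_i'}
  \bigl|\phi(x_1,\ldots,x_i,\ldots,x_N) - \phi(x_1,\ldots,x_i',\ldots,x_N)\bigr|
  \;\le\; c_i \quad\text{for each } i,
\\[4pt]
\text{then for every } \varepsilon > 0,\quad
\Pr\Bigl(\phi(X_1,\ldots,X_N) - \mathbb{E}\bigl[\phi(X_1,\ldots,X_N)\bigr] \ge \varepsilon\Bigr)
\;\le\; \exp\!\Bigl(-\frac{2\varepsilon^{2}}{\sum_{i=1}^{N} c_i^{2}}\Bigr).
```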
In what follows, denotes a training sample obtained by replacing in 𝒯 by tk for , and ia ∈ {1, …, na}. Also, αa(n1, …, nk) and μa(n1, …, nk) are simply denoted by αa and μa, respectively. We only consider sample replacements within the same rate: for some ontology graphs, the graph structure is fixed, and hence the numbers of vertices and edges in each branch are fixed.
4. Generalization Bounds for Stable k-Partite Ranking Algorithms on Ontology Graph
From this section on, our stability analysis of k-partite ranking algorithms is stated on an ontology graph, and our presentation follows [8]. In this section, we derive generalization bounds for ranking algorithms that exhibit good stability properties. Our techniques are based on those of [17]. We start with the following technical lemma.
Lemma 9. Let A be a symmetric k-partite ranking algorithm for ontology whose output on a preference graph 𝒯 = (𝒯1, …, 𝒯k) is denoted by f𝒯, and let l be a ranking loss function. Then, for all , ia ∈ {1, …, na}, and ta ∈ Va, tb ∈ Vb, one has
Proof. We have
We are now ready to give our main result of this section, which bounds the expected l-error of a ranking function learned by a k-partite ranking algorithm with good uniform loss stability in terms of its empirical l-error on the training sample. The proof follows [18].
Theorem 10. Let A be a symmetric k-partite ranking algorithm for ontology whose output on a preference graph 𝒯 = (𝒯1, …, 𝒯k) is denoted by f𝒯, and let l be a ranking loss function such that 0 ≤ l(f, v, v′) ≤ B for all f : V → ℝ and v, v′ ∈ V. Let αa : ℕk → ℝ for 1 ≤ a ≤ k be such that A has uniform loss stability (α1, …, αk) with respect to l. Let αmax = max {α1, …, αk}. Then, for any 0 < δ < 1, with confidence at least 1 − δ, one has
Proof. Let ϕ : V^M → ℝ be defined by
Thus, for any ɛ > 0,
For any γ > 0, and any k-partite ranking algorithm with good uniform loss stability with respect to lγ, Theorem 10 can be applied to bound the expected ranking error of a learned ranking function in terms of its empirical lγ-error on the training sample. The following lemma shows that, for every γ > 0, a ranking algorithm with good uniform score stability also has good uniform loss stability with respect to lγ. Using the techniques of Lemma 2 in [17], and taking τ(vi, vj) in Example 3 as 1, the following lemma can be obtained immediately.
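Before stating it, we record the γ ranking loss lγ, whose display is not reproduced above; the clipped form used in [8], which is our reading here, is

```latex
\ell_{\gamma}(f, v, v') \;=\;
\begin{cases}
1, & f(v) - f(v') \le 0,\\[2pt]
1 - \dfrac{f(v) - f(v')}{\gamma}, & 0 < f(v) - f(v') < \gamma,\\[2pt]
0, & f(v) - f(v') \ge \gamma,
\end{cases}
```

where v is the vertex of the higher rate. Under this reading, lγ upper bounds the 0-1 ranking error and is (1/γ)-admissible.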
Lemma 11. Let A be a k-partite ranking algorithm for ontology whose output on a preference graph 𝒯 = (𝒯1, …, 𝒯k) is denoted by f𝒯. Let μa : ℕk → ℝ for 1 ≤ a ≤ k be such that A has uniform score stability (μ1, …, μk). Then, for every γ > 0, A has uniform loss stability with respect to the γ ranking loss lγ, where for all n1, …, nk ∈ ℕ,
Combining Theorem 10 and Lemma 11, we get the following result which bounds the expected ranking error of a learned ranking function in terms of its empirical lγ-error for any ranking algorithm with good uniform score stability.
Theorem 12. Let A be a k-partite ranking algorithm for ontology whose output on a preference graph 𝒯 = (𝒯1, …, 𝒯k) is denoted by f𝒯. Let μa : ℕk → ℝ for 1 ≤ a ≤ k be such that A has uniform score stability (μ1, …, μk), and let γ > 0. Denote μmax = max {μ1, …, μk}. If l is a ranking loss satisfying 0 ≤ l(f, v, v′) ≤ B for all f : V → ℝ and v, v′ ∈ V, then for any 0 < δ < 1, with probability at least 1 − δ,
5. Stable Ranking Algorithms
In this section, we demonstrate the stability of some ranking algorithms in which a ranking function is selected by minimizing a regularized objective function. A general result for regularization-based k-partite ranking algorithms is derived in Section 5.1. In Section 5.2, this result is used to establish the stability of kernel-based k-partite ranking algorithms that perform regularization in a reproducing kernel Hilbert space. These stability results are then used to obtain a consistency theorem for kernel-based k-partite ranking algorithms in Section 5.3.
5.1. General Regularizers
Lemma 13. Let l be a ranking loss such that l(f, v, v′) is convex in f. Let ℱ be a convex set of real-valued functions on V, and let σ > 0 be such that l is σ-admissible with respect to ℱ. Let λ > 0, and let N : ℱ → ℝ+ ∪ {0} be a functional defined on ℱ such that, for a preference graph 𝒯 = (𝒯1, …, 𝒯k), the regularized empirical l-error has a minimum (not necessarily unique) in ℱ. Let A be a k-partite ranking algorithm for ontology defined by (29). Let ta ∈ Va, , and ia ∈ {1, …, na}. For brevity, denote
Proof. Recall that a convex function g satisfies
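The elided display is the standard convexity inequality; in the parametrized form used later (the proof of Theorem 14 applies it with t = 1/2), it reads

```latex
g\bigl(x + t\,(y - x)\bigr) - g(x) \;\le\; t\,\bigl(g(y) - g(x)\bigr),
\qquad 0 \le t \le 1,
```

equivalently, g((1 − t)x + ty) ≤ (1 − t)g(x) + tg(y).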
As we will see below, the above result can be used to establish stability of some regularization-based ranking algorithms.
5.2. Regularization in Reproducing Kernel Hilbert Spaces
Theorem 14. Let ℱ be an RKHS with kernel K such that for all v ∈ V, K(v, v) ≤ κ2 < ∞. Let l be a ranking loss such that l(f, v, v′) is convex in f and l is σ-admissible with respect to ℱ. Let λ > 0, and let N be given by (42). Let A be the k-partite ranking algorithm for ontology that, given a preference graph 𝒯, outputs a ranking function f𝒯 ∈ ℱ defined by (29). Then, A has uniform score stability (μ1, …, μk) with
Proof. Let va ∈ Va and ia ∈ {1, …, na}.
Applying Lemma 13 with t = 1/2, we get (using the notation in the proof of Lemma 13) that
Theorems 12 and 14 give the following generalization bound for kernel-based ranking algorithms.
Corollary 15. Under the conditions of Theorem 14, one has that for any 0 < δ < 1, with probability at least 1 − δ over the draw of 𝒯, the expected ranking error of the ranking function f𝒯 learned by the regularized algorithm associated with the l1 ranking loss is bounded by
The result of Corollary 15 shows that a larger regularization parameter λ leads to better stability and, therefore, a tighter confidence interval in the resulting generalization bound.
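As an illustration of the kernel-based schemes analyzed in this section, the sketch below minimizes a regularized empirical hinge ranking error over functions of the form f(·) = Σm cm K(tm, ·) by full-batch subgradient descent. It is a minimal toy sketch under our own assumptions (a Gaussian kernel, synthetic one-dimensional inputs, a fixed step size); it is not the algorithm defined by (29), whose display is not reproduced here.

```python
# Minimal sketch of a kernel-based k-partite ranking scheme:
#   minimize (1/|P|) * sum_{(i,j) in P} max(0, 1 - (f(t_i) - f(t_j))) + lam * ||f||_K^2
# over f(.) = sum_m c_m K(t_m, .), where P collects pairs (higher rate, lower rate).
import numpy as np

def gaussian_kernel(X, Y, width=1.0):
    # Squared distances for one-dimensional inputs (an assumption of this toy example).
    d2 = (X[:, None] - Y[None, :]) ** 2
    return np.exp(-d2 / (2.0 * width ** 2))

def train_kpartite_rank(X, rates, lam=0.1, lr=0.05, epochs=500):
    n = len(X)
    K = gaussian_kernel(X, X)
    pairs = [(i, j) for i in range(n) for j in range(n) if rates[i] > rates[j]]
    c = np.zeros(n)                               # kernel expansion coefficients
    for _ in range(epochs):
        f = K @ c                                 # current scores f(t_1), ..., f(t_n)
        grad = 2.0 * lam * (K @ c)                # gradient of lam * c^T K c = lam * ||f||_K^2
        for (i, j) in pairs:                      # subgradient of the hinge ranking term
            if 1.0 - (f[i] - f[j]) > 0.0:
                grad += (K[j] - K[i]) / len(pairs)
        c -= lr * grad                            # full-batch subgradient step
    return c, K

# Synthetic preference graph with rates 1 < 2 < 3 (hypothetical data).
X = np.array([0.1, 0.3, 1.1, 1.4, 2.2, 2.5])
rates = np.array([1, 1, 2, 2, 3, 3])
c, K = train_kpartite_rank(X, rates)
print(np.round(K @ c, 2))   # learned scores; higher-rate vertices should score higher
```

By the representer theorem, the minimizer of such a regularized objective over an RKHS admits a finite kernel expansion over the sample points, so the optimization can be carried out over the coefficient vector c; larger values of lam shrink the scores and, as Corollary 15 indicates, tighten the stability terms in the bound.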
Under the conditions of the above results, a kernel-based ranking algorithm minimizing a regularized empirical l-error also has good uniform loss stability with respect to l; this follows from the following simple lemma.
Lemma 16. Let ℱ be a class of real-valued functions on V, and let A be a k-partite ranking algorithm for ontology that, given a preference graph 𝒯, outputs a ranking function f𝒯 ∈ ℱ. If A has uniform score stability (μ1, …, μk) and l is a ranking loss that is σ-admissible with respect to ℱ, then A has uniform loss stability (α1, …, αk) with respect to l, where for all n1, …, nk ∈ ℕ,
The proof of this result follows that of Lemma 13 in [8]. Using Theorem 14 and Lemma 16, we immediately obtain the following corollary.
Corollary 17. Under the conditions of Theorem 14, A has uniform loss stability (α1, …, αk) with respect to l, where for all na ∈ ℕ,
5.3. Consistency
Lemma 18. Let f : V → ℝ be a fixed ranking function, and let l be a bounded ranking loss function such that 0 ≤ l(f, v, v′) ≤ B for all v, v′ ∈ V. Then, for any 0 < δ < 1, with probability at least 1 − δ,
Proof. Define ϕ as
We are now in a position to prove our main result (Theorem 5).
Proof of Theorem 5. We use Corollary 17 and apply Theorem 10 with δ/2 to get that, with probability at least 1 − δ/2,
6. Conclusion
The main focus of this paper is the stability and generalization properties of the k-partite ranking algorithm used for ontology computation. The algorithm reflects a simple intuition: the vertices of an ontology graph are mapped to points on the real line. In our setting, the vertices of the ontology graph do not carry real-valued labels; instead, the sample is given as a preference graph (pairs of vertices with different ranking rates), which suits the ontology application. We have derived generalization bounds for k-partite ranking algorithms in this setting using the notion of algorithmic stability, and we have shown that k-partite ranking algorithms with good stability properties have good generalization properties. Our results are applied to obtain generalization bounds for kernel-based k-partite ranking algorithms that perform regularization in a reproducing kernel Hilbert space.
Acknowledgments
The authors thank the reviewers for their constructive comments and detailed suggestions for improving the quality of this paper. This work was supported in part by the Key Laboratory of Educational Informatization for Nationalities, Ministry of Education, the National Natural Science Foundation of China (60903131) and Key Science and Technology Research Project of Education Ministry (210210).