Volume 2009, Issue 1 921574
Research Article
Open Access

Multiple Hypotheses LAO Testing for Many Independent Objects

Evgueni Haroutunian

Corresponding Author

Evgueni Haroutunian

Institute for Informatics and Automation Problems, Armenian National Academy of Sciences, 1 P. Sevak street, 0014 Yerevan, Armenia sci.am

Search for more papers by this author
Parandzem Hakobyan

Parandzem Hakobyan

Institute for Informatics and Automation Problems, Armenian National Academy of Sciences, 1 P. Sevak street, 0014 Yerevan, Armenia sci.am

Search for more papers by this author
First published: 17 September 2009

Abstract

The procedure of many hypotheses logarithmically asymptotically optimal (LAO) testing for a model consisting of three or more independent objects is analyzed. It is supposed that M probability distributions are known and each object follows one of them independently of others. The matrix of asymptotic interdependencies (reliability-reliability functions) of all possible pairs of the error probability exponents (reliabilities) in optimal testing for this model is studied. This problem was introduced (and solved for the case of two objects and two given probability distributions) by Ahlswede and Haroutunian; it is a generalization of two hypotheses LAO testing problem for one object investigated by Hoeffding, Csiszár and Longo, Tusnády, Longo and Sgarro, Birgé, and others.

1. Introduction

In [13] Ahlswede and Haroutunian formulated an ensemble of new problems on multiple hypotheses testing for many objects and on identification of hypotheses. These problems are extensions of those investigated in the books mentioned in [4, 5]. Problems of distribution identification and distributions ranking for one object were solved in [2]. Also the problem of hypotheses testing for the model consisting of two independent or two strictly dependent objects (when they cannot admit the same distribution) with two possible hypothetical distributions was investigated in [2]. In this paper we study the specific characteristics of the model consisting of K(≥3) objects which independently of others follow one of given M(≥2) probability distributions. The study concerns certain number K of similar objects (cities, institutions, schools, hospitals, factories, etc.), or one object in a series of K different periods of time. The problem is a generalization of two hypotheses testing investigated in [610] and of testing of many hypotheses concerning one object solved in [11]. The case of two independent objects with three hypotheses was examined in advanced edition [12], in a local publication of small circulation.

Investigation of testing of the distributions of many uniform objects is an interesting not yet fulfilled task. It is natural to begin this study with the simplest case of statistically independent objects.

Let 𝒫(𝒳) be the space of all probability distributions (PDs) on finite set 𝒳. There are given M distinct PDs Gm𝒫(𝒳), , which are known as possible distributions of the objects.

Let us recall main definitions from [11] for the case of one object. The random variable (RV) X, which is a characteristic of the studied object, takes values on 𝒳 and follows unknown PD G which is one of M given PDs Gm, . The statistician have to accept one of M hypotheses , on the base of a sequence of results x = (x1, …, xn, …, xN), xn𝒳, of N independent observations of the object. The procedure of the decision making is a nonrandomized test φN, which can be defined by division of the sample space 𝒳N into M disjoint subsets , . The set contains all vectors x for which the hypothesis Hl is adopted. The probability αlm(φN) of the erroneous acceptance of hypothesis Hl provided that Hm is true is equal to , lm, where We define the probability to reject Hm, when it is true, as
()
The exponential decrease of the error probabilities as N is studied. The error probability exponents which is pertinent to call reliabilities, of the sequence of tests φ, are defined as follows:
()
From (1) and (2) we see that
()
The matrix
()
we call it the reliability matrix of the sequence φ of tests. It was studied in [11]. The question is values of which number of elements of E(φ) can be given in advance and which optimal values can be guarantied by the best test for the others.

The sequence of tests φ* is called logarithmically asymptotically optimal (LAO) if for given positive values of first M − 1 diagonal elements of the matrix E(φ*) maximum possible values are provided to all other elements of it. The concept of LAO test was introduced by Birgé [10] and also elaborated in [11, 12].

Let us now consider the model with three objects. Let X1, X2, and X3 be independent RVs taking values in the same finite set 𝒳 with one of M PDs, this RVs are the characteristics of the corresponding independent objects. The random vector (X1, X2, X3) assumes values (x1, x2, x3) ∈ 𝒳3.

Let , , be a sequence of results of N independent observations of the vector (X1, X2, X3). The test have to determine unknown PDs of the objects on the base of observed data. The selection for each object should be made from the same set of hypotheses: Hm : G = Gm, . We call this procedure the compound test for three objects and denote it by ΦN, it can be composed of three individual tests , , for each of three objects. We denote the infinite sequence of compound tests by Φ. When we have K independent objects the test Φ is composed of tests φ1, φ2, …, φK.

Let be the probability of the erroneous acceptance of the hypotheses triple by the test ΦN provided that the triple of hypotheses is true, where (m1, m2, m3)≠(l1, l2, l3), , . The probability to reject a true triple of hypotheses by analogy with (1) is the following:
()
We study corresponding reliabilities of the sequence of tests Φ
()
Definitions (5) and (6) imply (cf. (3)) that
()

We call the test sequence Φ* LAO for the model with K objects if for given positive values of certain part of elements of the reliability matrix E*) the procedure provides maximal values for all other elements of it.

Our aim is to analyze the reliability matrix of LAO tests for three objects.

We consider the problem for three objects for brevity; the generalization of the problem for K independent objects will be discussed hereafter along the text and in Section 4, but before that we recall the results for one object. The generalization of the problem for cases when RVs Xi take values in different sets 𝒳i and have hypothetical PDs , , , will be only more complicated in notations.

2. LAO Testing of Many Hypotheses for One Object

We define the divergence (Kullback-Leibler distance) D(Q| |G) for PDs Q and G from 𝒫(𝒳) as usual:
()
For given positive diagonal elements E1∣1, E2∣2, …, EM−1∣M−1 of the reliability matrix we consider sets of PDs
()
and the following values for elements of the future reliability matrix of the LAO tests sequence:
()
()
()
()

We recall the theorem concerning one object.

Theorem 1 ([11]). If the distributions Gm are different, that is, all divergences D(Gl| |Gm), lm, , are strictly positive, then two statements hold:

(a)when the given numbers E1∣1, E2∣2, …, EM−1∣M−1 satisfy the conditions

()
()
then there exists an LAO sequence of tests φ*, the elements of the reliability matrix of which are defined in (10)–(13) and all of them are strictly positive;

(b) even if one of the conditions (14) or (15) is violated, then the reliability matrix of any such test includes at least one element equal to zero.

Corollary 1. The diagonal elements of the reliability matrix of the LAO test in each row are equal only to the element in the same row and in the last column:

()

That is, the elements of the last column are equal to the diagonal elements of the same row and due to (3) are minimal in this row. Consequently first M − 1 elements of the last column also can be considered as given parameters for construction of the LAO test.

Proof. For m = M (16) is the sequence of (3). From the conditions (14) and (15) we see that , , , hence can be equal only to one , for . Assume that (16) is not true, that is, , for one l ∈ [m + 1, M − 1].

Applying Kuhn-Tucker theorem for (11) we can derive (the proof is not difficult, but long, so we avoid the exposition) that the elements , may be determined by elements , ml, , with the following inverse functions:

()
Then it follows from (11) and our provisional supposition that
()
but one can see from conditions (14) and (15) that for . Our assumption is not correct, hence (16) is valid and equality (3) implies .

3. LAO Testing of Hypotheses for Three Independent Objects

Now let us consider the model of three independent objects and M hypotheses. It was noted that the compound test ΦN may be composed from separate tests , , . Let us denote by E(φi) the reliability matrices of the sequences of tests φi, , for each of the objects. The following lemma is a generalization of lemmas from [2, 12].

Lemma 1. If the elements Elm(φi), , are strictly positive, then the following equalities hold for Φ = (φ1, φ2, φ3):

()
()
()

Equalities (19) are valid also if for several pairs (mi, li) and several i’s.

Proof. It follows from the independence of the objects that

()

Remark that here we consider also the probabilities of right (not erroneous) decisions. Because Elm(φi) are strictly positive then the error probability tends to zero, when N. According to this fact we have

()
From definitions (5) and (6), equalities (22), and applying (23), we obtain relations (19)–(21).

Now we will show how we can erect the LAO test from the set of compound tests when 3(M − 1) strictly positive elements of the last column of the reliability matrix EM,M,Mm,M,M, EM,M,MM,m,M and EM,M,MM,M,m, , are preliminarily given.

The following subset of tests:

()
is distinguished by the property that when Φ ∈ 𝒟 the elements EM,M,Mm,M,M(Φ), EM,M,MM,m,M(Φ), and EM,M,MM,M,m(Φ), , of the reliability matrix are strictly positive.

Really, because Emm(φi) > 0, then in view of (3) EMm(φi) are also strictly positive. From equalities (23) keeping in mind (6), (16), and (22) we obtain that the noted elements are strictly positive for Φ ∈ 𝒟 and

()
Define the following family of decision sets of PDs for given positive elements EM,M,Mm,M,M, EM,M,MM,m,M, and EM,M,MM,M,m, :
()
Define also the values of the reliability matrix of the LAO test for three objects:
()
()
()
()

The following theorem is the main result of the present paper. It is a generalization and improvement of the corresponding theorem proved in [2] for the cases K=2, M=2.

Theorem 2. If all distributions Gm, , are different, (and equivalently D(Gl| |Gm) > 0, lm, ), then the following statements are valid:

(a) when given strictly positive elements EM,M,Mm,M,M, EM,M,MM,m,M, and EM,M,MM,M,m, , meet the following conditions:

()
()
()
()
then there exists an LAO test sequence Φ*𝒟, the reliability matrix of which is defined in (27)–(30) and all elements of it are positive,

(b) when even one of the inequalities (31)–(34) is violated, then there exists at least one element of the matrix E*) equal to 0.

Proof. The test Φ* = (φ1,*, φ2,*, φ3,*), where φi,*, are LAO tests of objects Xi, belongs to the set 𝒟. Our aim is to prove that such Φ* is a compound LAO test. Conditions (31)–(34) imply that inequalities analogous to (14) and (15) hold simultaneously for the tests for three separate objects.

Let the test Φ ∈ 𝒟 be such that

()

Taking into account (25) and (28) we can see that conditions (31)–(34) may be replaced by the following inequalities:

()

According to Corollary 1 in case of LAO test φi,*, , we obtain that (36) meets conditions (14)-(15) of Theorem 1. For each test Φ ∈ 𝒟, Emm(φi) > 0, , hence it follows from (3) that Eml(φi) are also strictly positive. Thus for a test Φ ∈ 𝒟 conditions of Lemma 1 are fulfilled and the elements of the reliability matrix E(Φ) coincide with elements of matrix E(φi), , or sums of them (see (19)–(21)). Then from definition of LAO test it follows that Elm(φi) ≤ Elm(φi,*), then . Consequently Φ* is an LAO test and verify (27)–(30).

(b) When even one of the inequalities (31)–(34) is violated, then at least one of inequalities (36) is violated. Then from Theorem 1 one of elements Eml(φi,*) is equal to zero. Suppose E3∣2(φ1,*) = 0, then the elements E3,m,l∣2,m,l*) = E3∣2(φ1,*) = 0.

Theorem 2 is proved.

4. On the Case of K(>  3) Objects

When we consider the model with K independent objects the generalization of Lemma 1 will take the following form.

Lemma 2. If elements , , are strictly positive, then the following equalities hold for Φ = (φ1, φ2, …, φK):

()

For given K(M − 1) strictly positive elements EM,M,…,Mm,M,…,M, EM,M,…,MM,m,…,M, …, EM,…,M,M,∣M,M…,m, , for K independent objects we can find the LAO test Φ* in a way similar to case of three independent objects. So the problem of many hypotheses testing for the model with K independent objects can be solved by corresponding sets , as in (27)–(30) and conditions analogous to (31)–(34).

5. Example

Some illustrations of exposed results are in examples concerning two objects. The set 𝒳 = {0,1} contains two elements and the following PDs are given on 𝒳: G1 = {0,10; 0,90}, G2 = {0,85; 0,15}, G3 = {0,23; 0,77}. As it follows from relations (28)–(30) of Lemma 2, several elements of the reliability matrix are functions of one of given elements, there are also elements which are functions of two or three given elements. For example, for a case of two objects in Figures 1 and 2 the results of calculations of functions E1,2∣2,1(E3,3∣1,3, E3,3∣3,2) and E1,2∣2,2(E3,3∣1,3) are presented. For these distributions we have min (D(G2| |G1), D(G3| |G1)) ≈ 2,2 and min (E2,2∣2,1, D(G3| |G2)) ≈ 1,4. We see that when the inequalities (32) or (33) are violated, then E1,2∣2,1 = 0 and E1,2∣2,2 = 0.

Description unavailable
Description unavailable

6. Conclusion

We exposed a solution of multiple hypothesis LAO testing problem for many objects. The first idea may be to study matrix E(Φ) by renumbering K-vectors of PDs from 1 to MK as PDs of one complex object. We can give MK − 1 diagonal elements of such matrix E(Φ) and apply Theorem 1 concerning one object. In this case the number of the preliminarily given elements of the matrix E(Φ) would be greater (because MK − 1 > K(M − 1), M ≥ 2, K ≥ 2), and the procedure of calculations would be longer than in our algorithm presented in Section 3.

Proposed approach to the problem gives also the possibility to define the LAO tests for each of the separate objects. It must be noted that the approach with renumbering of the triples of hypotheses does not have this opportunity.

In applications one of two approaches may be used in conformity with preferences of the investigator.

    The full text of this article hosted at iucr.org is unavailable due to technical difficulties.