IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 6, SEPTEMBER 2001 2505

On Minimal α-Mean Error Parameter Transmission Over a Poisson Channel

Marat V. Burnashev and Yury A. Kutoyants

Abstract—We consider the problem of one-dimensional parameter transmission over a Poisson channel when the input signal (intensity) obeys a peak energy constraint. We show that it is possible to choose input signals and an estimator in such a way that the mean-square error of parameter transmission will decrease exponentially with transmission time and we find the best possible exponent. For more general loss functions of the type we find the best possible exponent if α ≥ α₀ = (1 + √5)/2 ≈ 1.618. If 0 < α < α₀ then some lower and upper bounds for the best possible exponent are established.

Index Terms—α-mean risk, parameter transmission, Poisson channel.

I. INTRODUCTION

WE consider the problem of one-dimensional parameter transmission over a Poisson (direct-detection photon) channel [15], [8], [9]. Recall that if the channel input is a nonnegative waveform (intensity) , , then the channel output , , is a random process with independent increments such that and for any we have

where
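The displayed formulas specifying the channel law did not survive this transcript. As a hedged reconstruction, using notation introduced here (λ for the input intensity, X for the counting output, T for the transmission time; these symbols are our own and may differ from the authors'), the standard direct-detection Poisson channel model being described is:

    % Hedged reconstruction of the standard direct-detection Poisson channel law;
    % the symbols \lambda, X, \Lambda, T are introduced here, not taken from the paper.
    \[
      X(0) = 0, \qquad
      \Pr\{X(t) - X(s) = k\}
        = \frac{\Lambda(s,t)^{k}}{k!}\, e^{-\Lambda(s,t)},
      \qquad 0 \le s < t \le T,\; k = 0, 1, 2, \dots,
    \]
    \[
      \text{where } \Lambda(s,t) = \int_{s}^{t} \lambda(u)\, du .
    \]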

Assume now that we must transmit some parameter over the Poisson channel. For transmission we may use any input signal (intensity) satisfying only the peak energy constraint

for any (1)

where is some given constant.

If we use a set of signals satisfying constraint (1) and an estimate , , then we shall measure the quality of the system by the α-mean error

Manuscript received August 1, 1999; revised March 12, 2001. This work was supported in part under Grant N 98-01-04108 from the Russian Fund for Fundamental Research.

M. V. Burnashev is with the Institute for Problems of Information Transmission, Russian Academy of Sciences, 19 Bolshoi Karetni, 101447 Moscow, Russia (e-mail: [email protected]).

Y. A. Kutoyants is with the Laboratoire de Statistique et Processus, Université du Maine, 72085 Le Mans Cedex, France (e-mail: [email protected]).

Communicated by J. A. O'Sullivan, Associate Editor for Detection and Estimation.

Publisher Item Identifier S 0018-9448(01)07020-1.

Let us suppose now that we may choose any set of signals satisfying (1), and may use any estimate . What is the minimum possible α-mean error in that case? More formally, let us introduce the function

where is taken over all signals satisfying (1), and is a subset of . Then natural problems are how to evaluate the function , and what is a good choice of the signals and the estimate .

We are interested in the asymptotic behavior of the function when . Since it decreases exponentially when , we introduce also the function , giving the best possible exponent for the α-mean error

(2)

Such kinds of problems arise in a situation when we want “to inform” our partner about the analog value with the minimal possible α-mean error. Of course, most interesting is the case that corresponds to the mean-square error. Traditional engineering motivations for this problem come from communication theory where we should transmit the value over the channel. Detailed descriptions of this problem (with closely related notions of modulation and threshold effect) can be found in [17], [10]. It should be emphasized that in the problem considered we are not allowed to use block coding (i.e., collect sequential values of in a block of some rather long length ). Otherwise, rate-distortion theory [12] would give an asymptotic (when ) solution.

In the case of mean-square error (i.e., for ) and a white Gaussian noise channel, investigations on that and closely related problems were started by Shannon in [12] and continued later in [14], [18], [19], [11], [7], [1]. It is natural not to confine ourselves to the mean-square error, but to consider more general loss functions of the form .

It is rather simple to construct a transmission method from which we get the lower bound for function , for example

(for the white Gaussian noise channel see [14], [7], [1], [2]; for the Poisson channel see Section II below).


It is difficult to get a good upper bound for . In the case of the white Gaussian noise channel it took a good number of research years. Recall its development for that channel and . Notice that from rate-distortion theory we can get [17], [10]. A much better upper bound was obtained by Ziv and Zakai in [19], which is still rather far (at least, from a theoretical viewpoint) from the lower bound . That upper bound was sequentially improved in [7], [1], [3], until in [4] the final result was established. Moreover, in [4] the exact formula , , was found.

The purpose of this work is to study the same problem for the case of a Poisson channel, which has recently become a very popular research subject due to numerous optical and other applications [15], [16]. It turns out that for such channels it is also possible to find the exact form of the function (and even for a larger range of the parameter than for the white Gaussian noise channel). The following theorem presents the main result of the paper.

Theorem 1: If α ≥ α₀ then

(3)

In other words, if α ≥ α₀ then

where is taken over all signals satisfying constraint (1).

Clearly, determines the best exponential rate for the mean-square error. Moreover, for , the function turns out to be the same for both Poisson and white Gaussian noise channels [4].

Remarks:

1) The function is very similar to the reliability function of a Poisson channel [15], [16], [5]. Using function , we will get the lower bound for function . On the other hand, knowing function we can get the exact upper bound for function (see inequality (5) below), which is the most difficult part in finding the function .

2) For , a lower bound for function is contained in Proposition 1, while some upper bounds can be derived from Section III below.

3) For a rather long time, the white Gaussian noise channel was an exceptional channel for which the exact form of the reliability function and some other optimal results were known. Recently, a number of similar optimal results have been obtained also for the Poisson channel [15], [16], [5]. In that respect, this paper also extends the optimal result of [4] to the Poisson channel. In other words, now any optimal result known for the white Gaussian noise channel is known also for the Poisson channel (and so both channels are exceptional). Moreover, the Poisson channel turns out to be simpler for investigation than the Gaussian one.

In Section II, we derive the lower bound for function . Then we switch to getting upper bounds for function . First, in Section III, we obtain the Ziv–Zakai upper bound for (although it is rather simple, it is useful to have an idea of what can be obtained using such simple arguments). Then, in Section IV, we derive an exact upper bound for when . In Section V, we extend that exact upper bound to the case . The Appendix contains some purely analytical proofs.

Finally, the authors repeat the words of C. E. Shannon from [13], devoted to an analogous problem: “It might be said that the algebra involved is in several places unusually tedious.”

II. LOWER BOUND FOR

The method we use to get the lower bound for is quite natural and has already been used in [14], [18], and [1].

Proposition 1: The following lower bounds hold:

(4)

where is the unique root of the equation.

Proof: We divide the parameter set into (it will be chosen later) intervals ; , of equal length . For each interval , we put in correspondence some signal ; . In other words, if the true parameter value belongs to the interval then the signal is transmitted. Based on observation , we find the most likely signal and choose as an estimate the middle point of the corresponding interval . Then for the α-mean risk we have

(5)

where is the (maximal) error probability when testing signals ; .

It is known [15], [5] that we can choose signals ; such that for the error probability we will have

(6)

The upper bound (6) follows from a simple fact. Choose on randomly two measurable subsets and of Lebesgue measure and consider signals , . Then the error probability, when testing those two signals, averaged over all random choices of and , does not exceed . Therefore, if we randomly choose such signals, then for the averaged error probability we get the upper bound (6).
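To make the scheme of this proof concrete, here is a minimal Monte Carlo sketch of the quantize-and-test strategy described above: the parameter set (taken to be [0, 1] for illustration) is split into M equal intervals, each interval is assigned a random on-off signal of intensity A on half of the observation window, the receiver picks the most likely signal from the Poisson counts and outputs the midpoint of the corresponding interval. All concrete values (support measure T/2, the grid discretization, α = 2) are our own illustrative assumptions, not quantities taken from the paper.

    # Hedged sketch of the transmission scheme in the proof of Proposition 1.
    import numpy as np

    rng = np.random.default_rng(0)

    def run_scheme(A=1.0, T=20.0, M=8, alpha=2.0, n_cells=400, n_trials=2000):
        dt = T / n_cells
        # Random on-off signals: each support D_i is a random half of the time grid.
        supports = np.zeros((M, n_cells), dtype=bool)
        for i in range(M):
            supports[i, rng.choice(n_cells, n_cells // 2, replace=False)] = True

        errors = np.empty(n_trials)
        for t in range(n_trials):
            theta = rng.uniform()                     # parameter to transmit, in [0, 1]
            j = min(int(theta * M), M - 1)            # interval (hypothesis) containing theta
            # Channel output: independent Poisson counts per grid cell, intensity A on D_j.
            counts = rng.poisson(A * dt * supports[j])
            # ML decision: all supports have equal measure, so the most likely hypotheses
            # are exactly those whose support contains every observed photon; equivalently,
            # maximize the number of photons inside D_i, breaking ties at random.
            scores = (supports * counts).sum(axis=1)
            best = np.flatnonzero(scores == scores.max())
            theta_hat = (rng.choice(best) + 0.5) / M  # midpoint of the decoded interval
            errors[t] = abs(theta_hat - theta) ** alpha
        return errors.mean()

    # Larger M shrinks the quantization term but increases the testing-error term:
    # this is the trade-off balanced in (5)-(6) by the choice of M.
    for M in (4, 16, 64):
        print(M, run_scheme(M=M))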

To minimize the right side of (5), we choose such that, i.e., we set


Then from (5) and (6) we get

from which the lower bound (4) for follows.

If with then for a result better than (6) is known [5], [15], given by

(7)

where the parameter is the unique root of the equation

(8)

For a given it is best to choose as the root of the equation , from which the lower bound (4) for follows.

Remark: For small the right-hand side of (4) has the form .

III. ZIV–ZAKAI UPPER BOUND FOR

Getting a good upper bound for (or a lower bound for the α-mean error) is the most difficult part in studying this type of problem. To explain the main difficulty let us take a look at inequality (5) which was used to get the upper bound for the α-mean error. In order to get a lower bound for the α-mean error it would be good to have some “inversion” of that inequality. Notice that in the case of error in testing of hypotheses we upper-bounded the value of error in the right side of (5) by the maximal possible value that gave us the term . But is it possible to find another system of signals and an estimate such that the testing error probability will still be as small as possible, and at the same time, the value of the error will be much smaller than ? A geometrical interpretation of that problem (in terms of a signal curve in a functional space and threshold effect) can be found in [17], [10]. This question makes our problem different from the traditional testing of hypotheses problems (where we do not have a loss function). The answer to this question represents the main difficulty in the problem considered. Essentially, we show below that such a system of signals does not exist.

The first step in getting a lower bound for the α-mean error is almost always the same. We choose on the parameter set the finite subset of equally spaced points ; where the number will be chosen later. Then we have

(9)

Clearly, if we choose rather large then in transition (9) we practically do not lose any accuracy. The right-hand side of (9) means that parameter may take values only from the set . Therefore, we come to the problem of testing best chosen signals (hypotheses) ; with loss function . Signals ; should satisfy only constraint (3). Quite similar to [5, Proposition 1] (see also [16, Theorem 2.1]) it is possible to show that, without loss of accuracy, we may assume that all signals ; take on only the extreme values and .

As a result, we have input signals ; taking on only the extreme values and . Denote

(10)

and let denote the Lebesgue measure of set (all our sets of interest are measurable). Then, clearly, , . We shall call the support of signal . We shall always have when . Since we are interested only in function (i.e., only in the logarithmic asymptotics of the α-mean error), without loss of generality, we may assume that all signals ; have supports of the same Lebesgue measure with some (i.e., all signals have the same energy). In order to show this, we partition the range of all possible Lebesgue measures on equal segments of some small length . Then there exists a segment of the form such that at least signals among all signals have supports with Lebesgue measures in that segment. We can enlarge the observation time by and modify each of those signals on that additional set in such a way that all those new signals will have supports with the same Lebesgue measure . If we assume that those new signals correspond to then their α-mean error cannot be larger than that for the original signal system. Choosing we come to the problem when on time interval and there are signals of equal energy . Clearly, the minimum possible α-mean error for that new problem can be only smaller than that for the original problem, and, therefore, any upper bound for its function will remain valid for the original problem as well (in fact, function remains the same).

Therefore, it is sufficient to consider the case when we have an arbitrary number of input signals ; taking on only the extreme values and and, moreover, all of them have supports of the same Lebesgue measure with some .

First, we get the simplest upper bound for based on the “two points” principle [6] (in information theory traditionally called the Ziv–Zakai bound [19]). After that, we shall strengthen it a little bit and it will be clear what we should do in order to get better results.

We continue (9) as follows:

(11)

where is the minimum possible average error probability when testing between two equiprobable signals and .

When testing between equiprobable signals and we have “uncertainty” only if there are no photons on the set (in that case, we make a decision error with probability ). Therefore, for the sum of both error probabilities we have

no photons on

(12)
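The display (12) is lost in this transcript. For two equal-measure on-off signals with supports of measure d intersecting on measure m, the event "no photons on the symmetric difference" has probability exp(−A(d − m)) under either hypothesis, and on that event the maximum-likelihood test errs with probability 1/2, which is the argument the surrounding sentences describe. The notation d, m and all numeric values in the quick check below are our own assumptions, not the paper's:

    # Numerical check of the "no photons on the symmetric difference" argument behind (12).
    import numpy as np

    rng = np.random.default_rng(1)
    A, d, m, n_trials = 1.0, 6.0, 2.0, 200_000

    # Under hypothesis 1, photons on the private part D1\D2 (measure d - m) are
    # Poisson(A*(d - m)); the other private part D2\D1 has intensity 0, and counts on
    # the intersection have the same law under both hypotheses, hence are uninformative.
    # ML errs only when the private part is empty, and then with probability 1/2
    # (random tie-breaking).
    n_private = rng.poisson(A * (d - m), n_trials)
    p_err = ((n_private == 0) & (rng.random(n_trials) < 0.5)).mean()
    print(2 * p_err, np.exp(-A * (d - m)))  # sum of both error probabilities vs. exp(-A(d-m))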

Now we use the following elementary result from [5].

Lemma 1: Let be a collection of measurable sets on , such that ; . Then

(13)

In particular, if , then

(14)
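The statement of Lemma 1 did not survive extraction. A standard averaging bound of this type, and one that fits how (14) is used below, is: for any sets D_1, ..., D_N in [0, T] of common measure d, the average pairwise intersection is at least (N d²/T − d)/(N − 1) ≈ d²/T, because the occupancy function f(t) = #{i : t ∈ D_i} satisfies ∫f² ≥ (∫f)²/T by Cauchy–Schwarz while ∑_{i,j}|D_i ∩ D_j| = ∫f². This reconstruction of the bound is our assumption; the sketch below only checks it numerically on a grid, where independent random supports essentially attain it.

    # Numerical check of the averaging bound presumed to be stated in Lemma 1 / (14).
    import numpy as np

    rng = np.random.default_rng(2)
    T, n_cells, N, d_cells = 10.0, 1000, 50, 300
    dt = T / n_cells
    d = d_cells * dt                                    # common support measure

    D = np.zeros((N, n_cells), dtype=bool)
    for i in range(N):
        D[i, rng.choice(n_cells, d_cells, replace=False)] = True

    inter = (D.astype(float) @ D.astype(float).T) * dt  # |D_i ∩ D_j| for all pairs
    avg_pair = (inter.sum() - np.trace(inter)) / (N * (N - 1))
    print(avg_pair, (N * d * d / T - d) / (N - 1), d * d / T)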

Returning to our problem, we notice that the smallest distance in (11) is not less than and the average value of satisfies inequality (14). Then we get from (11), (12), and (14)

where we use the optimal value . Choosing we get

(15)

In the case , we can strengthen upper bound (15) in the following way. We choose in the parameter set the subset of equally spaced points ; (the value of will be chosen later) and assume that parameter may take values only from the set . Moreover, instead of (11) we can write (see details in [1])

(16)

where is the same as in (5). If then , where function , , was defined in (7) and (8). Then, for , , we get from (16), (7), and (8)

For the derivative of function we have

Therefore, the minimum of function is attained at . This result, together with (15), gives the following.

Proposition 2: The following upper bound holds:

(17)

Remark: For small , the right-hand side of (17) has the form .

It should be noted that using only the “two points” principle for simple hypotheses (as we did in deriving (15)), it is impossible to get an upper bound better than (15). This follows from the fact that we can construct on an arbitrary supports , such that we will have for any .

For that reason, in order to get an upper bound better than (15) we must consider an exponential number of hypotheses . Then we come to the main difficulty in the problem considered. The difference now may vary from (i.e., from an exponentially small value) up to a value of order and therefore we must be able to take into account that difference.

It can also be said that the reliability function of the channel defines the minimal possible error probability when testing hypotheses , but it says nothing about the value of the error . Essentially, we should show that if the error probability (when testing signals ) is very small (e.g., is determined by the reliability function ) then the value of the error will be large.

IV. EXACT UPPERBOUND FOR

Now, following [1], [3], [4], we consider an approach that will allow us to improve the upper bound (17). Although the content of this section is self-contained, it may be helpful to refer, for example, to [3] for some geometrical explanations of the approach used.

We first present an auxiliary result that will serve as our main tool.

Let be the output measure in observation space when signal is transmitted, . Since all measures are dominated by measure on , let , , be the Radon–Nikodym density of with respect to .

In parameter set , we choose an arbitrary subset of cardinality . Assume that for each pair , , we define in observation space some set (maybe empty) such that the following conditions are fulfilled:

1) for any ;
2) for any ;
3) for any

(18)

Lemma 2: For any and any collection of subsets satisfying conditions (18), the following lower bound holds:

(19)


Proof: We have

where we used the simple inequality

First we apply Lemma 2 to the case when and with ; . To every point (hypothesis) we put in correspondence some set of points in such a way that if , then . We may imagine that all points of the set are vertices and we design some edges among them. As a result, we get some undirected graph . Loosely speaking, when point is true (transmitted), then among all possible errors we will take into account only decision errors in favor of points of the set . How to design the graph will be clear later.

Denote by the total number of graph edges. We will number all edges of graph by integers from to . Later, we shall consider not only graph , consisting of points , but also the analogous graph consisting of the signals . It will be clear from the context which graph we mean.

In the observation space , for each connected pair , we introduce the set of events (paths)

1) there are no photons on ;
2) there are exactly photons on

(20)

Clearly, for any we have , i.e., if we get an observation then both hypotheses and have equal a posteriori probabilities.

For fixed , sets may intersect (and similarly, for fixed ). In order to avoid those intersections, we introduce sets of events , constructing them from sets

(21)

By construction, for fixed , sets do not intersect (and, similarly, for fixed ). Therefore, for sets , all conditions of Lemma 2 are fulfilled. Notice that

(22)
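Displays (20)–(22) are missing here. For a pair of on-off signals with supports of common measure d intersecting on measure m, the event "no photons on the symmetric difference and exactly k photons on the intersection" has, under either of the two hypotheses, probability exp(−Ad)(Am)^k/k!, which is consistent with the statement above that on such observations both hypotheses have equal a posteriori probabilities. The notation d, m, k and the numbers below are our own illustrative assumptions:

    # Monte Carlo check of the probability of one "ambiguity" event of the type (20).
    import math
    import numpy as np

    rng = np.random.default_rng(3)
    A, d, m, k, n_trials = 1.0, 6.0, 2.0, 3, 400_000

    n_private = rng.poisson(A * (d - m), n_trials)  # photons on D_i \ D_j under hypothesis i
    n_common  = rng.poisson(A * m, n_trials)        # photons on D_i ∩ D_j under hypothesis i
    p_mc = ((n_private == 0) & (n_common == k)).mean()
    p_formula = math.exp(-A * d) * (A * m) ** k / math.factorial(k)
    print(p_mc, p_formula)  # should agree up to Monte Carlo noise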

where, similarly to (10), we denote . Therefore, from (21) and (22), we get

We shall use parameter of the form .¹ Clearly, is the expected number of photons on set if signals or were transmitted.

Remark: As will be seen below, the smaller the value , the larger the number of signals we must consider. Our main interest is the case . It turns out that in that case the number corresponds to a linear part of the channel reliability function . In the case of relatively small values of , the number corresponds to a sphere-packing part of function and then it is better to use the form , with some . See details for the Gaussian case in [3], [4].

Now, using Stirling’s formula , we get

(23)
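The particular form of Stirling's formula used here was lost in extraction; the standard version is ln n! = n ln n − n + (1/2) ln(2πn) + O(1/n). A quick numerical sanity check (our illustration only):

    # Accuracy of Stirling's approximation to ln n!.
    import math
    for n in (10, 100, 1000):
        exact = math.lgamma(n + 1)                                      # ln n!
        approx = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
        print(n, exact, approx, exact - approx)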

We always choose graph edges such that

1) all values for its edges will be approximately equal;

2) sets will have many points, but each coefficient on the right side of (23) will not exceed .

¹We shall not use the integer part sign.


It is convenient to “quantize” the range of possible values of intersections (and , as well). For that purpose, we choose some relatively small (of order ) and partition the whole range of intersections into subintervals of length . Clearly, the number of such subintervals does not exceed .

Remark: In the Gaussian case, the pairwise distances , , of any three equal-energy signals , , uniquely determine the probability of their “ambiguity zone” (i.e., the set of all with ) (see [1], [3], [4]). This is based on the fact that any Gaussian process is uniquely defined via its correlation (i.e., two-argument) function. Unfortunately, the Poisson case does not have such a property. The probability of the “ambiguity zone” for three signals , , in the Poisson case is defined by the triple intersection value , and concerning it we can get only the relation

where is to a certain extent a free parameter.

For that reason, we also quantize triple intersections, introducing sets

Then, for edge with , we get from (23)

(24)

Before choosing sets (i.e., graph edges) we should make one more remark. The number of points will always be exponentially large (i.e., with some ). All values of interest will also be exponentially large or exponentially small in . For that reason, in order to avoid some bulky formulas, we pay attention below only to exponents, and by we denote some strongly positive polynomials in (not necessarily the same in different formulas).

Now we choose sets . Notice that, due to inequality (14), the average pairwise intersection value is not less than (for large ). Let be the number of pairs with . Then we have

from which we get

Therefore, there exists some level and at least pairs with . It also means that there exist at least points such that each of them has at least neighbors on level .

If there is more than one such , we choose the maximal value. Now, as edges of graph , we designate all pairs with .

We also use the trick of randomly choosing numeration numbers for edges of graph . Then, denoting by and the expectation and probability over all possible equiprobable numerations , respectively, we get from (24) for any pair

(25)

Now we lower-bound the probability . Introduce numbers

(26)

For edge these define the maximal cardinality of the sets and such that, after summation over all , the value of will not exceed . Introduce also the sets of levels

If sets and are empty then . If those sets are not empty then, taking into account that the number of different does not exceed , we get

(27)

The proof of the lower bound (27) for probability is given in the Appendix.


Then, for any pair , we get from (25) and (27)

(28)

Now we apply this estimate in the lower bound (19). Recall that there exist at least points such that each of them has at least neighbors on level . Moreover, without loss of generality, we may assume that (otherwise, the situation can be reduced to the case when signals have less than energy). Then we get

(29)

Notice that if then the lower bound (29) takes the form

(30)

Assume now that in the lower bound (29) the value . This means that there exist pairs and some level such that

or

(31)

If there is more than one such , we choose the one minimizing the values . To be specific, we assume that the first of the inequalities (31) is fulfilled (the other case can be considered similarly). Then for such a pair introduce the sets

Since the number of different levels does not exceed , there exists some such that

(i.e., the exponents of and are equal). If there is more than one such , we choose the maximal value. Notice that then the lower bound (29) takes the form

(32)

Consider now the set of points and their supports . We evaluate the average pairwise value for supports from the set . For that purpose, we make two important observations concerning the relation between parameters and the structure of set .

First, we notice that set serves as a “guarding layer” for signal against all other signals with . Intersections for those signals must be uniformly distributed on support . Then intersections for signals from set must also be uniformly distributed on support . In particular, we must have

It means that we must have and hence .

Second, we notice that the allocation of signal support with respect to all other signals (with ) must be the same as for all signals from set . It means that there exists some set containing supports of all signals from . Moreover, all those supports must be uniformly distributed on . Let be the Lebesgue measure of set . Then we have and hence .

As a result, we get that all supports from set belong to some set with Lebesgue measure . Denote by the average value of for them. Then, due to inequality (14), we have

(33)

Now we can proceed in several ways:

1) we can apply “two points” arguments for the set ;

2) since the essential part of signals have at least neighbors on level , we can consider a new graph consisting of edges with ;

3) for set , we can repeat the same arguments (i.e., construct some graph) as we did for the original set of signals .

It turns out that for our purpose it is sufficient to apply the simplest “two points” arguments for the set . Assuming that all points from that set are located maximally close to each other, we get another lower bound

(34)

Therefore, if in the lower bound (29) the factor , then both lower bounds (32) and (34) are valid. As a result, from (30), (32), and (34) we get

(35)

For the chosen we should minimize the right-hand side of (35) over and . Clearly, it is best to choose the that makes both terms under the sign on the right-hand side of (35) equal, i.e., to put

(36)

Since we should have , for such a choice of we must have

and

Now we have from (35) and (36)

and then, for , we get

(37)

We should minimize the right-hand side of (37) over . The optimal choice is . Assuming that , and so , we get

(38)

where, in the last line, we chose .

In order to investigate the right-hand side of inequality (38), consider the function

We have

Moreover, . Therefore, if then the -convex function , , attains its maximum when , and, therefore,

Since , we get from (38)

(39)

In fact, in order to be accurate, we must also check the extreme points and when looking for the maximum in (35). If , then we have from (35)

which gives the same result as (39). If then we have from (37)

which also gives the same result as (39). Therefore, we have proved the following.

Proposition 3: The following upper bound holds:

(40)

V. UPPER BOUND FOR

Now, using the second approach mentioned above, we extend the lower bound (40) down to some . We omit some technical details that are quite similar to those of the previous section. For that purpose, we consider all pairs with and corresponding sets , . If for the essential part (i.e., for ) of pairs we have

for some then . Moreover, for any such pair , the supports of all those signals from are uniformly distributed on some set . If is the Lebesgue measure of set then we have .

Consider now a new graph containing edges with . Introduce the sets

Denoting , similarly to (29), we have

(41)

If , then there exists some such that

and we have

Consider now separately those points . We may consider that they have supports of measure on and supports of measure on . Then, applying Lemma 1, we get that the average value for them is not less than

(42)

Applying now “two points” arguments for set we get another lower bound

(43)


Then, similarly to (35), we get ( , , , )

(44)

In order to minimize the right-hand side of (44) it is best to choose such that , i.e., to put

Then we have

and now, from (44) with , we get

where is defined in (42), and we must maximize the function over .

Introducing new variables and , we get the upper bound

(45)

Notice that . Then, in order to enlarge the range of validity of upper bound (40), it is sufficient to find such that

(46)

For derivatives of function we have

Therefore, is -convex on and attains its maximum as shown in (47) at the bottom of this page.

Consider first the values such that . Then the optimal and, denoting , we get

Now, in order to have inequality (46) fulfilled, it is sufficient to have fulfilled the inequality

(48)

For derivatives of we have

Let the maximum of over be attained at some point . It is easy to check that . Assume that for we have

Then, from these two equations it follows that must satisfy

(49)

Notice that for and . Indeed,

Notice also that and

(50)

(47)


Therefore, for , (49) does not have a solution . Since , we must put and then we get

from which the inequality (48) follows.

Consider now the case when satisfies the opposite inequality . Then notice that

and

Therefore, it is sufficient to have fulfilled the inequality

or, equivalently,

(51)

For derivatives of we have

We have also , , and

From those facts, the validity of inequality (51) follows. In turn, together with (48), the validity of (46), with defined in (50), is proven. Therefore, from (45) and (46) we get the following.

Proposition 4: The following upper bound holds:

(52)

Now, from (52) and (4), assertion (3) of Theorem 1 follows.

APPENDIX

Proof of Inequality (27): We may imagine that we have balls of different colors . The number of colors is . The number of balls of color is or . There are also given numbers . The total number of balls is . We enumerate randomly all balls by different integers from up to . We also choose randomly a number between and (it represents ). Denote by the number of balls of color with numbers between and . Then

and .

Let and choose some . Then we have

Temporarily denoting , , , we have

Notice that

Therefore, for , using Stirling’s formula we get

(A1)

Let satisfy the condition . Then, and since , we get

Consider now function from (A1). Representing in the form , and using Taylor’s formula, we have for some

where we used the condition . Therefore, we get

Since , then if in addition , then

Therefore, if , we have

Therefore, if satisfies , then

Hence, we get

from which inequality (27) follows.

ACKNOWLEDGMENT

The authors are grateful to the two anonymous referees for helpful comments.

REFERENCES

[1] M. V. Burnashev, “Bounds for achievable accuracy in parameter transmission over the white Gaussian channel,” Probl. Inform. Transm., vol. 13, no. 4, pp. 9–24, 1977.

[2] ——, “Properties of deep frequency modulation with a threshold effect,” Probl. Inform. Transm., vol. 18, no. 2, pp. 30–43, 1982.

[3] ——, “A new lower bound for the α-mean error of parameter transmission over the white Gaussian channel,” IEEE Trans. Inform. Theory, vol. IT-30, pp. 23–34, Jan. 1984.

[4] ——, “On a minimum attainable mean-square error for parameter transmission over the white Gaussian channel,” Probl. Inform. Transm., vol. 21, no. 4, pp. 3–16, 1985.

[5] M. V. Burnashev and Yu. A. Kutoyants, “On sphere-packing bound, capacity and related results for Poisson channel,” Probl. Inform. Transm., vol. 35, no. 2, pp. 3–22, 1999.

[6] D. G. Chapman and H. Robbins, “Minimum variance estimation without regularity assumptions,” Ann. Math. Statist., vol. 22, pp. 581–586, 1951.

[7] D. L. Cohn, “Minimum mean square error without coding,” Ph.D. dissertation, MIT, Cambridge, MA, 1970.

[8] Yu. A. Kutoyants, Parameter Estimation for Stochastic Processes. Berlin, Germany: Heldermann, 1984.

[9] ——, Statistical Inference for Spatial Poisson Processes (Lecture Notes in Statistics). New York–Berlin: Springer-Verlag, 1988, vol. 134.

[10] D. J. Sakrison, Notes on Analog Communication. New York: Van Nostrand Reinhold, 1970.

[11] L. Seidman, “Performance limitations and error calculations for parameter estimation,” Proc. IEEE, vol. 58, pp. 644–652, May 1970.

[12] C. E. Shannon, “Coding theorems for a discrete source with a fidelity criterion,” IRE Nat. Conv. Rec., no. 4, pp. 142–163, 1959.

[13] ——, “Probability of error for optimal codes in a Gaussian channel,” Bell Syst. Tech. J., vol. 38, no. 3, pp. 611–656, 1959.

[14] A. D. Wyner, “Communication of analog data from a Gaussian source over a noisy channel,” Bell Syst. Tech. J., vol. 47, no. 5, pp. 801–812, 1968.

[15] ——, “Capacity and error exponent for the direct detection photon channel—Part I,” IEEE Trans. Inform. Theory, vol. 34, pp. 1449–1461, Dec. 1988.

[16] ——, “Capacity and error exponent for the direct detection photon channel—Part II,” IEEE Trans. Inform. Theory, vol. 34, pp. 1462–1471, Dec. 1988.

[17] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. New York: Wiley, 1965.

[18] A. D. Wyner and J. Ziv, “On communication of analog data from a bounded source space,” Bell Syst. Tech. J., vol. 48, no. 10, pp. 3139–3172, 1969.

[19] J. Ziv and M. Zakai, “Some lower bounds on signal parameter estimation,” IEEE Trans. Inform. Theory, vol. IT-15, pp. 386–391, May 1969.