
The analysis of asthma control under a Markov assumption with use of covariates 1

P. Saint-Pierre†∗, C. Combescure†, JP. Daurès† and P. Godard‡

† Laboratoire de Biostatistique, Institut Universitaire de Recherche Clinique, 641 avenue de Doyen Gaston Giraud, 34093 Montpellier, France.

‡ Service de Pneumologie, CHU Hôpital Arnaud de Villeneuve, Montpellier, France.

∗ Correspondence to: Philippe Saint-Pierre, Laboratoire de Biostatistique, Institut Universitaire de Recherche Clinique, 641 avenue de Doyen Gaston Giraud, 34093 Montpellier, France.

Tel: + 33 4 67 41 59 21
Fax: + 33 4 67 54 27 31
e-mail: [email protected]

This research was supported by a grant (4AS04F) from INSERM (Institut National de la Santé et de la Recherche Médicale), France, and by ARIA (Association pour la Recherche en Intelligence Artificielle).

1 Presented at the International Society for Clinical Biostatistics Twenty-Third International Meeting, Dijon, France, September 2002


SUMMARY

In studies of disease states and their relation to evolution, data on the state are usually obtained at infrequent time points during follow-up. Moreover, in many applications, there are measured covariates on each individual under study, and interest centers on the relationship between these covariates and the disease evolution. We developed a continuous-time Markov model with use of time-dependent covariates and a Markov model with piecewise constant intensities to model asthma evolution. Methods to estimate the effect of covariates on transition intensities, to test the assumption of time homogeneity and to assess goodness-of-fit are proposed. We apply these methods to asthma control. We consider a three-state model, and we discuss in detail the analysis of asthma control evolution.

Key words: Longitudinal data on asthma; Continuous-time Markov processes; Time-dependent covariates; Piecewise constant intensities; Maximum likelihood estimation.


1 INTRODUCTION

In longitudinal studies of disease it is often useful to model the passage of subjects through disease stages or states. Subjects are often followed intermittently, and the available information usually takes the form of some health measure or disease state at several discrete points in time. The exact transition times between disease states are generally not observed for many subjects, and only a fairly short portion of individual disease histories is usually observed. Data obtained under such observational schemes make complicated models, such as nonhomogeneous Markov or semi-Markov models, very difficult to deal with, and one usually resorts to time-homogeneous Markov models with simple transition structures. Reasons for constructing multi-state models are to provide a comprehensive view of the disease process and to allow estimation of the proportions of individuals who will be in the various states at some time in the future. Moreover, a continuous-time Markov model does not require strong assumptions about times of disease onset, and it has had successful applications to the stages of cancer [3] and the stages of HIV infection [2, 4]. Continuous-time Markov models have long been used to model the natural history of diseases [1-5, 15], and more recent works [11-14] have extended the theory to particular models. Among these references, [11] used irreversible disease stages, [13] used a two-state model, and [14] developed a goodness-of-fit test for stationary continuous-time Markov models.

In Markov models, covariates affecting the transition intensities are also of great interest. Introducing covariates into the model allows probabilities to be tailored to individual patients. The aim is to analyse and bring out the relationship between the covariates and the different transition probabilities in the Markov model. However, only a few previous studies have used Markov models with covariates [5, 12].

Time-homogeneous Markov models are usually used to model evolution in chronic disease, but such models obviously put severe limitations on disease history behaviour. Since transition intensities seem unlikely to be constant over long periods for most disease processes, and since Markov assumptions may be violated, it is important to consider alternative models and to have assessment tools. Hence some time-inhomogeneous models are required. A simple way to obtain them, while preserving the tractability of constant intensities, is to use Markov models with piecewise constant transition intensities [15], which have had successful applications.

The purposes of this paper are threefold. The first is to summarize methodology, hypothesis testing procedures and methods for model checking for time-homogeneous Markov models with time-dependent covariates. The second is to discuss a simple approach based on using appropriate time-dependent covariates in a modified homogeneous Markov model and to fit Markov models with piecewise constant intensities. The third purpose is to discuss the application of these results on Markov models to a database on asthma. Some previous studies have examined asthma evolution within a Markov framework [16-21]. Jain [17] used a discrete-time Markov model, Redline et al. [19] fitted a Markov-type autoregressive model, whereas studies [20, 21] used a continuous-time Markov model without covariates to model asthma control evolution.

Section 2 describes the Markov model with time-dependent covariates and the model with piecewise constant intensities to be used. Methods of inference based on maximum likelihood are discussed in Section 3 together with hypothesis testing procedures and methods for model checking. Analysis of the data on asthma in the application will be presented in detail in Section 4. Section 5 concludes the paper.

2 CONTINUOUS-TIME MARKOV PROCESSES

2.1 Time-homogeneous Markov models

Consider a model consisting of k states belonging to the state space S = {1, 2, ..., k}, with an individual being unequivocally in some state at time t. Let Y(t) denote the state occupied at time t by a randomly chosen individual. Assume that individuals independently move among the k states according to a continuous-time homogeneous Markov process. For 0 ≤ s ≤ t, let P(s, t) be the k × k transition probability matrix with entries

$$p_{ij}(s, t) = \Pr\{Y(t) = j \mid Y(s) = i\}, \quad i, j = 1, \ldots, k. \tag{1}$$

This process can be specified in terms of the transition intensities,

$$q_{ij}(t) = \lim_{\Delta t \to 0} \frac{p_{ij}(t, t + \Delta t)}{\Delta t}, \quad i \neq j, \tag{2}$$

$$q_{ii}(t) = -\sum_{j \neq i} q_{ij}(t), \quad i = 1, \ldots, k,$$

and let Q(t) be the k × k transition intensity matrix with entries qij(t).

This section is concerned with time-homogeneous models in which qij(t) = qij independent of t. In this case the transition probabilities are stationary and can be calculated as

$$P(s, s + t) = P(0, t) = P(t) = \exp(Qt), \tag{3}$$

where Q = (qij) is the transition intensity matrix. Note that qij ≥ 0 for i ≠ j and that $\sum_{j=1}^{k} q_{ij} = 0$. A simple procedure for computing the transition probability matrix P(t) = exp(Qt) in terms of the eigenvalues and eigenvectors of Q was suggested by Cox and Miller [7]. Specifically, P(t) is calculated as

$$P(t) = A \, \mathrm{diag}(e^{d_1 t}, e^{d_2 t}, \ldots, e^{d_k t}) \, A^{-1}, \tag{4}$$

where d1, d2, ..., dk are the eigenvalues of Q and A is the matrix whose jth column is the eigenvector associated with dj. Note that the transition probabilities satisfy the Chapman-Kolmogorov equations

$$p_{ij}(s, t) = \sum_{m \in S} p_{im}(s, u) \, p_{mj}(u, t), \quad s \leq u \leq t. \tag{5}$$
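To make equation (3) and the Chapman-Kolmogorov property concrete, the following minimal Python sketch computes P(t) = exp(Qt) for a three-state model. It is not the eigenvalue procedure of Cox and Miller [7]: to stay within the standard library it swaps in a truncated Taylor series with scaling and squaring. The intensity values and the helper names (`mat_mul`, `transition_matrix`) are hypothetical and chosen only for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Qt) by a scaled Taylor series, then squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    step = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    # Taylor series of exp(step): I + step + step^2 / 2! + ...
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms + 1):
        term = [[x / order for x in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    # Undo the scaling: exp(Qt) = (exp(Qt / 2^s))^(2^s).
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical intensities per day; each row sums to zero.
Q = [[-0.05, 0.04, 0.01],
     [0.03, -0.06, 0.03],
     [0.01, 0.04, -0.05]]

P90 = transition_matrix(Q, 90.0)
# Chapman-Kolmogorov: P(30) P(60) should agree with P(90).
P30_P60 = mat_mul(transition_matrix(Q, 30.0), transition_matrix(Q, 60.0))
for row_direct, row_product in zip(P90, P30_P60):
    print([round(x, 6) for x in row_direct], [round(x, 6) for x in row_product])
```

Each row of the resulting matrix sums to one, and the two printed versions of P(90) agree up to rounding, which is a convenient sanity check when implementing the likelihood.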

2.2 The incorporation of covariates

In many applications, there are measured covariates on each individual under study, and interest centers on the relationship between these covariates and the intensities qij in the Markov model. The model above can be extended in a straightforward way to allow for regression modelling of Q. If we assume that a proportional intensities regression model holds, then the transition intensities can be expressed as

$$q_{ij}(z) = q_{ij0} \exp(\beta_{ij}' z), \quad i \neq j, \tag{6}$$

where z is an s-dimensional vector of covariates, βij is a vector of s regression parameters relating the instantaneous rate of transitions from state i to state j to the covariates z, and qij0 is the baseline intensity for the transition from state i to state j. Note that the regression coefficients can be interpreted similarly to those in the proportional hazards regression model [6]. The resulting transition intensity matrix Q(z), with elements qij(z), for a subject with covariate vector z can be used in equations (3) and (4) to compute the transition probability matrix P(t | z). The elements pij(t | z) of this transition probability matrix constitute the contribution of each observation to the likelihood function.
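The proportional intensities model can be sketched in a few lines of Python. The function below builds Q(z) from baseline intensities and regression coefficients; the numerical values, the single binary covariate and the name `intensity_matrix` are assumptions made for illustration and are not estimates from this study.

```python
import math


def intensity_matrix(baseline, beta, z):
    """Build Q(z) with q_ij(z) = q_ij0 * exp(beta_ij' z) for i != j."""
    k = len(baseline)
    q = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i != j:
                linear = sum(b * x for b, x in zip(beta[i][j], z))
                q[i][j] = baseline[i][j] * math.exp(linear)
        # Diagonal entry: minus the sum of the off-diagonal row entries.
        q[i][i] = -sum(q[i][j] for j in range(k) if j != i)
    return q


# Hypothetical baseline intensities q_ij0 (diagonal entries are ignored).
baseline = [[0.0, 0.04, 0.01],
            [0.03, 0.0, 0.03],
            [0.01, 0.04, 0.0]]
# Hypothetical coefficient vectors beta_ij for one covariate (s = 1).
beta = [[[0.0], [0.5], [0.0]],
        [[-0.3], [0.0], [0.7]],
        [[-0.9], [-0.2], [0.0]]]

for z in ([0.0], [1.0]):
    print(z, [[round(x, 4) for x in row] for row in intensity_matrix(baseline, beta, z)])
```

With z = 0 the off-diagonal entries equal the baseline intensities, and exp(βij) is the multiplicative change in the i → j intensity when the binary covariate switches from 0 to 1, which is the hazard-ratio style reading mentioned above.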

It is also important to note that covariates can be time-dependent. The total contribution of an individual to the likelihood function is the product of the contributions from each observed transition. The model can be adapted to handle time-dependent covariates by replacing the time-invariant covariate contribution, pij(t − s | z), with pij(t − s | z(s)), assuming that the time-dependent covariate remains constant between the two consecutive times s and t. In that case

$$q_{ij}(t \mid z(t)) = q_{ij0} \exp\{\beta_{ij}' z(t)\}, \quad i \neq j. \tag{7}$$

The assumption that the covariates do not change between assessments, regardless of how far apart the assessments are, is a strong one for many studies. It is then useful to fix a duration after which the patient drops out of the study until a new assessment occurs. By varying this duration, one can see how sensitive the conclusions are to this aspect of the analysis.

A log-linear model for the Markov rates qij(z) is chosen primarily for analytical convenience and because it has the attractive feature of yielding nonnegative transition intensities for any z and βij; other parameterizations may be more appropriate in particular applications. Moreover, modelling on related scales such as the log-hazard makes it possible to study the way that the baseline probabilities are modified by covariates. The log-linear model, which is nearly additive, makes it possible to interpret the value of the coefficient βij,k in a convenient way, which is not always the case with other models.

Note that an increasing number of covariates can make computation too extensive to be easily implemented. An increase in the number of covariates (or regression coefficients) requires more information in the data and more computational resources, because the likelihood becomes very difficult to compute. To overcome this difficulty, one solution is to use simplified versions of the models that reduce the number of parameters. Depending on the study population, one might use a simpler sub-model with certain effects in common; for example, Marshall et al. (1997) used the same coefficients for transitions from state j to j + 1 and the same coefficients for transitions from state j to j − 1. One might also use sub-models with certain effects equal to zero [12].

2.3 Markov models with piecewise constant intensities

A time-homogeneous Markov model with time-dependent covariates allows one to deal with nonhomogeneous Markov models, and particularly with a piecewise constant intensities model. Let us divide the time axis into intervals $[\tau_{l-1}, \tau_l)$, where l = 1, ..., r + 1 and $\tau_{r+1} = +\infty$, and assume a constant intensity for each type of transition in each interval.

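We consider a vector $z^*(t) = (z^*_0(t), z^*_1(t), \ldots, z^*_r(t))'$ of artificial time-dependent covariates defined as

$$z^*_0(t) = 0 \quad \forall t, \qquad z^*_l(t) = \begin{cases} 0 & \text{if } \tau_0 \leq t < \tau_l \\ 1 & \text{if } t \geq \tau_l \end{cases} \quad \text{for } l = 1, 2, \ldots, r, \tag{8}$$

and the model with transition intensities

$$q_{ij}(t \mid z^*(t)) = q_{ij0} \exp\{(\beta^*_{ij})' z^*(t)\}, \quad i \neq j. \tag{9}$$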

In this model, the intensities vary with time t as step-functions defined on the prespecified intervals $[\tau_{l-1}, \tau_l)$, l = 1, ..., r + 1; time is measured from the beginning of the process under study. The parameters of the model are the baseline intensities $q_{ij0}$, which represent the transition intensities in the interval $[\tau_0, \tau_1)$, and the vector of regression coefficients $\beta^*_{ij}$ associated with the artificial time-dependent covariates. Note that this model leads to a nonhomogeneous Markov model in which the transition intensities are step-functions of time. Explicitly, we have

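$$q_{ij}(t \mid z^*(t)) = \begin{cases} q_{ij0} & \text{if } \tau_0 \leq t < \tau_1 \\ q_{ij1} = q_{ij0} \exp\{\beta^*_{ij,1}\} & \text{if } \tau_1 \leq t < \tau_2 \\ \quad \vdots & \\ q_{ijr} = q_{ij0} \exp\{\beta^*_{ij,1} + \beta^*_{ij,2} + \cdots + \beta^*_{ij,r}\} & \text{if } t \geq \tau_r. \end{cases} \tag{10}$$

Note that this model generalizes the homogeneous case, which corresponds to r = 0.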
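The step-function form of the intensities can be illustrated with a short Python sketch for a single transition i → j. The function name `piecewise_intensity` and the numerical values of the baseline intensity and coefficient are hypothetical; only the cut point of 240 days echoes the value used later in the application.

```python
import math
from bisect import bisect_right


def piecewise_intensity(q0, beta_star, cut_points, t):
    """Intensity q_ij(t) of one transition under piecewise constant rates.

    q0         -- baseline intensity q_ij0 on [tau_0, tau_1)
    beta_star  -- coefficients beta*_ij,1 ... beta*_ij,r
    cut_points -- increasing cut points tau_1 ... tau_r
    """
    # Number of cut points tau_l <= t, i.e. how many artificial
    # covariates z*_l(t) are equal to 1 at time t.
    active = bisect_right(cut_points, t)
    return q0 * math.exp(sum(beta_star[:active]))


# Hypothetical example with one cut point (r = 1) at 240 days.
for day in (100.0, 239.9, 240.0, 500.0):
    print(day, piecewise_intensity(0.04, [-0.5], [240.0], day))
```

Because the coefficients accumulate across intervals, each β*ij,l measures the change in log-intensity relative to the preceding interval rather than relative to the baseline interval.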

3 STATISTICAL METHODS

3.1 Maximum likelihood estimation

3.1.1 Estimation in time-homogeneous Markov models with covariates

The observed data for subject h (h = 1, 2, ..., n) consist of $y_h = (y_{h,0}, y_{h,1}, \ldots, y_{h,n_h})'$ and the values of the covariate vectors $z_{h,0}, z_{h,1}, \ldots, z_{h,n_h}$, where $y_{h,j} = Y(t_{h,j})$ and $z_{h,j} = z(t_{h,j})$ for $j = 0, 1, \ldots, n_h$, and $t_{h,0} < t_{h,1} < \cdots < t_{h,n_h}$ are the successive follow-up times. It is assumed that $z_{h,j-1}$ remains constant between the two consecutive times $t_{h,j-1}$ and $t_{h,j}$. For fixed covariates, $z_{h,j} = z_{h,0}$ for all j. The total contribution of an individual to the likelihood function is the product of the contributions from each observed transition. Then, conditionally on $Y(t_{h,0}) = y_{h,0}$, the contribution of subject h to the likelihood function is

$$l_h = \prod_{j=1}^{n_h} p_{y_{h,j-1},\, y_{h,j}}(t_{h,j} - t_{h,j-1} \mid z_{h,j-1}). \tag{11}$$
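The contribution (11) can be evaluated with a short Python sketch for one subject and a single scalar covariate. This is a sketch under stated assumptions, not the program used in this study: states are coded 0, 1, 2 rather than 1, 2, 3, the matrix exponential is approximated by a Taylor series with scaling and squaring, and all parameter values and visit data are hypothetical.

```python
import math


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(q, t, squarings=10, terms=12):
    """Approximate exp(Qt) by a scaled Taylor series followed by squaring."""
    n = len(q)
    step = [[x * t / (2 ** squarings) for x in row] for row in q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms + 1):
        term = [[x / order for x in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def intensity_matrix(baseline, beta, z):
    """q_ij(z) = q_ij0 * exp(beta_ij * z) for one scalar covariate z."""
    k = len(baseline)
    q = [[baseline[i][j] * math.exp(beta[i][j] * z) if i != j else 0.0
          for j in range(k)] for i in range(k)]
    for i in range(k):
        q[i][i] = -sum(q[i])  # diagonal is still 0.0 when the row is summed
    return q


def subject_log_likelihood(times, states, covariates, baseline, beta):
    """Log of l_h: sum of log transition probabilities between visits."""
    loglik = 0.0
    for j in range(1, len(times)):
        # The covariate observed at visit j-1 is held constant until visit j.
        q = intensity_matrix(baseline, beta, covariates[j - 1])
        p = transition_matrix(q, times[j] - times[j - 1])
        loglik += math.log(p[states[j - 1]][states[j]])
    return loglik


# Hypothetical parameters and one hypothetical subject with four visits.
baseline = [[0.0, 0.04, 0.01], [0.03, 0.0, 0.03], [0.01, 0.04, 0.0]]
beta = [[0.0, 0.5, 0.0], [-0.3, 0.0, 0.7], [-0.9, -0.2, 0.0]]
example_loglik = subject_log_likelihood(
    times=[0.0, 90.0, 200.0, 380.0], states=[2, 1, 1, 0],
    covariates=[1, 1, 0, 0], baseline=baseline, beta=beta)
print(example_loglik)
```

Summing such terms over subjects gives the full log-likelihood, which is the function a numerical optimizer would maximize over the baseline intensities and regression coefficients.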

Note that in a model with a single absorbing state, lh is slightly different. This case is developed in detail by Kay [3]. The full likelihood function is the product of all individual contributions. Maximum likelihood estimates for the baseline transition intensities qij0 and regression coefficients βij can be obtained by maximizing the likelihood function with respect to these parameters. Asymptotic estimates of the standard errors of the estimates can be obtained by inverting the empirical information matrix. Quasi-Newton algorithms can be used to find maximum likelihood estimates using only an analytical expression for the likelihood function and finite differences to obtain numerical approximations of the derivatives. A complete discussion of scoring procedures can be found in Kalbfleisch and Lawless [1].

With regard to the starting values for the iterative procedure, assume that the time points $t_{h,j}$ represent exact transition times between the states. Let $T_{ij}$, i, j = 1, ..., k, be the total time spent by all individuals in state i before passing to state j. Let $b_{ij}$, i, j = 1, ..., k, be the total number of transitions from state i to state j. The maximum likelihood estimates of the baseline intensities are then

$$q^{(0)}_{ij0} = \frac{b_{ij}}{T_{ij}}, \quad i, j = 1, \ldots, k, \tag{12}$$

and these can be used as starting values in the iterative procedure. Some of these values may be zero ($b_{ij} = 0$) or undefined ($T_{ij} = 0$). In these cases, the "average" values

$$q^{(0)}_{ij0} = \frac{\sum_i \sum_j b_{ij}}{\sum_i \sum_j T_{ij}} \tag{13}$$

should be used. As regards the starting values for the regression coefficients, all $\beta^{(0)}_{ij}$ are chosen to be equal to zero, which means that there are no effects of covariates.
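The starting values can be computed with a small Python sketch such as the one below. It rests on one possible reading of the definition of Tij, stated here as an assumption: the sojourn in state i is accumulated from the visit at which state i was first observed until the visit at which state j is observed, and a final sojourn that ends without a transition is ignored. The function name and the three subjects are hypothetical, and states are coded from 0.

```python
def starting_intensities(subjects, k):
    """Starting values for q_ij0: b_ij / T_ij, with an average fallback."""
    b = [[0] * k for _ in range(k)]              # b[i][j]: number of i -> j moves
    total_time = [[0.0] * k for _ in range(k)]   # total_time[i][j]: T_ij
    for times, states in subjects:
        entry = times[0]  # visit time at which the current state was entered
        for j in range(1, len(times)):
            prev, curr = states[j - 1], states[j]
            if curr != prev:
                b[prev][curr] += 1
                total_time[prev][curr] += times[j] - entry
                entry = times[j]
    # "Average" value used when b_ij = 0 or T_ij = 0
    # (assumes at least one transition was observed overall).
    average = sum(map(sum, b)) / sum(map(sum, total_time))
    start = {}
    for i in range(k):
        for j in range(k):
            if i != j:
                if b[i][j] > 0 and total_time[i][j] > 0:
                    start[(i, j)] = b[i][j] / total_time[i][j]
                else:
                    start[(i, j)] = average
    return start


# Hypothetical panel data: (visit times in days, observed states).
subjects = [
    ([0.0, 30.0, 90.0, 150.0], [2, 2, 1, 0]),
    ([0.0, 60.0, 120.0], [1, 0, 0]),
    ([0.0, 45.0, 100.0], [2, 1, 2]),
]
for pair, value in sorted(starting_intensities(subjects, 3).items()):
    print(pair, round(value, 5))
```

The regression coefficients would simply be initialised at zero alongside these baseline values.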

3.1.2 Estimation in Markov models with piecewise constant intensities

To fit this model, the likelihood function must be modified. The transition intensities are constant within each time interval, but they differ from one interval to another. For all i, j ∈ S, let $p^{(l)}_{ij}(t)$ denote the transition probability associated with all time intervals contained in $[\tau_{l-1}, \tau_l)$, l = 1, ..., r + 1; more precisely, $p_{ij}(s, s + t) = p^{(l)}_{ij}(t)$ if $\tau_{l-1} \leq s < s + t < \tau_l$. For any t, let $I_t \in \{1, 2, \ldots, r + 1\}$ denote the index of the time interval of the form $[\tau_{l-1}, \tau_l)$ which contains t. For notational convenience, we denote by $i_1$ and $i_2$ the states occupied by an individual at two consecutive follow-up times $t_1$ and $t_2$. Then, via the Chapman-Kolmogorov equation (5), the contribution of this observation to the modified likelihood can be written as

$$\begin{aligned} p_{i_1 i_2}\{t_1, t_2 \mid z^*(t),\, t_1 \leq t < t_2\} = \sum_{k_1 \in S} \sum_{k_2 \in S} \cdots \sum_{k_v \in S} \Big[ & p^{(I_{t_1})}_{i_1 k_1}\{\tau_{I_{t_1}} - t_1 \mid z^*(t_1)\} \\ & \times p^{(I_{t_1}+1)}_{k_1 k_2}\{\tau_{I_{t_1}+1} - \tau_{I_{t_1}} \mid z^*(\tau_{I_{t_1}})\} \times \cdots \\ & \times p^{(I_{t_2})}_{k_v i_2}\{t_2 - \tau_{I_{t_2}-1} \mid z^*(\tau_{I_{t_2}-1})\} \Big], \end{aligned} \tag{14}$$

where $v = I_{t_2} - I_{t_1}$. The full likelihood function is obtained as before, and maximum likelihood estimates are computed by maximizing the likelihood function using the scoring procedure.
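The nested sums over intermediate states are simply a product of transition matrices, one for each homogeneous piece of the interval between two visits. The Python sketch below illustrates this under stated assumptions: the matrix exponential is approximated by a Taylor series with scaling and squaring, cut points are given as a list τ1, ..., τr with τ0 = 0 implied, and the two intensity matrices are hypothetical.

```python
from bisect import bisect_right


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(q, t, squarings=10, terms=12):
    """Approximate exp(Qt) by a scaled Taylor series followed by squaring."""
    n = len(q)
    step = [[x * t / (2 ** squarings) for x in row] for row in q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms + 1):
        term = [[x / order for x in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def piecewise_transition_matrix(q_by_interval, cut_points, t1, t2):
    """P(t1, t2) when Q equals q_by_interval[l] on the l-th interval."""
    # Cut points strictly inside (t1, t2) split the step into homogeneous pieces.
    breaks = [t1] + [tau for tau in cut_points if t1 < tau < t2] + [t2]
    n = len(q_by_interval[0])
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for start, end in zip(breaks, breaks[1:]):
        index = bisect_right(cut_points, start)  # interval containing `start`
        piece = transition_matrix(q_by_interval[index], end - start)
        # The matrix product carries out the sum over intermediate states.
        result = mat_mul(result, piece)
    return result


# Hypothetical intensities before and after a single cut point at 240 days.
Q_a = [[-0.05, 0.04, 0.01], [0.03, -0.06, 0.03], [0.01, 0.04, -0.05]]
Q_b = [[-0.02, 0.015, 0.005], [0.04, -0.05, 0.01], [0.02, 0.05, -0.07]]
P = piecewise_transition_matrix([Q_a, Q_b], [240.0], 100.0, 400.0)
print([[round(x, 4) for x in row] for row in P])
```

The entry in row i1 and column i2 of the returned matrix is the likelihood contribution of a subject seen in state i1 at t1 and in state i2 at t2.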


3.2 Hypothesis testing

When examining panel data one is often interested in testing hypotheses about the model. One might, for example, start the modelling process including all possible transitions and work toward a simpler sub-model by testing hypotheses of the form H0 : qij = 0 or H0 : qij = qhk. There are several possibilities for constructing a test of H0 against the general alternative. Kay [3] used an application of Wald's test, which does not require recomputation of the maximum likelihood estimate under H0. However, when the parameters are near the boundary of the parameter space the normal approximation is unreliable. Another possibility is to use likelihood ratio tests [2].

Hypotheses of the form H0 : qij = qhk can be tested by fitting unrestricted and restricted models and using likelihood ratio tests. The likelihood ratio statistic (that is, twice the log-likelihood for the unrestricted model minus twice the log-likelihood for the restricted model) has approximately a $\chi^2_1$ distribution if H0 is true.

Hypotheses of the form H0 : qij = 0 are somewhat more problematic, as the transition intensities are restricted to be non-negative. Hence we are testing whether a parameter lies on the boundary of the parameter space. In this situation standard likelihood theory does not apply. However, as indicated by Self and Liang [8], the likelihood ratio test for the hypothesis H0 : qij = 0 has an asymptotic distribution that is a mixture of a point mass at zero and a $\chi^2_1$ distribution. Testing whether several of the transition parameters are simultaneously zero is more difficult. This situation can be avoided to some extent by testing the parameters sequentially.

When using a Markov model with covariates, a main interest is in testing hypotheses about the regression coefficients βij,k. In particular, it is interesting to test a hypothesis of the form H0 : βij,k = 0, which states that there is no relationship between the transition from state i to state j and the kth covariate. A likelihood ratio test is used to test whether regression coefficients are statistically different from zero. The statistic has approximately a $\chi^2_1$ distribution if H0 is true. As before, if H0 is not rejected, it is possible to work with a simpler sub-model in which there is no relation between the transition from state i to state j and the kth covariate (βij,k is set equal to zero). The regression coefficients can be interpreted similarly to those in the proportional hazards regression model [6]. It is of great interest to study the relationship between the different covariates and the disease evolution, and to demonstrate the influence of covariates on the disease evolution.
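As a small worked example of these tests, the Python sketch below turns two maximized log-likelihoods into a likelihood ratio statistic and a p-value on one degree of freedom. The upper tail of the $\chi^2_1$ distribution is written with the complementary error function. For the boundary case the sketch assumes equal weights of one half for the point mass at zero and the $\chi^2_1$ component, which is the usual result for a single parameter on the boundary but is an assumption here, since the weights are not stated above. The log-likelihood values are hypothetical.

```python
import math


def chi2_1_sf(x):
    """Upper tail probability P(X > x) for X ~ chi-square with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def lr_test(loglik_unrestricted, loglik_restricted, boundary=False):
    """Likelihood ratio statistic and p-value for one restricted parameter."""
    statistic = max(2.0 * (loglik_unrestricted - loglik_restricted), 0.0)
    p_value = chi2_1_sf(statistic)
    if boundary:
        # Assumed 50:50 mixture of a point mass at zero and chi-square(1):
        # the tail probability is halved for a positive statistic.
        p_value = 0.5 * p_value if statistic > 0.0 else 1.0
    return statistic, p_value


# Hypothetical log-likelihoods for nested models differing by one parameter.
print(lr_test(-1200.0, -1203.2))                 # e.g. H0: beta_ij,k = 0
print(lr_test(-1200.0, -1203.2, boundary=True))  # e.g. H0: q_ij = 0
```

The boundary correction makes the test less conservative than a naive comparison with the $\chi^2_1$ distribution.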


3.3 Diagnostics for model assessment

3.3.1 Assumption of time homogeneity

An important assumption of the models described in the previous sections is that the transition rates remain constant through time. In the literature there are principally two methods to assess the assumption of time homogeneity. The first method is via the piecewise model described in Section 2.3. The time axis can be divided into disjoint periods, with time-homogeneous models fitted to each period separately. In this case a likelihood ratio test can be used to compare the piecewise model with the time-homogeneous model. Under H0 (homogeneous model) the test statistic has approximately a $\chi^2_{k-q}$ distribution, where q is the number of parameters under H0 and k is the number of parameters under H1 (here k denotes a parameter count, not the number of states). This approach requires fitting the piecewise model.

A second method for examining departures from time homogeneity is via a local score test [9, 10]. For any transition intensity qij(t) with i ≠ j we can, for example, consider alternative hypotheses such as H1 : qij(t) = qij + tα versus H0 : qij(t) = qij. The test statistic is the ratio of the partial derivative of the log-likelihood with respect to α, evaluated at (qij0, βij, α = 0), to an estimate of its standard deviation. Under H0 the test statistic has approximately a N(0, 1) distribution. This method is discussed more fully in Kalbfleisch and Lawless [1]. The advantage of this method is that only the time-homogeneous model has to be fitted.
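For the likelihood ratio comparison of a piecewise model with the homogeneous model, the degrees of freedom equal the number of added coefficients. With a single cut point in a three-state model with six transitions, for instance, six coefficients are added. The Python sketch below evaluates the chi-square upper tail using the closed form that holds for an even number of degrees of freedom; the function name and the test statistic are hypothetical.

```python
import math


def chi2_sf_even(x, df):
    """Upper tail P(X > x) of a chi-square variable with an even df.

    For df = 2m the tail equals exp(-x/2) * sum_{i<m} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical likelihood ratio statistic for one cut point, six transitions.
added_coefficients = 6
statistic = 20.0
print(chi2_sf_even(statistic, added_coefficients))
```

For an odd number of degrees of freedom a general incomplete gamma routine would be needed instead of this shortcut.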

3.3.2 Goodness of fit

In many cases one may attempt to address goodness of fit by comparing observed with expected values based on a model. If there were time points at which all individuals were observed, then such a test could be performed using the expected and observed counts at these points [1]. For many studies, including that on asthma control, this condition does not hold. We can, however, calculate approximate observed and expected counts. To compute them, we assume that disease onset corresponds to the first consultation time and that an individual not actually observed at time t has remained in the same state as at his/her preceding consultation time. To be included, an individual must have entered the study and still be enrolled in the study. When there are absorbing states, individuals who have entered absorbing states are still under observation until the end of the study. The observed count for state u, $O_u(t)$, is then the number of individuals in state u at time t. We denote by $E_{iu}(t)$ the expected count for state u from state i: it is the product of the number of individuals under observation at time t who were in state i at the first consultation and the transition probability $p_{iu}(t)$ (i, u = 1, ..., k). We obtain the expected counts

$$E_u(t) = \sum_{i=1}^{k} E_{iu}(t). \tag{15}$$

Although the use of statistical tests on the observed and expected counts is not strictly justified, the empirical method using the value

$$M(t) = \sum_{u=1}^{k} \frac{(O_u(t) - E_u(t))^2}{E_u(t)} \tag{16}$$

can be useful to check the agreement between observed and expected counts. Moreover, Aguirre-Hernandez et al. [14] developed a goodness-of-fit test for stationary continuous-time Markov models.
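The comparison of observed and expected counts is straightforward to compute once the transition probability matrix P(t) is available. The Python sketch below takes P(t) as given; the counts and probabilities are hypothetical, and states are indexed from 0.

```python
def expected_counts(initial_counts, p_t):
    """E_u(t) = sum_i n_i * p_iu(t), with n_i subjects starting in state i."""
    k = len(p_t)
    return [sum(initial_counts[i] * p_t[i][u] for i in range(k))
            for u in range(k)]


def fit_statistic(observed, expected):
    """M(t) = sum_u (O_u(t) - E_u(t))^2 / E_u(t)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical values at one time point t.
initial_counts = [20, 30, 50]   # subjects under observation, by first state
p_t = [[0.6, 0.3, 0.1],         # assumed transition probabilities p_iu(t)
       [0.3, 0.5, 0.2],
       [0.2, 0.3, 0.5]]
observed = [35, 33, 32]         # approximate observed counts O_u(t)

expected = expected_counts(initial_counts, p_t)
print(expected, fit_statistic(observed, expected))
```

Small values of M(t) across a grid of time points indicate agreement between the fitted model and the data, bearing in mind that no formal reference distribution is claimed for this statistic.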

3.3.3 Other checks

Other checks may be of interest. The model in Section 2 makes several quite specific assumptions about the disease process, such as the Markov assumption (transition times from each state are independent of the history of the process prior to entry to that state) and the assumption of homogeneity of the qij parameters across the patient population. Such assumptions may be checked by including covariates in the modelling process. Such methods have been considered by Kay [3].

4 APPLICATION ON ASTHMA CONTROL

4.1 The data and model

Currently, in France, asthma has a prevalence of 5% for adults and 10% for children, and this is increasing; the one-year mortality from asthma is 4 per 100,000. Around six people die of asthma every day in France, and the disease affects more than 3.5 million people in France and more than 17 million in the United States. Asthma is a major public health problem.

The follow-up study of asthmatic patients was conducted in France between 1997 and 2001 by ARIA (Association pour la Recherche en Intelligence Artificielle). Data used in this study were collected over a 4-year period by a number of French chest physicians. All patients were prospectively enrolled adult asthmatics. They had been diagnosed for at least a year, with the diagnosis confirmed by American Thoracic Society criteria, and were treated according to international recommendations [22]. Only those with persistent asthma and at least 2 visits were included in the study. The database reflects the real activity of the hospital, e.g. patients returning at variable intervals according to their perceived needs. A total of 459 patients were included in this study between 1997 and 2000 and were followed up until 2001. The average length of follow-up was 385 days and the median was 183 days (interquartile range = [91; 452]). A total of 1682 consultations occurred during the study period. There were no deaths in this database, reflecting, perhaps, the moderate amount of total exposure (459 patient-years) and the quality of follow-up. Asthmatics in this database are living far longer than the general public.

At each visit, a chest physician measured covariates and graded asthma using the concept of control scores [20, 21, 23]. Such scores are based on the frequency of symptoms, their duration, the degree of bronchial obstruction and the need for rescue medication. The control grade at each visit was used to define the subject's state at the time of the visit: optimal control (State 1), sub-optimal control (State 2) and unacceptable control (State 3).

To analyse these data, we consider a three-state Markov model. This model includes three transient disease states (control states), and all transitions between states are allowed. Assuming that the underlying process is a Markov process, we represent this model using the transition intensity matrix

$$Q = \begin{pmatrix} -(q_{12} + q_{13}) & q_{12} & q_{13} \\ q_{21} & -(q_{21} + q_{23}) & q_{23} \\ q_{31} & q_{32} & -(q_{31} + q_{32}) \end{pmatrix}, \tag{17}$$

or the pattern described in Figure 1.

[Figure 1 about here]

4.2 Results

At the beginning of the study, the patients were distributed among the three states of asthma control as 18, 22 and 60 per cent, respectively. The distribution at the end of the study period was 37, 28 and 35 per cent, respectively. Note that these distributions do not correspond to a fixed period of time for each subject, so they cannot be used directly to estimate transition probabilities.

The purpose of this section is to illustrate the methodology developed in the earlier sections and to present an analysis of the data. Firstly, a homogeneous Markov model without covariates was used to model asthma evolution using a computer program. Table 1 shows the estimates of transition intensities and their standard errors. The log-likelihood for the time-homogeneous model without covariates was equal to −1238.2 (results not shown). Secondly, a nonhomogeneous Markov model with piecewise constant intensities was used. Time is measured from the beginning of the follow-up. A single artificial time-dependent covariate $z^*_1(t)$, with cut point at $\tau_1 = 240$ days (8 months), was used to fit a Markov model with piecewise constant transition intensities. For each transition, the intensity is assumed to be constant over the time interval $0 \leq t < \tau_1$ and also constant over $t \geq \tau_1$. The estimates of the baseline intensities, of the regression coefficients and of their standard errors are displayed in Table 2. Moreover, Table 2 shows the p-values from the likelihood ratio test of βij = 0. All the coefficients were found to be significantly different from zero, which means that there is a relation between every transition and the time elapsed from the beginning of the process. The log-likelihood for the piecewise model in which all regression coefficients are equal to zero was −1285.6. In order to test the assumption of time homogeneity, the likelihood ratio statistic for testing this two-piece constant model (H1) against the time-homogeneous model (H0) was computed with six degrees of freedom, yielding a p-value < 0.01, which indicates a much better fit of the two-piece constant model. This result indicates that a nonhomogeneous model is suitable for these data because the assumption of time homogeneity is too restrictive. However, a better fit of the piecewise model is to be expected, because it has more parameters (and using more periods will lead to a still better fit).

[Table 1, Table 2 about here]

Next, single-covariate Markov models were used to assess the individual effects of factors associated with asthma. A model with six regression coefficients was fitted for each of the factors available in the database (age of the subject, asthma severity, body mass index, duration of asthma, number of exacerbations, sex, smoking, therapeutic observance). Each single-covariate model was compared with the basic model without covariates using the likelihood ratio test. Asthma severity, body mass index and number of exacerbations were the factors most associated with transitions of asthma control. All other factors were not significantly associated with changes in asthma control. We therefore focused on the three statistically significant factors, which are time-dependent binary covariates:

• asthma severity: patients with mild-moderate asthma (encoded by 0) or patients with severe asthma (encoded by 1),

• body mass index (BMI): patients with BMI ≤ 25 (encoded by 0) or patients with BMI > 25 (encoded by 1),

• number of exacerbations between two consultations: patients without exacerbations (encoded by 0) or patients with one or more exacerbations (encoded by 1).
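The likelihood ratio comparisons used throughout this section can be sketched as follows. The example takes the log-likelihood −1285.6 of the model without covariates reported above and computes how large the log-likelihood of a six-coefficient alternative must be for the test to give p < 0.01. The function names are illustrative, and the chi-square tail is coded only for one degree of freedom and for even degrees of freedom, which covers the tests used here. No fitted log-likelihood of an alternative model is assumed.

```python
import math


def chi2_sf(x, df):
    """Upper tail P(X > x) of a chi-square variable, for df = 1 or even df."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        for j in range(1, df // 2):  # closed form: exp(-x/2) * sum (x/2)^j / j!
            term *= half / j
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch handles only df = 1 or even df")


def lrt(loglik_null, loglik_alt, df):
    """Return the likelihood ratio statistic and its p-value."""
    stat = 2.0 * (loglik_alt - loglik_null)
    return stat, chi2_sf(stat, df)


def critical_value(df, alpha, lo=0.0, hi=200.0):
    """Smallest statistic whose p-value is at most alpha, found by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    LOGLIK_NULL = -1285.6  # model without covariates, as reported in the text
    crit = critical_value(df=6, alpha=0.01)
    print(f"critical value (6 df, alpha = 0.01): {crit:.2f}")
    print(f"alternative log-likelihood needed: above {LOGLIK_NULL + crit / 2:.1f}")
```

The same `lrt` function with `df=1` corresponds to the test of a single coefficient βij = 0.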

[Table 3 about here]

Table 3 shows the estimates of the coefficients, their standard errors and the p-values of the likelihood ratio test with one degree of freedom for testing βij = 0. For the severity covariate, the results suggest that only the regression coefficient for transition 1 → 3 is not significantly different from zero (p = 0.47). The regression coefficient for transition 2 → 3 is positive, which means that the transition from State 2 to State 3 is accelerated for patients with severe asthma. On the other hand, β31 is negative, which means that the transition from State 3 to State 1 is reduced for patients with severe asthma. As regards BMI, the results suggest that all the intensities are highest for patients with BMI ≤ 25 and decrease for patients with BMI > 25, non-significantly for transitions 1 → 2, 1 → 3, 2 → 1, 2 → 3 and 3 → 2, and significantly for transition 3 → 1 (p < 0.001). The coefficient β31 is negative, which means that the transition from State 3 to State 1 is reduced for patients with BMI > 25. As regards the number of exacerbations, the coefficients β13 and β31 are not significantly different from zero. Among the significant coefficients, β32 is negative (β32 = −6.72), which means that the transition from State 3 to State 2 is greatly reduced for patients with exacerbations. The fact that some regression coefficients are not significantly different from zero suggests that they can be constrained to zero, so that they do not need to be estimated.

We have also compared the observed and expected counts for each model in order to assess goodness of fit. The results (not shown) seemed to indicate that the models fit the data adequately.

To help interpret the estimates, the curves over time of the transition probabilities for patients with mild-moderate asthma and for patients with severe asthma are also provided (Figures 2 and 3). On the one hand, these graphs suggest that the probability of staying in State 3 at time t = 400 days is higher for patients with severe asthma. On the other hand, the probability of staying in State 1 at time t = 400 days is higher for patients with mild-moderate asthma. The probability curves for the covariates BMI and exacerbation are not shown but are also of interest. These probability curves are useful for analysing how the probabilities behave over time and for informing patients about the possible evolution of their disease.

As mentioned in the sub-section "The incorporation of covariates", the assumption that the covariates do not change between assessments is stringent. We therefore used a window beyond which the patient is treated as having dropped out of the study until a new assessment occurs. We used different values for the length of the window (3, 4, 5 and 6 months) to see how sensitive the conclusions are. The results obtained with the various lengths are not significantly different from those obtained under the assumption that the covariates remain constant between assessments (results not shown).
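The transition probability curves discussed above can be reproduced in outline from the estimates in Table 3. The sketch below builds the intensity matrix Q for the severity covariate, assuming the log-linear form q_ij = q_ij0 exp(βij z), and computes P(t) = exp(Qt) for a covariate held constant over time. The matrix exponential routine (scaling and squaring with a truncated Taylor series) and all function names are illustrative choices, not the software used in the study.

```python
import math

# Table 3, severity column: baseline intensities (per day, z = 0) and coefficients.
Q0 = {(1, 2): 25.3e-3, (1, 3): 6.5e-3, (2, 1): 28.4e-3,
      (2, 3): 0.1e-3, (3, 1): 15.8e-3, (3, 2): 0.2e-3}
BETA = {(1, 2): -1.24, (1, 3): -0.18, (2, 1): -1.79,
        (2, 3): 4.78, (3, 1): -1.84, (3, 2): 2.81}


def generator(z):
    """Intensity matrix Q for covariate value z (0 = mild-moderate, 1 = severe)."""
    q = [[0.0] * 3 for _ in range(3)]
    for (i, j), base in Q0.items():
        q[i - 1][j - 1] = base * math.exp(BETA[(i, j)] * z)
    for i in range(3):
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    return q


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(a, terms=25):
    """Matrix exponential by scaling and squaring with a Taylor series."""
    n = len(a)
    norm = max(sum(abs(x) for x in row) for row in a)
    s = 0
    while norm / 2 ** s > 0.5:
        s += 1
    scaled = [[x / 2 ** s for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(s):
        result = matmul(result, result)
    return result


def transition_matrix(z, t):
    """P(t) = exp(Q t) for a covariate value z held constant over [0, t]."""
    return expm([[x * t for x in row] for row in generator(z)])


if __name__ == "__main__":
    for label, z in (("mild-moderate", 0), ("severe", 1)):
        p = transition_matrix(z, 400.0)
        print(f"{label}: P11(400) = {p[0][0]:.3f}, P33(400) = {p[2][2]:.3f}")
```

Reading the "probability of staying" in State 1 or State 3 as P11(400) and P33(400), which is an interpretation made for this sketch, the two matrices can be compared directly with the statements about Figures 2 and 3.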

[Figure 2, Figure 3 about here]

5 DISCUSSION

In this paper, we have illustrated the usefulness of time-homogeneous Markov models in the analysis of follow-up studies of disease. We have implemented methods introduced by Kalbfleisch and Lawless [1] that allow quite general models to be fitted. In particular, we have used a homogeneous Markov model with covariates. We have discussed a simple approach, based on appropriate time-dependent covariates in a modified homogeneous Markov model, to fit a Markov model with piecewise constant intensities. We have also summarized hypothesis testing procedures, a method to test the assumption of time homogeneity and a method to assess the adequacy of the models. We have then discussed the application of these results to a database on asthma and demonstrated the usefulness of Markov models for predicting asthma evolution.

The results of the multi-state models have confirmed much of what is known about the natural course of asthma and the factors affecting it. However, using Markov models we have learned more about how the different factors affect the disease process over time. As many studies have shown, asthma severity is one of the most important factors associated with the rate of progression among the different stages of asthma control. BMI and the number of exacerbations are also important factors associated with asthma evolution. Moreover, all the results of the one-covariate models were confirmed when we used a model with two covariates (results not shown), which means that BMI and exacerbations remained significant even after adjusting for severity. The use of a piecewise model allows us to assess the assumption of time homogeneity and to consider easily a non-homogeneous model for asthma evolution. Furthermore, the piecewise constant model can provide interesting results in many disease studies. This study suggests, firstly, that a continuous-time Markov model can be applied to asthma and, secondly, that the introduction of covariates (time-dependent or not) is very important for obtaining probabilities tailored to individual patients. Moreover, this study provides a tool for clinical evaluation, for patient information and for the evaluation of management strategies.

However, there are limitations to the use of time-homogeneous Markov models. It is not always clear how best to choose the states, and the assumptions that the process is Markov and time homogeneous are very restrictive. Nevertheless, the model assessment techniques described in section 3 are useful for checking these assumptions. We encountered other limitations with the use of Markov models with covariates. It could be interesting to use a model with three or more covariates, but the estimation quickly becomes intractable. Indeed, the number of parameters increases in proportion to the number of covariates: eighteen parameters for two covariates and twenty-four for three covariates in the asthma case. To overcome this difficulty, one may use simplified versions of the models [5, 12].
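The parameter counts quoted above follow from simple arithmetic, sketched here under the assumption that every transition between distinct states is allowed and that each covariate has its own regression coefficient on each transition. The function name is illustrative.

```python
def n_parameters(n_states=3, n_covariates=0):
    """One baseline intensity per allowed transition, plus one regression
    coefficient per transition and per covariate (all transitions allowed)."""
    n_transitions = n_states * (n_states - 1)
    return n_transitions * (1 + n_covariates)


if __name__ == "__main__":
    for p in range(4):
        print(f"{p} covariate(s): {n_parameters(3, p)} parameters")
```

For the three-state asthma model this gives 6 parameters without covariates, then 12, 18 and 24 for one, two and three covariates.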

Our results indicate that a piecewise model is suitable for asthma. To carry on the study of this asthma database, it could therefore be interesting to consider a non-homogeneous piecewise model with covariates. It would also be interesting to develop a second-order Markov model in order to take the subject's past into account. Higher-order models seem difficult to compute, because the number of parameters increases exponentially with the order. Finally, Markov models provide an interesting tool for modelling asthma evolution, and the introduction of fixed or time-dependent covariates is important in asthma and in most disease history studies.

ACKNOWLEDGEMENTS

The authors thank the two anonymous referees for their many helpful comments and valuable suggestions. This research was supported by a grant (4AS04F) from INSERM (Institut National de la Santé et de la Recherche Médicale), France, and by ARIA (Association pour la Recherche en Intelligence Artificielle).

REFERENCES

1. Kalbfleisch JD, Lawless JF. The analysis of panel data under a Markov assumption. Journal of the American Statistical Association 1985; 80:863-871.

2. Gentleman RC, Lawless JF, Lindsey JC, Yan P. Multi-state Markov models for analysing incomplete disease history data with illustrations for HIV disease. Statistics in Medicine 1994; 13:805-821.

3. Kay R. A Markov model for analysing cancer markers and disease states and survival studies. Biometrics 1986; 42:855-865.

4. Longini IR, Clark WS, Byers RH, Ward JW, Darrow WW, Lemp GF, Hethcote HW. Statistical analysis of the stages of HIV infection using Markov model. Statistics in Medicine 1989; 8:831-843.

5. Marshall G, Jones RH. Multi-state models and diabetic retinopathy. Statistics in Medicine 1995; 14:1975-1983.

6. Cox DR. Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B 1972; 34:187-220.

7. Cox DR, Miller HD. The theory of stochastic processes. Chapman & Hall: London, 1965.

8. Self S, Liang K-Y. Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association 1987; 82:605-610.

9. Kalbfleisch JD, Lawless JF. Some statistical methods for panel life history data. Proceedings of the Statistics Canada Symposium on the Analysis of Data in Time, Ottawa, Statistics Canada, 1989, pp. 185-192.

10. de Stavola BL. Testing departures from time homogeneity in multistate Markov processes. Applied Statistics 1988; 37:242-250.

11. Satten GA. Estimating the extent of tracking in interval-censored chain-of-events data. Biometrics 1999; 55:1228-1231.

12. Cook RJ, Lawless JF, Yi GY. A generalized mover-stayer model for panel data. Biostatistics 2002; 3:407-420.

13. Cook RJ. A mixed model for two-state Markov processes under panel observation. Biometrics 1999; 55:915-920.

14. Aguirre-Hernandez R, Farewell VT. A Pearson-type goodness-of-fit test for stationary and time-continuous Markov regression models. Statistics in Medicine 2002; 21:1899-1911.

15. Lindsay JC, Ryan LM. A three-state multiplicative model for rodent tumorigenicity experiments. Applied Statistics 1993; 42:283-300.

16. Korn EL, Whittemore AS. Methods for analyzing panel studies of acute health effects of air pollution. Biometrics 1979; 35:795-802.

17. Jain S. Markov chain model and its application. Computers and Biomedical Research 1986; 19:374-378.

18. Ware JH, Lipsitz S, Speizer FE. Issues in the analysis of repeated categorical outcomes. Statistics in Medicine 1988; 7:95-107.

19. Redline S, Tager IB, Segal MR, Gold D, Speizer FE, Weiss ST. The relationship between longitudinal change in pulmonary function and nonspecific airway responsiveness in children and young adults. American Review of Respiratory Disease 1989; 140:179-184.

20. Boudemaghe T, Daures JP. Modeling asthma evolution by a multi-state model. Revue d'Epidémiologie et de Santé Publique 2000; 48:249-255.

21. Combescure C, Chanez P, Saint-Pierre P, Daures JP, Proudhon H, Godard P. Variations in control of asthma over time can be assessed by a Markov model. European Respiratory Journal, in press.

22. National Institutes of Health. Expert Panel Report 2: Guidelines for the diagnosis and management of asthma. Bethesda: NIH/National Heart, Lung and Blood Institute, 1997. NIH publication number 97-4051.

23. Juniper EF, O'Byrne PM, Guyatt GH, Ferrie PJ, King DR. Development and validation of a questionnaire to measure asthma control. European Respiratory Journal 1999; 14(4):902-907.

TABLES

Baseline intensity    Estimate (sd)¹

q12 (1→2)             14.3×10⁻³ (0.8×10⁻³)
q13 (1→3)              3.3×10⁻³ (0.7×10⁻³)
q21 (2→1)             13.4×10⁻³ (0.4×10⁻³)
q23 (2→3)              4.2×10⁻³ (0.3×10⁻³)
q31 (3→1)              4.4×10⁻³ (0.2×10⁻³)
q32 (3→2)              3.7×10⁻³ (0.2×10⁻³)

¹ standard error

Table 1: Estimates of baseline intensities with standard errors for the homogeneous Markov model without covariates.

Baseline intensity    Estimate (sd)¹            Regression coefficient    Estimate (sd)¹ (p-value)²

q120 (1→2)            19.4×10⁻³ (5.1×10⁻³)      β12 (1→2)                 -1.11 (0.294) (< 0.01)
q130 (1→3)               6×10⁻³ (10⁻³)          β13 (1→3)                 -1.34 (0.067) (< 0.01)
q210 (2→1)            31.3×10⁻³ (7.8×10⁻³)      β21 (2→1)                 -2.08 (0.274) (< 0.01)
q230 (2→3)             0.2×10⁻³ (1.1×10⁻³)      β23 (2→3)                  3.1 (0.101) (< 0.01)
q310 (3→1)            10.2×10⁻³ (1.2×10⁻³)      β31 (3→1)                 -3.23 (0.225) (< 0.01)
q320 (3→2)             2.4×10⁻³ (0.9×10⁻³)      β32 (3→2)                  0.56 (0.748) (< 0.01)

¹ standard error
² p-value using likelihood ratio test for testing βij = 0

Table 2: Regression coefficients and baseline intensity estimates with standard errors and p-values for the piecewise constant Markov model (two periods).

Coefficient     Severity                    BMI                         Exacerbation
                Estimate (sd)¹ (p-value)²   Estimate (sd)¹ (p-value)²   Estimate (sd)¹ (p-value)²

β12 (1→2)       -1.24 (0.367) (< 0.01)      -0.41 (0.254) (0.02)         2.51 (0.124) (< 0.01)
β13 (1→3)       -0.18 (0.141) (0.47)        -0.14 (0.078) (0.6)         -0.15 (0.082) (0.87)
β21 (2→1)       -1.79 (0.354) (< 0.01)      -0.31 (0.231) (0.04)         1.68 (0.198) (< 0.01)
β23 (2→3)        4.78 (0.181) (< 0.01)      -0.16 (0.071) (0.49)         0.96 (0.085) (< 0.01)
β31 (3→1)       -1.84 (0.193) (< 0.01)      -1.38 (0.139) (< 0.01)      -0.12 (0.223) (0.38)
β32 (3→2)        2.81 (0.177) (< 0.01)      -0.28 (0.161) (0.064)       -6.72 (0.237) (< 0.01)

q120 (1→2)      25.3×10⁻³ (8.5×10⁻³)        15.7×10⁻³ (2.6×10⁻³)         9.5×10⁻³ (1.1×10⁻³)
q130 (1→3)       6.5×10⁻³ (1.7×10⁻³)         4.1×10⁻³ (0.8×10⁻³)         2.8×10⁻³ (0.35×10⁻³)
q210 (2→1)      28.4×10⁻³ (9.5×10⁻³)        14.6×10⁻³ (2.3×10⁻³)        11.1×10⁻³ (1.2×10⁻³)
q230 (2→3)       0.1×10⁻³ (1.5×10⁻³)         4.5×10⁻³ (0.7×10⁻³)         3.1×10⁻³ (0.36×10⁻³)
q310 (3→1)      15.8×10⁻³ (2.8×10⁻³)         7.9×10⁻³ (0.8×10⁻³)         5.6×10⁻³ (0.4×10⁻³)
q320 (3→2)       0.2×10⁻³ (2.2×10⁻³)         4.4×10⁻³ (0.6×10⁻³)         4.7×10⁻³ (0.37×10⁻³)

¹ standard error
² p-value using likelihood ratio test for testing βij = 0

Table 3: Regression coefficients and baseline intensity estimates with standard errors and p-values for the homogeneous Markov model with one covariate.

FIGURES

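[Diagram: State 1 (optimal control), State 2 (sub-optimal control) and State 3 (unacceptable control), with transitions in both directions between each pair of states, labelled by the intensities q12, q21, q13, q31, q23 and q32.]

Figure 1: The multi-state Markov model for asthma with three states defined by asthma control.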

[Plot: nine panels showing the transition probabilities from state 1 (P11(t), P12(t), P13(t)), from state 2 (P21(t), P22(t), P23(t)) and from state 3 (P31(t), P32(t), P33(t)). Horizontal axis: time (days), from 0 to 400. Vertical axis: probability, from 0.0 to 1.0.]

Figure 2: The curves of various transition probabilities over time for patients with mild-moderate asthma.

[Plot: nine panels showing the transition probabilities from state 1 (P11(t), P12(t), P13(t)), from state 2 (P21(t), P22(t), P23(t)) and from state 3 (P31(t), P32(t), P33(t)). Horizontal axis: time (days), from 0 to 400. Vertical axis: probability, from 0.0 to 1.0.]

Figure 3: The curves of various transition probabilities over time for patients with severe asthma.
