Air pollution effects on clinic visits for lower respiratory illness
黃景祥, Academia Sinica (中央研究院)
詹長權, National Taiwan University (台灣大學)
Outline
Introduction to air pollution and health
The study objective and design
Environment and health data
Statistical models
Main findings
Discussion
Development of modern research
The potential for air pollution at high concentrations to cause excess deaths was established in the mid-twentieth century by a series of air pollution “disasters” in the US and Europe which caused striking increases in mortality.
Development of modern research
By the early 1990s, time series studies, each conducted at a single location, showed that air pollution levels, even at much lower concentrations, were associated with increased rates of mortality and morbidity in cities in the United States, Europe and other developed countries.
Development of modern research
At present, although these relative rates are small, the burden of disease attributable to air pollution may be substantial considering the very large population exposed to air pollution and to whom the relative rates of mortality or morbidity apply.
Development of modern research
Investment in research programs designed to answer some of the important questions
Powerful tools are available for data collection and analysis
Far more data available
Understanding of the power and limitations of statistical methods
Such an interesting challenge that many disciplines are involved in its full understanding
Exposure assessment
Target organ dose
  Less easy to estimate organ dose
Personal exposure models
  Pollutant concentration and time activities
  Enhanced by other factors: exercise, smoking, viral infections
Microenvironmental models
  Indoor/outdoor concentrations with time-activity data
Ambient air quality monitoring data
  Less accurate
What is a health effect?
Minor changes in respiratory function and bronchial activity
Increases in respiratory symptom prevalence and incidence
Acute asthma attack, exacerbations of bronchitis, wheezing, serious illness (e.g. cancer), hospital admissions for acute asthma or bronchitis
Deaths
Time scales of exposure-effect
Acute health effects (minutes to months)
  Inflammatory cells in the lung
  Deaths from respiratory and cardiac diseases
Chronic health effects (months to years)
  Increased prevalence of cough, wheeze, asthma, bronchitis
  Long-term inflammatory changes in bronchial walls
  Increased incidence of lung cancer
  Increased mortality from cardio-respiratory diseases
Health effects studies
Experimental studies
  In vivo exposure in animals
  In vitro exposure of human or animal tissue or bacterial cultures
  Controlled-chamber experiments
    Under controlled conditions on dogs or human volunteers
  Establish a dose-response relationship
Epidemiological studies
Short-term studies
  Ecological studies: examine the effects of day-to-day changes in air pollution levels on routinely measured health outcomes such as hospital admissions
  Panel studies: on panels of individual volunteers
  Reflect real-life exposure conditions
  Usually not possible to infer causality
Epidemiological studies
Long-term studies
  Cross-sectional studies: the prevalence of disease in different communities is compared with the ambient level of pollution in those communities.
  Cohort studies: follow up a group over a period of time
  Large sample size is required
  The effect of confounding factors and the problem of estimating exposure over the whole latent period
Epidemiological studies
Extensive application to air pollution
  Because of the large degree of variation of air pollution levels over time and across geographic areas
Inexpensive database
  Monitoring networks for regulatory objectives
  Routinely collected mortality and morbidity statistics by government and insurance agencies
Epidemiological studies
Time-domain methods to demonstrate associations between air pollution and various health effects in single cities.
Two common features
  1. Mainly carried out in places with a large population.
  2. Aggregate data in a large area to represent population exposures.
Misclassification is often compounded.
Possible solutions
Create less heterogeneous exposures by clustering hospitals around a monitoring station, as suggested by Burnett et al.
  Exposure attribution based on clustered hospitals remains a serious challenge because some hospitals are located as far as 200 km away from any monitoring station.
Possible solutions
Known census clusters will provide exposure populations with smaller and more homogeneous regions (Zidek et al.).
  Many important explanatory factors are either unmeasured or unavailable in all clusters.
  Census areas are not equivalent to clinic catchment areas.
  Daily outcomes in small census subdivisions are sparse when the health outcome is a case of serious illness.
Small area design
Cluster clinics around a monitoring station to create a relatively homogeneous area of about 20 km².
The population at risk of each area is the estimated service coverage of all clinics in that area.
Population exposure is represented by measurements from the monitoring station.
The health outcome is daily clinic visits for lower respiratory illness.
Objective
Use daily pollutant levels and clinic visit data for lower respiratory illness recorded in 50 small areas to estimate the health effect of air pollution.
Statistical Analysis
Estimate the population at risk for each area and convert daily clinic visit counts to daily rates.
Phase I: Use linear models to model temporal patterns in order to obtain estimated pollution-health effect for each area.
Phase II: Use Bayesian hierarchical models to combine the estimated pollution-health effects across the 50 communities.
The Data
Study communities include 50 townships and city districts across the island
Include rural, urban and industrial areas
Population densities range from 250 to 28,000 persons/km2
The Data
Environmental variables
Daily average for NO2, SO2 and PM10
Daily maximum O3 and maximum 8-hour running average for CO
Daily maximum temperature and average dew point
The Data
Clinic Visits
Huge computerized clinic visit records contain clinic's ID, township names, date-of-visit, patient's ID, gender, birthday, cause-of-visit and others.
One-year records from the 50 study communities in 1998.
Clinic visits due to lower respiratory illness like acute bronchitis, acute bronchiolitis, and pneumonia are used as health effects.
Classify the population at risk into 3 age groups: children (0-14), adults (15-64) and elderly (65+).
Data Summary
Estimated population at risk ranged from 19,000 to 278,000.
The averages of daily average NO2, SO2, PM10, and CO levels were 23.6 ppb, 5.4 ppb, 58.9 μg/m³, and 1.0 ppm, and the average daily maximum O3 level was 54.2 ppb.
The average daily rate of clinic visits due to lower respiratory illness was 1.34 per 1000. The average rates were 2.39, 0.88 and 1.02 for the children, adults and elderly groups, respectively.
[Figure: area-level averages for the 50 study areas (area codes 01 to 69). Panels: NO2 (ppb), SO2 (ppb), PM10, O3 (ppb), CO (ppm), and daily clinic visit rates for children, adults, elderly, and all ages combined.]
Population at Risk
Define the population at risk for a selected community as those who would go to the clinics in the community whenever they need to make medical visits, that is, the service coverage of all the clinics in the community.
This includes some non-resident daytime workers who may visit clinics in the community, but excludes residents who prefer to use medical resources outside the community.
Population Estimation
Similar to estimating the number of unseen species in ecological studies, using only the numbers of individuals captured during a fixed interval of time.
Use clinic visits due to all diseases recorded in the study communities during 1998 to estimate the population at risk.
An individual making $x$ clinic visits in a community during one year is analogous to a species having $x$ members captured during one unit of time.
Population Estimation
For the species problem, the $x$ members are assumed unrelated, while one person's $x$ clinic visits are generally correlated.
The assumption may still be satisfied when we count only the first visit among consecutive visits with the same diagnosis in a short time period.
Let $n_x$ be the number of people having exactly $x$ clinic visits in a community during 1998.
$\sum_{x \ge 1} n_x$ is the total number of different people having made at least one clinic visit in that community in 1998.
The number of people who made no clinic visits in 1998 but would do so if they were later sick is $n_0$.
Assume that all people will eventually get sick and visit one of the clinics in this community in the coming years.
The expected number of $n_0$ is denoted by $\Delta(t)$ in the unseen species problem.
Efron and Thisted (Biometrika, 1976) proposed
$$\hat{\Delta}(t) = \sum_{x=1}^{x_0} h_x n_x$$
with $h_x = (-1)^{x+1} t^x \Pr(B \ge x)$, where $B$ is a binomial random variable.
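The weighted sum on this slide can be sketched in a few lines of Python. The sketch takes the visit-frequency counts $n_x$ and returns $\sum_{x=1}^{x_0} h_x n_x$ with $h_x = (-1)^{x+1} t^x \Pr(B \ge x)$. Two things are assumptions rather than content of the slides: $B$ is taken to be Binomial($x_0$, $1/(1+t)$), the form usually quoted for the Efron and Thisted estimator, and the function names and example counts are made up for illustration.

```python
from math import comb


def binomial_tail(n, p, x):
    """Return P(B >= x) for B ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


def estimate_unseen(counts, t, x0):
    """Estimate the number of unseen people, sum_{x=1}^{x0} h_x * n_x.

    counts maps x (number of visits) to n_x (people with exactly x visits).
    ASSUMPTION: B ~ Binomial(x0, 1 / (1 + t)); the slides do not state this.
    """
    p = 1.0 / (1.0 + t)
    total = 0.0
    for x in range(1, x0 + 1):
        h_x = (-1) ** (x + 1) * t**x * binomial_tail(x0, p, x)
        total += h_x * counts.get(x, 0)
    return total


def population_at_risk(counts, t, x0):
    """Observed distinct patients plus the estimated unseen ones."""
    observed = sum(counts.values())
    return observed + estimate_unseen(counts, t, x0)


# Hypothetical counts: 100 people with one visit, 40 people with two visits.
example_counts = {1: 100, 2: 40}
print(estimate_unseen(example_counts, t=1, x0=2))
print(population_at_risk(example_counts, t=1, x0=2))
```

The alternating signs in $h_x$ make the raw series unstable for large $t$; the binomial tail damps the higher-order terms, which is why the truncation point $x_0$ matters in practice.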
Population Estimation
Ideally, one should choose an appropriate $t$ value to obtain a less biased population estimate without excess uncertainty.
Our choice of $t = 5$ is based on the observations that
  Patients' medical-seeking behavior was stable under the NHI program
  There were limited changes in the demographics of the study communities in the past six years in Taiwan.
Population Estimation
Validity of the population estimator
  We estimated the number of people not recorded in the database of 1997 but who appeared in 1998.
  The mean absolute value of the relative difference between the estimated additional subjects, $\hat{\Delta}(1)$, and the actually observed new patients in 1998 was less than 2% across study communities.
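The validation summary on this slide is a mean absolute relative difference across communities. A minimal sketch follows, assuming the difference is taken relative to the observed number of new patients (the slides do not say which quantity is the denominator); the function name and the numbers are hypothetical.

```python
def mean_abs_relative_difference(estimated, observed):
    """Average of |estimated - observed| / observed over communities.

    ASSUMPTION: the observed count is the denominator.
    """
    if len(estimated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(e - o) / o for e, o in zip(estimated, observed)]
    return sum(ratios) / len(ratios)


# Hypothetical values for two communities.
print(mean_abs_relative_difference([102, 49], [100, 50]))
```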
Phase I modeling
Use the daily visit rate on the log scale, instead of the count, as the response variable.
Daily series of rates for each sub-population by area and age group are modeled separately.
Our models are general linear regressions with seasonal autoregressive moving average residual processes.
The regression terms/confounding variables were chosen through extensive exploratory data analyses.
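The Model:
$$\begin{aligned}
\log(y_{iat}) ={}& \gamma_{iah0} + \gamma_{iah1}\,\mathrm{SUN} + \gamma_{iah2}\,\mathrm{MON} + \gamma_{iah3}\,\mathrm{SAT} + \gamma_{iah4}\,\mathrm{SH} \\
&+ \gamma_{iah5}\,\mathrm{WIN} + \gamma_{iah6}\,\mathrm{SUM} + \gamma_{iah7}\,\mathrm{TG32}_{it} + \gamma_{iah8}\,\mathrm{TL32}_{it} \\
&+ \gamma_{iah9}\,\mathrm{TP3}_{it} + \gamma_{iah10}\,\mathrm{DEW}_{it} + \gamma_{iah11}\,\mathrm{DP3}_{it} \\
&+ \beta_{iah}\,\mathrm{POLL}_{i,t-h} + W_{iaht},
\end{aligned}$$
where $y_{iat}$ is the observed clinic visit rate of the $a$th age group in the $i$th community on the $t$th day.
$\mathrm{POLL}_{i,t-h}$ is the level of the pollutant on day $t-h$, where $t$ is the current day and $h$ ranges from 0 to 2.
$\beta_{iah}$ is the pollution coefficient. The error term $W_{iaht} \sim \mathrm{SARIMA}(1,0,0)\times(1,0,0)_7$.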
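For readers less familiar with the notation, a SARIMA(1,0,0)×(1,0,0) error process with period 7 combines a lag-1 autoregression with a weekly (lag-7) autoregression. Writing $L$ for the lag operator, and with $\phi$, $\Phi$ and $\varepsilon_t$ as generic symbols that do not appear on the slides (community, age and lag subscripts dropped), the process expands as:

```latex
\begin{align*}
(1 - \phi L)(1 - \Phi L^{7})\, W_t &= \varepsilon_t, \\
W_t &= \phi\, W_{t-1} + \Phi\, W_{t-7} - \phi\Phi\, W_{t-8} + \varepsilon_t .
\end{align*}
```

The error on a given day therefore depends on the previous day and on the same weekday one week earlier, which suits clinic data with a strong day-of-week pattern.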
Model Selection
The model was examined at several communities, with a mean R-squared of 0.53 in fitting the data of all the sub-populations.
Ideally, we would explore the data to find the best model for each combination of 5 air pollutants, 3 time lags, and 4 age categories in each of the 50 locations.
For efficiency, we apply this single regression model to all sub-populations in all 50 locations at this phase.
Health impact is measured as the percentage increase in clinic visit rates that corresponds to a 10% increase in local air pollution levels.
The percentage change is expressed by $100 \times \{\exp(0.1 \times X_{ih}\,\hat{\beta}_{iah}) - 1\}$, where $\hat{\beta}_{iah}$ is the estimated pollution coefficient for community $i$, age group $a$, and lag $h$, and $X_{ih}$ is the corresponding average pollution level.
The 95% confidence interval for the percentage change is constructed by replacing $\hat{\beta}_{iah}$ with $\hat{\beta}_{iah} \pm 2\hat{\sigma}_{iah}$, where $\hat{\sigma}_{iah}$ is the standard error.
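The percentage-change formula and its approximate 95% interval can be checked numerically with a short sketch. The function names and the coefficient, standard error and pollution level used below are hypothetical illustration values, not estimates from the study.

```python
import math


def percent_change(beta_hat, mean_pollution, fraction=0.1):
    """Percent increase in visit rate for a `fraction` rise in pollution.

    Implements 100 * (exp(fraction * mean_pollution * beta_hat) - 1).
    """
    return 100.0 * (math.exp(fraction * mean_pollution * beta_hat) - 1.0)


def percent_change_interval(beta_hat, std_error, mean_pollution):
    """Approximate 95% interval: evaluate at beta_hat -/+ 2 standard errors."""
    lower = percent_change(beta_hat - 2.0 * std_error, mean_pollution)
    upper = percent_change(beta_hat + 2.0 * std_error, mean_pollution)
    return lower, upper


# Hypothetical inputs: coefficient 0.01 per ppb, standard error 0.002,
# average pollutant level 23.6 ppb.
print(percent_change(0.01, 23.6))
print(percent_change_interval(0.01, 0.002, 23.6))
```

Because the exponential is monotone, transforming the endpoints of the interval for the coefficient gives a valid interval for the percentage change when the average pollution level is positive.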
Phase II modeling
The second phase of hierarchical modeling uses variables describing each community's characteristics and spatial dependency
  To modify the pollution coefficient estimate in each location,
  To obtain an overall pollution coefficient estimate across multiple locations.
Three stages:
First, the 50 estimated pollution coefficients for a single pollutant, a fixed age group and time lag, denoted as $\hat{\beta} = (\hat{\beta}_1, \ldots, \hat{\beta}_{50})'$, are assumed to be multivariate normal, that is
$$\hat{\beta} \sim N_{50}(\beta, V),$$
where $\beta = (\beta_1, \ldots, \beta_{50})'$ and $V = \mathrm{diag}(\hat{\sigma}_1^2, \ldots, \hat{\sigma}_{50}^2)$, and $\hat{\sigma}_i$ is the estimate of the standard error of $\hat{\beta}_i$.
Second, spatial variation among the 50 mean pollution coefficients is modeled as
$$\beta_i = \alpha_0 + \alpha_1 Z_{i1} + \cdots + \alpha_q Z_{iq} + \varepsilon_i, \qquad \mathrm{Cov}(\varepsilon_i, \varepsilon_j) = \sigma^2 \exp\{-d_{ij}/R\},$$
where $d_{ij}$ is the Euclidean distance between the air monitoring stations for communities $i$ and $j$, and $R$ is a range parameter.
Based on empirical correlograms for the 50 estimated pollution coefficients, the range parameter $R$ is fixed at 5 km.
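The exponential spatial covariance, a variance term multiplied by exp(-distance / range), can be sketched as a small matrix builder. The station coordinates, the variance value and the function name are hypothetical; only the 5 km range comes from the slides.

```python
import math


def spatial_covariance(coords_km, sigma2, range_km=5.0):
    """Build the matrix with entries sigma2 * exp(-d_ij / range_km).

    coords_km: list of (x, y) station positions in kilometres (hypothetical).
    d_ij is the Euclidean distance between stations i and j.
    """
    n = len(coords_km)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d_ij = math.dist(coords_km[i], coords_km[j])
            cov[i][j] = sigma2 * math.exp(-d_ij / range_km)
    return cov


# Two hypothetical stations 5 km apart.
for row in spatial_covariance([(0.0, 0.0), (3.0, 4.0)], sigma2=2.0):
    print(row)
```

With a 5 km range, the correlation between two communities falls to exp(-1), about 0.37, at 5 km and is close to zero beyond roughly 15 to 20 km, so only neighbouring areas borrow strength from each other.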
For the current study, we construct the regression terms
$$Z_i = [s_{\mathrm{PD}}, s_{\mathrm{T}}, s_{\mathrm{NO_2}}, s_{\mathrm{SO_2}}, s_{\mathrm{PM_{10}}}, s_{\mathrm{O_3}}, s_{\mathrm{CO}}]_i$$
The intercept $\alpha_0$ can be interpreted as an overall pollution coefficient for any location with mean predictors.
The other coefficients, $\alpha_1, \ldots, \alpha_7$, reflect the modification or adjustment of the local pollution coefficient ($\beta_i$).
Third, complete the hierarchical structure with a proper prior model for $\alpha$ and $\sigma^2$.
We use conjugate priors, $\alpha \sim N(\mu, C)$ and $\sigma^2 \sim IG(a, b)$.
The hyperparameters, $\mu, C, a, b$, in our model are chosen to reflect no information on $\alpha$ and $\sigma^2$.
The Bayesian inference is based on the posterior distribution of $\alpha$, $\sigma^2$, and $\beta$ given the Phase I estimates and the specified hyperparameters.
Samples from these posteriors can be obtained with an MCMC algorithm, for example using the BUGS software.
Results – Phase I
Variation in clinic visits was likely related to variation in NO2, CO, SO2 and PM10 exposures.
No significant effect for ozone exposures.
Significant association was seen at current day but less significant at 1-day lag.
Significant intra-community and inter-community variability in the estimated percentage changes of clinic visit rates.
[Figure: Phase I model for NO2 in all ages combined. % increase in clinic visit rate by area; panels Lag0 and Lag1.]
[Figure: Model for NO2 in all ages combined. % increase in clinic visit rate by area and overall; panels Lag0 and Lag1.]
Results – Phase II
The 95% posterior support intervals of the estimated overall pollution coefficient ($\alpha_0$) showed that clinic visits were related to NO2, CO, SO2 and PM10 exposures but not O3.
An individual community's pollution coefficient for NO2 was negatively adjusted by long-term PM10 and O3 exposure.
[Figure: Phase II coefficients by covariate (PD, Temp, NO2, SO2, PM10, O3, CO, Overall); one panel each for NO2, SO2, PM10 and CO.]
Modification of acute effect
The acute effect of SO2 was adjusted by
  (-) area's population density, PM10 and SO2
  (+) area's annual CO and O3 concentrations.
The acute effect of CO was adjusted by
  (-) area's population density, PM10 and O3.
The acute effect of PM10 was adjusted by
  (-) long-term exposure to PM10
  (+) long-term exposure to CO.
Modification of acute effect
In summary, an area's annual PM10 level is a major effect modifier.
  The short-term effects of air pollution on lower respiratory illness would be lower in areas with a large PM10 average.
Yearly averages of community's NO2 and SO2 levels, however, had no significant influence on the acute effects of the 5 pollutants in the Phase II models.
Main findings
NO2 had the greatest estimated percentage increases in daily clinic visit rates.
The pollution effects were always the greatest for current-day exposures and decreased significantly as exposure time lags increased.
The elderly were the most susceptible.
The short-term effects of air pollution on lower respiratory illness would be lower in areas with a large PM10 average.
[Figure: Percent changes in clinic visit rates for NO2, SO2, PM10, O3 and CO by age group (children, adults, elderly, all); panels Lag0 and Lag1.]
Discussion
Few epidemiologic studies have related clinic visits for minor illness to ambient air pollution.
Studies on minor health effects of air pollution should be encouraged, even though the major ongoing epidemiologic studies on air pollution are about mortality.
Discussion
From a scientific viewpoint, studies on minor health effects can strengthen consistency in the biological plausibility of the mortality effects of air pollution.
From a public health viewpoint, a minor health effect usually affects a large population and can lead to death in susceptible populations.
Discussion
Population at risk estimation is an important issue in environmental health studies.
High collinearity among air pollutants prevents us from using multi-pollutant models.
Discussion
Gaussian linear process for rates versus Poisson process for counts
  Linear predictors of these two models are the same except for one constant term, the population at risk on the log scale;
  A minor difference between these two models is the assumed variance structure;
  The Gaussian process provides us with flexible model selection, diagnostics and simplified computation.
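The comparison can be written out explicitly. In the sketch below, $c_{iat}$ (daily visit count), $N_{ia}$ (estimated population at risk), $\mu_{iat}$ (expected count) and $\eta_{iat}$ (the shared linear predictor) are notation introduced here for illustration and do not appear on the slides.

```latex
\begin{align*}
\text{Poisson model for counts:}\quad
  & c_{iat} \sim \mathrm{Poisson}(\mu_{iat}), \qquad
    \log \mu_{iat} = \log N_{ia} + \eta_{iat}, \qquad
    \mathrm{Var}(c_{iat}) = \mu_{iat}, \\
\text{Gaussian model for rates:}\quad
  & y_{iat} = c_{iat} / N_{ia}, \qquad
    \log y_{iat} = \eta_{iat} + W_{iat}.
\end{align*}
```

The offset $\log N_{ia}$ is the single constant by which the two linear predictors differ. The remaining difference is the variance: the Poisson model ties it to the mean, while the Gaussian model leaves it to the error process $W_{iat}$.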
Discussion
Joint tempo-spatial models can fit the multiple time series of rates data simultaneously.
  However, model selection and calculations are challenges.
Some other challenging issues of epidemiologic studies on air pollution
  Why did the exposure-response slopes for individual air pollutants vary significantly among different study sites?
  Were the pollution effects from a single pollutant or from mixtures of air pollutants?
  What was the relationship between chronic and acute exposure effects?
Related works on air pollution health effects
Proposed a subject-domain model for estimating schoolchildren's risks of illness absence (Hwang et al., 2000)
On emergency room visits for respiratory disease in Taipei (Hwang and Lin, 2001)
Mortality in association with ozone and particles (Chiang and Hwang, 2001)