60939094
TRANSCRIPT
-
8/13/2019 60939094
1/11
ORIGINAL ARTICLE
Vertebral fracture risk (VFR) score for fracture prediction
in postmenopausal women
M. Lillholm, A. Ghosh, P. C. Pettersen,
M. de Bruijne, E. B. Dam, M. A. Karsdal,
C. Christiansen, H. K. Genant, M. Nielsen
Received: 25 January 2010 / Accepted: 2 September 2010 / Published online: 11 November 2010
© International Osteoporosis Foundation and National Osteoporosis Foundation 2010
Abstract
Summary Early prognosis of osteoporosis risk is not only
important to individual patients but is also a key factor
when screening for osteoporosis drug trial populations. We
present an osteoporosis fracture risk score based on
vertebral heights. The score separated individuals who
sustained fractures (by follow-up after 6.3 years) from
healthy controls at baseline.
Introduction This case-control study was designed to
assess the ability of three novel fracture risk scoring
methods to predict first incident lumbar vertebral fractures
in postmenopausal women matched for classical risk factors
such as BMD, BMI, and age.
Methods This was a case-control study of 126 postmenopausal
women, 25 of whom sustained at least one incident
lumbar fracture and 101 controls who maintained skeletal
integrity over a 6.3-year period. Three methods for fracture
risk assessment were developed and tested. They are based
on anterior, middle, and posterior vertebral heights mea-
sured from vertebrae T12-L5 in lumbar radiographs at
baseline. Each score's fracture prediction potential was
investigated in two variants using (1) measurements from
the single most deformed vertebra or (2) average measure-
ments across vertebrae T12-L5. Emphasis was given to the
vertebral fracture risk (VFR) score.
Results All scoring methods demonstrated significant sepa-
ration of cases from controls at baseline. Specifically, for the
VFR score, cases and controls were significantly different
(0.67 ± 0.04 vs. 0.35 ± 0.03, p
Introduction
Postmenopausal osteoporosis remains a serious condition
affecting millions of individuals worldwide. Current epide-
miological evidence suggests that in industrialized
countries, approximately 40% of postmenopausal women
at the age of 60 and as many as 70% of women at the age of
80 suffer from osteoporosis [1]. Postmenopausal osteoporosis
is characterized by a reduction in bone mass due to
increased bone resorption and a simultaneous but less
pronounced increase in bone formation, resulting in
negative net calcium balance. This ongoing process fueled
by chronic estrogen deficiency may eventually lead to micro-
architectural osteoporosis, possible fractures, and substantial
deterioration in the quality of life. The cardinal feature of
osteoporosis is the occurrence of fragility fractures, typically
in the spine, but also in the forearm and hips. Whereas limb
fractures are easy to diagnose, the case is different for the
spine region, where mild vertebral fractures are often
asymptomatic. Though the mortality rate from osteoporotic
fractures is the highest for those of the hip, vertebral fractures
are the most common type of fragility fractures with an
estimated occurrence of 750,000 cases per year in the USA
[2]. Osteoporotic vertebral fractures typically occur earlier
and are an established risk factor for hip fractures [3].
Presence of severe vertebral fractures has been associated
with acute and chronic pain, impaired quality of life,
increased risk of osteoporotic limb fractures, and shortened
life expectancy [4]. There is, therefore, a continuing interest
in identifying independent predictors of vertebral fractures
that could facilitate the detection of high-risk patients, who
would benefit the most from early prevention.
Vertebral fractures are often diagnosed and graded by
experienced radiologists using qualitative [5] or semi-
quantitative methods such as those described by Genant et
al. [6] and others [7–11]. The methods were shown to be
robust to intra-observer variations but may be difficult to
apply uniformly across different clinical centers. More
importantly, it is yet to be decided which one of these semi-
quantitative methods should be used as a gold standard
[12]. In order to overcome some of these problems, fully
quantitative methods have been developed [13–23]. One of
the shortcomings of the discrete nature (due to the use of
thresholds) of most of these methods is the inability to
quantify subtle changes in the vertebral shape. Hence, a
more robust and detailed study of the vertebral/spine shape
abnormalities should produce (1) an objective quantifiable
measure for detection and severity-grading of fractures and
(2) details of pre-fracture vertebral-shape changes that lead
to better prediction of osteoporotic fractures.
The present study investigated whether computer-based
measures of fracture risk, calculated using vertebral pre-
fracture shape variations, could differentiate healthy sub-
jects who later sustain a vertebral fracture from those who
maintain vertebral integrity when matched for an array of
traditional risk factors, including bone mineral density
(BMD). The rationale behind such an investigation is that
the detection of pre-fracture conditions and successful
prediction of vertebral fragility fractures will help the study
of osteoporosis in the following important ways: (1) early
diagnosis and treatment for patients, (2) more precise
assessment of efficacy of fracture prevention drugs by the
identification of subjects with a high likelihood of
sustaining an osteoporotic fracture, and (3) decrease in the
required sample size for clinical studies by inclusion of
high-risk subjects. In this paper, we pursue parts of the
second and third objectives and present quantitative
analyses of pre-fracture vertebral-shape changes that yield
subject scores indicating first-incidence lumbar vertebral
fracture risk. It is demonstrated how this score can be used
to rank a screening population and drive the selection of a
net study population with increased average fracture risk.
Materials and methods
A case-control study was designed such that case-group
patients had no prevalent lumbar vertebra fractures and by
follow-up 6.3 years later had sustained at least one fracture
in the lumbar spine only. The control group maintained
skeletal integrity throughout and was matched with respect
to an array of traditional risk factors.
The study population was chosen from the PERF cohort
[24] which consisted of 4,062 community-recruited post-
menopausal Danish women first screened between 1992 and
1995 and subsequently reviewed between 2000 and 2001.
PERF contained 662 patients with at least one new vertebral
fracture at follow-up; 88 of the 662 had no prevalent
fractures at baseline and of those, 25 suffered incident
fractures in the lumbar region (T12 to L5) only and were
selected as case patients. The case group was matched by a
fracture-free control group of 101 subjects, also from the PERF
cohort, with comparable risk factors such as age, body mass
index (BMI), family history of osteoporosis, alcohol and
milk consumption, history of hormone replacement therapy,
spine BMD, smoking habits, and self-reported physical
exercise. Any patients with non-osteoporotic vertebral
deformities or non-osteoporotic fractures were excluded
before case and control groups were selected.
At baseline, none of the 126 subjects displayed any sign of
disorders of calcium metabolism or bone disease, or took any
medication known to affect bone metabolism. All subjects
were interviewed to obtain information on their medical
history, use of medication, and other life style factors. Subjects
underwent a complete physical examination; weight and
height were determined without shoes and with the subjects
2120 Osteoporos Int (2011) 22:2119–2128
wearing light indoor clothes, and the BMI was calculated.
BMD of the spine was determined by bone densitometry
using a Lunar Prodigy scanner and lateral radiographs of the
lumbar spine were taken of the patients using a standard
technique [24]. Written consent was obtained from each
participant according to the Helsinki Declaration II. The
study was approved by the local ethics committee.
Spine radiographs acquisition and fracture assessment
Spinal examinations were performed according to pre-
approved protocols. Radiographs of the lumbar region were
taken for each of the subjects at baseline and follow-up. In the
lateral position, pillows were used to ensure good alignment
of the vertebral bodies. The distance between the focal plane
and the film was kept constant at 1.2 m and the central beam
was directed to L2. Patients were asked to hold in their
expiration for the duration of the radiograph acquisition. The
same group of staff examined each of the subjects.
Anterior-posterior (AP) radiographs were used for a general view and
assessment of vertebral deformities (i.e., scoliosis). Obvious
vertebral fractures in AP radiographs were noted but the
primary fracture assessment was performed on lateral radio-
graphs. Fracture assessment and classification from the
original PERF study [24] was re-evaluated and confirmed
using Genant's semi-quantitative method [6] by PP who was
trained in and had several years of experience using the
method. Baseline and follow-up radiographs were viewed
simultaneously to avoid confusing fracture incidence with
undetected or borderline prevalence. The primary utility of
the fracture assessment was to establish baseline and follow-
up fracture presence. These readings were used to identify
the case (n=36) and control (n=108) groups in the paper by
Pettersen [25] in which the fracture risk predictive power of
a computer-based measure of curvature irregularity was
investigated. The case and control groups in this study were
specifically selected as subsets of Pettersen's case and
control group as described below.
Digitization of radiographs and six-point annotations
All lateral radiographs were digitized at 45 μm (570
DPI). For further analysis of the images, six points
(called the height points) were placed at the corners and
at the middle points of the vertebral endplates, by the
same radiologist using a computer program; see Fig. 1a.
Using the heights measured from these points, all vertebrae were
evaluated by a computer algorithm using a modification of
Genant's methodology with a strict measured threshold of
0.2 as fracture absence/presence indicator; that is, a
vertebra was considered fractured if any of the ratios
between the anterior, middle, and posterior heights
was 0.8 or less.
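
The threshold rule above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, heights are assumed to be in any consistent unit, and the smallest pairwise ratio is taken as the minimum height divided by the maximum height.

```python
def min_height_ratio(h_ant, h_mid, h_post):
    """Smallest ratio between any two of the three vertebral heights."""
    heights = (h_ant, h_mid, h_post)
    # The smallest pairwise ratio is always min height / max height.
    return min(heights) / max(heights)


def is_fractured(h_ant, h_mid, h_post, threshold=0.2):
    """Flag a vertebra whose height ratio is (1 - threshold) or less.

    threshold=0.2 mirrors the strict cut-off quoted in the text;
    the function name and signature are assumptions for illustration.
    """
    return min_height_ratio(h_ant, h_mid, h_post) <= 1.0 - threshold
```

For example, hypothetical heights of 30, 23, and 30 mm give a ratio of about 0.77 and would be flagged, whereas 30, 28, and 29 mm would not.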
Subpopulation and borderline fractures
Pettersen's [25] population was reduced by selecting only
subjects where the quantitative fracture classification was in
agreement with the expert's fracture assessment. This
procedure reduced the case and control groups described
above to 25 cases and 101 controls, respectively. The
additional step deliberately filtered out subjects where there
was borderline disagreement between the SQ- and QM-based
fracture assessments. This ensured that the
fracture prediction results reported in this paper were not
influenced by such disagreements.
Fig. 1 a An example of a lateral lumbar radiograph with six-point
annotations in red and vertebra labels (T12 to L5) in black. The
midpoints are always marked on the lower of the endplate contours.
b The shape of vertebrae for various values of Hmax, Hmin compared
to Genant's height ratio. For illustration purposes, the smallest
height Hmin is placed to the left and the largest height Hmax is
placed to the right in each vertebra. Each vertebra has the
corresponding height ratio noted. The blue area indicates the
high-risk patients according to VFR. The green area indicates the
high-risk patients according to Genant's height ratio and, as
depicted, different patients may be indicated as high-risk patients
Computer-based prediction of vertebral fractures
Three quantitative scores based on vertebral height mor-
phometry were developed to predict first incident lumbar
fractures. The scores were named the vertebral fracture risk
(VFR), the most deformed height ratio (MDHR), and the
most deformed height anterior height posterior ratio
(MDHaHp). Each of the three scores was tested in two
versions: (1) only the most deformed vertebra determined
the score and (2) the average over vertebrae T12-L5
determined the score.
In the following, we initially describe the VFR score on
the most deformed vertebra in detail. The remaining two
scores and the average versions of all scores use the same
overall methodology as the VFR score and only deviations
from this are presented.
Computation of the VFR score consisted of two steps:
(1) selection of the most deformed vertebra and (2) scoring
of this vertebra to form a single score for the patient:
Step 1
Selection of the most deformed vertebra: For a given
patient, the vertebral height ratios of the smallest to the
largest anterior, middle, or posterior heights were computed
for all vertebrae. The vertebra among T12 to L5 with
minimal height ratio was denoted the most deformed and
chosen to represent each patient. Due to the patient
selection described above, all ratios were above 0.8.
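
Step 1 can be illustrated with a short sketch. The function name, the dictionary layout, and the example heights are hypothetical and chosen only for illustration; the height ratio is taken as the smallest of the three heights divided by the largest, as described above.

```python
def most_deformed_vertebra(heights_by_vertebra):
    """Return (label, ratio) for the vertebra with the smallest height ratio.

    heights_by_vertebra maps a label such as "L2" to a tuple of
    (anterior, middle, posterior) heights in a consistent unit.
    """
    ratios = {
        label: min(h) / max(h) for label, h in heights_by_vertebra.items()
    }
    label = min(ratios, key=ratios.get)
    return label, ratios[label]


# Hypothetical heights in mm for one subject (not data from the study).
example_patient = {
    "T12": (25.0, 24.0, 26.0),
    "L1": (26.0, 25.5, 26.5),
    "L2": (27.0, 23.5, 27.5),
    "L3": (28.0, 27.0, 28.5),
    "L4": (28.0, 27.5, 29.0),
    "L5": (27.0, 26.0, 28.0),
}
label, ratio = most_deformed_vertebra(example_patient)
```

In this made-up example L2 is selected, and its ratio stays above 0.8, consistent with a subject who is fracture free at baseline.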
Step 2
VFR scoring: Each selected vertebra was represented by the
maximal Hmax and minimal Hmin of its three vertebral
heights. The selected heights were normalized by the mean
of the vertebral heights in question:
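\[
H_{\max} = \frac{\max\{H_{\mathrm{ant}}, H_{\mathrm{med}}, H_{\mathrm{post}}\}}{\operatorname{mean}\{H_{\mathrm{ant}}, H_{\mathrm{med}}, H_{\mathrm{post}}\}}, \qquad
H_{\min} = \frac{\min\{H_{\mathrm{ant}}, H_{\mathrm{med}}, H_{\mathrm{post}}\}}{\operatorname{mean}\{H_{\mathrm{ant}}, H_{\mathrm{med}}, H_{\mathrm{post}}\}}
\]
Consequently, Hmax ≥ 1.0 and Hmin ≤ 1.0 for any vertebra.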
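
As a small worked sketch of this normalization (the function name is hypothetical), dividing the largest and smallest of the three heights by their mean gives the pair used to represent a vertebra:

```python
def normalized_height_pair(h_ant, h_med, h_post):
    """Return (Hmax, Hmin): the largest and smallest heights over their mean."""
    heights = (h_ant, h_med, h_post)
    mean_height = sum(heights) / 3.0
    return max(heights) / mean_height, min(heights) / mean_height


# Hypothetical heights in mm: mean is 26, so Hmax = 27/26 and Hmin = 24/26.
h_max, h_min = normalized_height_pair(27.0, 24.0, 27.0)
```

Because the mean always lies between the smallest and largest height, the first value can never fall below 1 and the second can never exceed 1; their quotient Hmin/Hmax recovers the ordinary height ratio.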
The relationship between Hmax, Hmin, and height ratios as
used in, for example, Genant's method is illustrated in
Fig. 1b. Each patient in the case and control groups was
represented by their normalized height pair. Vertebral height
pairs in each group were assumed to be normally
distributed, that is, following a bivariate normal distribution
in (Hmax, Hmin). The empirical mean and covariance
matrix were estimated as standard maximum likelihood
estimates for each of the case and control groups and the
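likelihood P of belonging to each group expressed as:
\[
P_{\mathrm{case}}(H_{\max}, H_{\min}) = \mathcal{N}(\mu_{\mathrm{case}}, \Sigma_{\mathrm{case}}), \qquad
P_{\mathrm{control}}(H_{\max}, H_{\min}) = \mathcal{N}(\mu_{\mathrm{control}}, \Sigma_{\mathrm{control}})
\]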
For new vertebrae, the relative likelihood:
\[
\frac{P_{\mathrm{case}}(H_{\max}, H_{\min})}{P_{\mathrm{case}}(H_{\max}, H_{\min}) + P_{\mathrm{control}}(H_{\max}, H_{\min})}
\]
of belonging to the case group was computed from the
estimated normal distributions. This relative likelihood ratio
was defined to be the VFR score. It is, as constructed, a
number between 0 and 1 representing the probability of
sustaining a fracture. It should be expected that a patient
with a clear prevalent fracture will have a VFR score close
to 1; namely that the chance of sustaining a future fracture
is trivially very high unless the already fractured vertebra is
excluded from the score calculation. The VFR for all
patients was computed in a leave-one-out procedure to
avoid bias and underestimation of the variance. The explicit
modeling of shape variations through bivariate normal
distributions as opposed to single thresholds on, say, height
ratios allowed for a fuller representation of both normal and
fracture-prone shape variation.
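
The scoring step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Matlab code: all function names are hypothetical, the covariance is the maximum likelihood estimate (dividing by n), no group priors are used, and the example height pairs are invented for demonstration only.

```python
import math


def fit_bivariate_normal(points):
    """Maximum likelihood mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)


def bivariate_normal_pdf(point, mean, cov):
    """Density of a bivariate normal with a 2x2 covariance at `point`."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form d^T Sigma^{-1} d, written out for the 2x2 case.
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def vfr_score(point, case_points, control_points):
    """Relative likelihood P_case / (P_case + P_control) for one height pair."""
    p_case = bivariate_normal_pdf(point, *fit_bivariate_normal(case_points))
    p_control = bivariate_normal_pdf(point, *fit_bivariate_normal(control_points))
    return p_case / (p_case + p_control)


def leave_one_out_vfr(case_points, control_points):
    """Score every subject with its own height pair excluded from training."""
    scores = []
    for i, p in enumerate(case_points):
        rest = case_points[:i] + case_points[i + 1:]
        scores.append(vfr_score(p, rest, control_points))
    for j, p in enumerate(control_points):
        rest = control_points[:j] + control_points[j + 1:]
        scores.append(vfr_score(p, case_points, rest))
    return scores


# Invented (Hmax, Hmin) pairs for illustration only.
example_cases = [(1.08, 0.90), (1.10, 0.88), (1.07, 0.91),
                 (1.09, 0.87), (1.11, 0.90)]
example_controls = [(1.03, 0.96), (1.02, 0.97), (1.04, 0.95),
                    (1.03, 0.98), (1.05, 0.96)]
example_scores = leave_one_out_vfr(example_cases, example_controls)
```

The sketch assumes each group has enough non-collinear points for the covariance determinant to be positive; a practical implementation would need to guard against degenerate fits and numerical underflow.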
The remaining two scores are calculated using the same
overall methodology as the VFR score with the following
exceptions: For MDHR, a vertebra was represented by the
minimal height ratio and the most deformed representative
selected as described above. For the MDHaHp ratio, the
minimal height ratio was calculated based on the anterior and
posterior heights only but was otherwise identical to MDHR.
This means that both the MDHR and MDHaHp scores have
one-dimensional representations (ratios) and the fitted normal
distributions were thus univariate. Furthermore, the explicit
height normalization step from the VFR score was not needed
due to the implicit normalization through ratios.
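
A possible reading of the two ratio-based scores is sketched below. The function names are hypothetical, the anterior-posterior ratio is assumed to be the smaller of the two heights divided by the larger, and the univariate fit uses the maximum likelihood variance; none of these details are confirmed by the text beyond the description above.

```python
import math


def mdhr(vertebrae):
    """Smallest min/max ratio over (anterior, middle, posterior) heights."""
    return min(min(h) / max(h) for h in vertebrae)


def mdhahp(vertebrae):
    """Smallest ratio using the anterior and posterior heights only."""
    return min(min(h[0], h[2]) / max(h[0], h[2]) for h in vertebrae)


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def relative_likelihood(x, case_values, control_values):
    """P_case / (P_case + P_control) from univariate normal fits."""
    def fit(values):
        m = sum(values) / len(values)
        return m, sum((v - m) ** 2 for v in values) / len(values)

    p_case = normal_pdf(x, *fit(case_values))
    p_control = normal_pdf(x, *fit(control_values))
    return p_case / (p_case + p_control)
```

Because a ratio is already dimensionless, no separate normalization by the mean height is needed here, which matches the remark about implicit normalization.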
All three scores were also tested in mean versions
(MVFR, MHR, and MHaHp) where a patient was repre-
sented as the mean of height pairs or ratios over vertebrae
T12-L5 instead of the single most deformed vertebra as
described above.
We compared these morphometric prognostic markers
based on analysis of the individual vertebrae with the
irregularity measure based on spine curvature suggested for
fracture prediction by Pettersen [25]. Any comparisons with
Pettersen's method are reported on the reduced dataset
presented in this paper for both our method and Pettersen's.
Method for high-risk clinical study screening
The high-risk population selection-mechanism outlined
below was aimed at maximizing the number of subjects
most likely to sustain first incident lumbar vertebral
fractures in a trial population selected from a larger
screening population. Population selection is only described
for the VFR score but could also be realized using any of
the other five proposed scores.
The subpopulation selection-mechanism consisted of
three steps: (1) scoring all subjects from the gross
population using VFR, (2) sorting them in descending VFR
order, and (3) selecting the required number of subjects;
here we selected the 50% with the highest VFR score.
The selection was evaluated on the same study popula-
tion as the scoring methods outlined above.
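
The three selection steps can be sketched as below. Function names and the toy scores are hypothetical; ties and odd population sizes are handled in the simplest way (stable sort, rounding down), which the text does not specify.

```python
def select_high_risk(scores, fraction=0.5):
    """Return indices of the top `fraction` of subjects by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[: int(len(scores) * fraction)]


def fraction_of_cases_selected(selected, is_case):
    """Share of all case subjects that ended up in the selected group."""
    n_cases = sum(1 for flag in is_case if flag)
    return sum(1 for i in selected if is_case[i]) / n_cases


# Toy screening population of six subjects (not study data).
toy_scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]
toy_is_case = [True, False, True, False, False, False]
toy_selected = select_high_risk(toy_scores)
```

With these toy values the top half consists of subjects 0, 4, and 2, so both toy cases are retained in the selected subpopulation.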
Statistical analysis
Results are presented as mean ± SEM unless otherwise
specified. The scores of different groups of subjects were
compared using the non-parametric Mann-Whitney U test.
Differences were considered statistically significant if p
values were less than 0.05.
The ability to separate cases from controls was further
characterized through the area under the ROC curve
(AUC). Significance of differences between AUC was
tested with DeLong's method [26]. Odds ratios (ORs) and
95% confidence intervals between highest and lowest
tertiles are reported for each method and the significance
of differences between ORs was tested using Tarone's
variant of the Breslow-Day test [27].
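
The AUC used here is closely tied to the Mann-Whitney U statistic: it equals U divided by the product of the two group sizes, that is, the fraction of case-control pairs in which the case has the higher score (ties counted as one half). A minimal sketch with a hypothetical function name:

```python
def auc_from_scores(case_scores, control_scores):
    """Area under the ROC curve as the normalized Mann-Whitney U statistic."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties contribute half a win
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to no separation between groups and 1.0 to perfect separation; significance testing and DeLong's comparison are not covered by this sketch.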
As a test of the BMD influence on the predictive value
of the VFR score, logistic regression with the VFR score
and BMD as independent variables was used.
The subpopulation selection result is reported as the
relative number of cases above the median of the VFR
scores across both the case and control datasets.
To assess the inter-annotator stability of the suggested
VFR score, the 126 baseline radiographs were six-point
annotated (T12-L5) an additional two times by two
experienced X-ray technicians. We report mean ± SEM,
AUC, and ORs including significance levels for the two
repeat annotations where the VFR score was trained on
the original annotation. To assess the inter-observer
stability of the suggested high-risk population selection
methodology, we further report the percentage of cases in
the top half of the VFR ordered dataset for each repeat
annotation.
We report an overall simulated estimate of expected
performance for repeat annotations through an estimate of
the annotation scatter observed across the three annotators.
The repeat annotations yielded a total of 126 × 6 × 6 ≈ 4,500
vertebrae points annotated by three trained observers.
These data were used to estimate the mean and standard
deviation for inter-observer annotation variability for x and
y annotation coordinates separately. The original full dataset
was subsequently perturbed with normally distributed
variations in the x and y directions according to the
estimated means and standard deviations. The main
experiment described above was repeated using the
perturbed dataset. This random perturbation and subsequent
experimentation was repeated 50 times. We report the
median AUC and 95% confidence intervals and the median
percentage of cases in the top half over the 50 perturbation
trials.
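
The perturbation experiment can be outlined as follows. This is a sketch under assumptions: the function names are hypothetical, `evaluate` stands in for the whole scoring pipeline (for example, recomputing heights, VFR scores, and the AUC), and the noise is drawn independently for every coordinate.

```python
import random
import statistics


def perturb_points(points, mean_xy, sd_xy, rng):
    """Add normally distributed offsets to the x and y coordinates."""
    return [
        (x + rng.gauss(mean_xy[0], sd_xy[0]), y + rng.gauss(mean_xy[1], sd_xy[1]))
        for x, y in points
    ]


def simulate_perturbations(points, evaluate, mean_xy, sd_xy, n_trials=50, seed=0):
    """Repeat the perturbation and evaluation; return the median and all results."""
    rng = random.Random(seed)  # fixed seed so the simulation is repeatable
    results = sorted(
        evaluate(perturb_points(points, mean_xy, sd_xy, rng))
        for _ in range(n_trials)
    )
    return statistics.median(results), results
```

An empirical interval could then be read from the sorted results; how the reported 95% confidence intervals were actually derived is not stated in the text.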
The percentage of cases where the most deformed vertebra
at baseline is one of the fractured vertebrae at follow-up is
reported. This number is supplemented by the percentage of
cases where this would be observed by chance.
Finally, we report results of the main experiment on the
full Pettersen [25] dataset and compare to results achieved
on the reduced dataset (see page 7) used throughout.
All data were analyzed using Matlab (Mathworks, USA).
Results
Study population
The skeletal and demographic characteristics of the case
and control groups are presented in Table 1 where the main
statistics reported for each group is the median value;
mean ± SD is given in parentheses for completeness. Based
on BMD measurements both the case and control groups
contained approximately half normal (non-osteoporotic)
and half osteopenic patients at baseline; furthermore both
groups contained two osteoporotic patients at baseline.
Fracture prediction
The three suggested morphometric fracture prediction scores
based on the single most deformed vertebra (VFR, MDHR,
and MDHaHp) showed significant differences between case
and control patients at baseline: VFR (0.67 ± 0.04 vs. 0.35 ±
0.03; p
Neither MDHR nor MDHaHp was significantly better
than any of the mean variants.
The highest versus lowest tertile ORs with 95%
confidence intervals for the three suggested methods (both
most deformed and mean variants), Pettersen's irregularity
measure, and BMD are given in Fig. 4. The VFR OR was
significantly more predictive than Irregularity or BMD
alone (p=0.03 and p =0.004).
Additional results are given for the VFR score only:
Figure 5 is a box and whisker plot of the VFR scores for the
case and control group at baseline. There was a significant
difference between baseline and follow-up VFR scores for
both the case and the control groups: case (0.67 ± 0.04 vs.
0.99 ± 0.01; p
Across the 50 datasets perturbed by typical inter-annotator
variation, the median AUC with 95% confidence
intervals was 0.73 (0.61–0.82) and the median
percentage of cases in the top half of the total population
was 76%.
The most deformed vertebra at baseline was one of the
fractured vertebrae at follow-up for 32% of the case
subjects. With an average of 1.2 fractures (Table 1) per
case subject, this would happen for 21% of cases by
chance.
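
The chance level can be approximated with a simple calculation, assuming that a vertebra picked uniformly at random among the six annotated vertebrae (T12 to L5) is compared against the fractured ones. This is an assumed model for illustration and may differ from how the reported figure was derived.

```python
def chance_hit_probability(n_fractured, n_vertebrae=6):
    """Probability that one randomly chosen vertebra is among the fractured ones."""
    return n_fractured / n_vertebrae


# Using the average of 1.2 fractures per case subject quoted in the text.
approx_chance = chance_hit_probability(1.2)
```

Under this assumption the result is about 20%, close to the reported 21%; the small difference could come from rounding of the average or from averaging over per-subject fracture counts, which are not given here.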
The AUC and ORs for the main experiment repeated on
the full Pettersen dataset [25] were: AUC, 0.84 (0.77–0.90);
p
forces. In the lumbar spine, the convexity is produced
partly by differences in growth of the anterior-posterior
parts of the vertebrae [28]. This leads to slight differences
in the heights usually confined to an acceptable range.
Subsequent changes in vertebral heights can lead to
changes in the spinal curvature and to a redistribution of
forces upon the vertebrae endplates. A vertebra is expected
to fracture when the loads imposed are similar to or greater
than its strength [29]. From this we expect that an
unfractured lumbar vertebra that presents an abnormal change
of one or more of three vertebral heights, due to, for
example, osteoporosis, is more likely to fracture or cause
fractures than the one which keeps within the normal range
of shape variations.
Inspired by this, three morphometric computer-based
scores, each in two variants, were trained to predict lumbar
fractures in postmenopausal women through modeled
variability of measured anterior, middle, and posterior
vertebral heights in lumbar vertebrae. They were applied
in a case-control study matched for BMD and the other
major risk factors for osteoporosis. All three methods based
on the most deformed vertebra were able to significantly
differentiate subjects who would sustain at least one lumbar
vertebral fracture from those who maintained vertebral
integrity over a 6.3-year period. Furthermore, the results
suggest that the variants based on the most deformed
vertebra produced better fracture predictions than the
variants based on means across vertebrae T12-L5. Specif-
ically, the most deformed VFR variant yields significantly
better prediction than the MVFR and the MHR and a
DeLong p value of 0.1 for MHaHp. Conversely, neither of
the other two scores based on the most deformed vertebra
was significantly better than VFR or any of the mean
variants. This suggests that the VFR score, which is based
on two normalized height measurements and thus 2 df,
measures more diverse shape variations (Fig. 1b) than any
of the suggested 1 df (ratio) scores. Furthermore,
morphometry-based lumbar fracture prediction using the
single most deformed vertebra is stronger than prediction
based on an average or summary across the lumbar spine.
Finally, VFR delivers significantly better fracture prediction
than the curvature-based irregularity measure suggested by
Pettersen [25]. Although L5 was included in this analysis
to make a direct comparison with Pettersen's work possible,
the suggested methods (including VFR) are directly
applicable to datasets where only T12-L4 are annotated in
lumbar radiographs. Experiments (not included) on the
current dataset indicated a slight but not statistically
significant performance drop if only T12-L4 were included.
This is supported by the relatively larger morphological and
annotation induced variation of L5; in the context of this
work most likely too large to consistently add significant
information.
The VFR score showed an increased risk for sustaining
a second fracture in the case-group through a significant
difference between cases at baseline and follow-up. This is
in accordance with the literature [30, 31] and our expectations
in designing the VFR score. There was also a
significant difference for control subjects measured at
baseline and follow-up pointing to an overall increase in
risk for an incident fracture. Although the control group
remained fracture free, this increase in risk is not
unexpected over a 6.3-year observation period of
postmenopausal women where half were osteopenic at
baseline.
Our findings on selection of high-risk subpopulations
were that a clear majority of the case-patients was found in
the top half of the combined cases and controls group
sorted by VFR. This suggests VFR as a viable supplement
to BMD and standard risk factors to select fracture free but
likely to fracture subjects from a general screening
population.
The discriminative performances of the two repeat
annotations were not as high as the original annotation
but still significant and well within the confidence intervals
of simulated performance using estimated annotator scatter.
The same was the case for high-risk subpopulation
selection performance for both the individual repeat
annotations and the simulation. These performance drops
are not unexpected as the repeat sets were scored using the
original set as training/reference data and not in a leave-
one-out fashion and are in this sense less biased. Further-
more, the two repeat sets were annotated by trained X-ray
technicians and not radiologists as was the case for the
original set. Indirectly, this is reflected by the observed
larger SEMs of VFR scores within the case group for the
repeat annotators compared with the original annotation.
That fracture prediction of the suggested kind is more
sensitive to annotation quality and variation than, say,
standard SQ or QM-based fracture classification is, as
mentioned, not surprising. The discriminative factors of
VFR are subtle pre-fracture shape changes that are smaller
than standard implicit or explicit SQ and QM thresholds
[6]. Based on this, we emphasize that high quality,
consistent expert annotations are important in achieving
good separation. This is especially the case if the methods
were applied to less matched populations as in, say, clinical
trial screening. Here, one would need to train the algorithm
on a similar (same inclusion criteria) but independent
reference population prior to screening of novel subjects.
In that scenario, we would not expect discriminative
performance significantly above the reported median
simulated performance.
The vertebrae that fractured were also the most deformed
vertebrae for 32% of the cases which is somewhat higher
than was expected by chance alone (21%). This suggests
that a high VFR score indicates an overall lumbar fracture
risk in terms of, e.g., an uneven biomechanical load
distribution or overall systemic effects indirectly signaled
by the most deformed vertebra.
Finally, the validation of the experiment on the full
population of Pettersen [25] confirmed that the
population reduction performed to avoid predictive performance based
on borderline disagreement between SQ and QM did not
lead to improved results, as expected.
Limitations of the study
The subjects used in this study were all community-
recruited postmenopausal women roughly equally split
between normal and osteopenic. The case group was
pre-selected from a gross population of 4,062 as fracture
free at baseline and with at least one lumbar only
fracture at follow-up. The control group was matched
with respect to traditional osteoporosis risk factors and
maintained skeletal integrity throughout. Any patients
with non-osteoporotic deformities and/or fractures were
excluded whether these were real or caused by projection
errors. The main results of this paper, although
promising, must be validated in future studies to
establish applicability on populations with fewer restric-
tions than described above.
This study focused on fracture risk in the lumbar region
(including T12). This was primarily done to facilitate easy
comparison with the related results in Pettersen's paper
[25] but also in recognition of the fact that pre-fracture
shape changes are more pronounced in the relatively larger
lumbar vertebrae. A natural generalization in future studies
will be to test if pre-fracture shape changes in thoracic
vertebrae also have fracture-predictive value. Furthermore,
thoracic and lumbar vertebrae are very likely to give a
better combined prediction of both lumbar and overall
fracture risk. It is our expectation that a single linear model
as presented in this paper would be insufficient to model
the range of combined lumbar and thoracic morphological
variance. We, however, believe that two such models
trained on thoracic and lumbar vertebrae respectively, in
combination would be sufficient.
For the purposes of the pre-selection and the study in
general, fracture absence and presence was established
using Genant's semi-quantitative method [6] by both a
radiologist and subsequently by a computer algorithm using
manually measured vertebral heights with a strict fracture
threshold of 0.2. In future studies, it would be relevant to
establish how robust the proposed fracture risk scores are
with respect to changing the ground truth fracture
assessment methodology to other established methods as
suggested by, e.g., Eastell, Melton, Minne, and McCloskey
[13, 18, 19, 23].
Conclusion
We have presented three morphometric scores each in
two variants. In a case-control study based on
community-recruited postmenopausal women with the
limitations iterated above, all six variants were able to
predict first incident lumbar fractures. In general,
variants based on only measuring the single most
deformed vertebra showed more promise than the mean
variants. More specifically, the 2 df VFR score is
significantly better than two other mean-based variants
but not significantly better than the other two suggested
single vertebra morphometric methods. Subject to the
limitations iterated above and the availability of high-
quality radiologist six-point annotations, we conclude that
relative vertebral heights or height ratios measured on the
single most deformed vertebra used in combination with
machine learning techniques appears a promising ap-
proach for (1) first incident lumbar fracture prediction
and (2) selection of fracture-prone populations for clinicaltrials investigating treatment and prevention of osteopo-
rosis and osteoporotic fractures.
It is, however, also clear that VFR-based fracture
prediction is highly operator/annotator dependent; this and
a generalization to both lumbar and thoracic fracture
prediction will be the focus of future studies.
Acknowledgements The authors gratefully acknowledge the funding
from the Danish Research Foundation (Den Danske Forskningsfond)
supporting this work. The authors thank Jane Petersen and
Annette Olesen for the repeat annotations.
Conflicts of interest The VFR-methodology is part of a pending
patent. Martin Lillholm is an employee of Synarc Imaging Technologies/
Nordic Bioscience Imaging (SIT/NBI). Anarta Ghosh is a former
employee of SIT/NBI. Paola C Pettersen is an employee of Center for
Clinical and Basic Research (CCBR). Erik B Dam is an employee of
SIT/NBI. Morten A Karsdal is an employee and shareholder of Nordic
Bioscience (NB). Claus Christiansen is an employee and shareholder of
NB and CCBR. Harry K Genant is an employee and shareholder of
Synarc. Mads Nielsen is partly funded by SIT/NBI. Marleen de Bruijne
was previously funded by Nordic Bioscience.
References
1. Rodan GA, Martin TJ (2000) Therapeutic approaches to bone
diseases. Science 289:1508
2. Watts NB (2001) Osteoporotic vertebral fractures. Neurosurg
Focus E12:10
3. Kanis JA, Borgstrom F, De Laet C, Johansson H, Johnell O,
Jonsson B, Oden A, Zethraeus N, Pfleger B, Khaltaev N (2005)
Assessment of fracture risk. Osteoporos Int 16:581–589
4. Truumees E (2003) Medical consequences of osteoporotic
vertebral compression fractures. Instr Course Lect 52:551–558
5. Jiang G, Eastell R, Barrington NA, Ferrar L (2004) Comparison of
methods for the visual identification of prevalent vertebral fracture
in osteoporosis. Osteoporos Int 17(11):887–896
6. Genant HK, Wu CY, van Kuijk C, Nevitt MC (1993) Vertebral
fracture assessment using a semiquantitative technique. J Bone
Miner Res 8:1137–1148
7. Ferrar L, Jiang G, Adams J, Eastell R (2005) Identification of
vertebral fractures: an update. Osteoporos Int 16:717–728
8. Grados F, Roux C, de Vernejoul MC, Utard G, Sebert JL,
Fardellone P (2001) Comparison of four morphometric definitions
and a semiquantitative consensus reading for assessing prevalent
vertebral fractures. Osteoporos Int 12:716–722
9. O'Neill TW, Felsenberg D, Varlow J (2004) Diagnosis of
osteoporotic vertebral fractures: importance of recognition and
description by radiologists. Am J Roentgenol 183:949–958
10. Stone J, Gurrin LC, Byrnes GB, Schroen CJ, Treloar SA, Padilla
EJ, Dite GS, Southey MC, Hayes VM, Hopper JL (2007)
Mammographic density and candidate gene variants: a twins and
sisters study. Cancer Epidemiol Biomark Prev 16:1479–1484
11. Nielsen VAH, Pødenphant J, Martens S, Gotfredsen A, Riis BJ
(1991) Precision in assessment of osteoporosis from spine
radiographs. Eur J Radiol 13:11–14
12. Delmas PD, Langerijt L, Watts NB, Eastell R, Genant H, Grauer
A, Cahall DL (2005) Underdiagnosis of vertebral fractures is a
worldwide problem: the IMPACT study. J Bone Miner Res
20:557–563
13. Eastell R, Cedel SL, Wahner HW, Riggs BL, Melton LJ (1991)
Classification of vertebral fractures. J Bone Miner Res 6:207–215
14. Black DM, Palermo L, Nevitt MC, Genant HK, Epstein R, San
Valentin R, Cummings SR (1995) Comparison of methods for
defining prevalent vertebral deformities: the Study of Osteoporotic
Fractures. J Bone Miner Res 10:890–902
15. Davies KM, Recker RR, Heaney RP (1989) Normal vertebral
dimensions and normal variation in serial measurements of
vertebrae. J Bone Miner Res 4:341–349
16. Jensen KK, Tougaard L (1981) A simple X-ray method for
monitoring progress of osteoporosis. Lancet 2:19–20
17. Kleerekoper M, Parfitt AM, Ellis BI (1984) Measurement of
vertebral fracture rates in osteoporosis. Copenhagen Int Symp
Osteoporos 1:103–108
18. Melton LJ, Kan SH, Frye MA, Wahner HW, O'Fallon WM, Riggs
BL (2005) Epidemiology of vertebral fractures in women. Am J
Epidemiol 129:1000–1011
19. Minne HW, Leidig G, Wuster C, Siromachkostov L, Baldauf G,
Bickel R, Sauer P, Lojen M, Ziegler R (1988) A newly developed
spine deformity index (SDI) to quantitate vertebral crush fractures
in patients with osteoporosis. Bone Miner 3:335–349
20. Reshef A, Schwartz A, Ben Menachem Y, Menczel J, Guggenheim K
(1971) Radiological osteoporosis: correlation with dietary and
biochemical findings. J Am Geriatr Soc 19:391–402
21. Ross PD, Yhee YK, He YF, Davis JW, Kamimoto C, Epstein RS,
Wasnich RD (1993) A new method for vertebral fracture
diagnosis. J Bone Miner Res 8:167–174
22. Smith-Bindman R, Steiger P, Cummings SR, Genant HK (1991) The
index of radiographic area (IRA): a new approach to estimating the
severity of vertebral deformity. Bone Miner 15:137–149
23. McCloskey EV, Spector TD, Eyres KS, Fern ED, O'Rourke N,
Vasikaran S, Kanis JA (1993) The assessment of vertebral
deformity: a method for use in population studies and clinical
trials. Osteoporos Int 3(3):138–147
24. Bagger YZ, Tanko LB, Alexandersen P, Hansen HB, Qin G,
Christiansen C (2006) The long-term predictive value of bone
mineral density measurements for fracture risk is independent of
the site of measurement and the age at diagnosis: results from the
prospective epidemiological risk factors study. Osteoporos Int
17:471–477
25. Pettersen PC, de Bruijne M, Chen J, He Q, Christiansen C, Tanko
LB (2007) A computer-based measure of irregularity in vertebral
alignment is a BMD-independent predictor of fracture risk in
postmenopausal women. Osteoporos Int 18(11):1525–1530
26. DeLong ER, DeLong DM, Clarke-Pearson DL (1988) Comparing
the areas under two or more correlated receiver operating
characteristic curves: a nonparametric approach. Biometrics 44:837–845
27. Tarone RE (1985) On heterogeneity tests based on efficient
scores. Biometrika 72(1):91–95
28. Zebaze RMD, Maalouf G, Maalouf N, Seeman E (2004) Loss of
regularity in the curvature of the thoracolumbar spine: a measure
of structural failure. J Bone Miner Res 19:1099–1104
29. Duan Y, Seeman E, Turner CH (2001) The biomechanical basis of
vertebral body fragility in men and women. J Bone Miner Res
16:2276–2283
30. Hasserius R, Karlsson MK, Nilsson BE, Johnell O (2003)
Prevalent vertebral deformities predict increased mortality and
increased fracture rate in both men and women: a 10-year
population-based study of 598 individuals from the Swedish
cohort in the European Vertebral Osteoporosis Study. Osteoporos
Int 14:61–68
31. Lunt M, O'Neill TW, Felsenberg D, Reeve J, Kanis JA, Cooper C,
Silman AJ (2003) Characteristics of a prevalent vertebral
deformity predict subsequent vertebral fracture: results from the
European Prospective Osteoporosis Study (EPOS). Bone 33:505–513
Osteoporos Int (2011) 22:2119–2128
Copyright of Osteoporosis International is the property of Springer Science & Business Media B.V. and its
content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's
express written permission. However, users may print, download, or email articles for individual use.