DIY fractional polynomials
Patrick Royston
MRC Clinical Trials Unit, London
10 September 2010
Overview
• Introduction to fractional polynomials
• Going off-piste: DIY fractional polynomials
• Examples
Fractional polynomial models
• A fractional polynomial of degree 1 with power $p_1$ is defined as $\mathrm{FP1} = \beta_1 X^{p_1}$
• A fractional polynomial of degree 2 with powers $(p_1, p_2)$ is defined as $\mathrm{FP2} = \beta_1 X^{p_1} + \beta_2 X^{p_2}$
• Powers $(p_1, p_2)$ are taken from a predefined set $S = \{-2, -1, -0.5, 0, 0.5, 1, 2, 3\}$, where 0 means $\log X$
• Also, there are ‘repeated powers’ FP2 models ($p_1 = p_2$)
Example: FP1 [power 0.5] = $\beta_1 X^{0.5}$
Example: FP2 [powers (0.5, 3)] = $\beta_1 X^{0.5} + \beta_2 X^{3}$
Example: FP2 [powers (3, 3)] = $\beta_1 X^{3} + \beta_2 X^{3} \ln X$
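The definitions above can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the function names `fp_basis` and `fp_value` are hypothetical, and the handling of repeated powers (each repeat multiplies the previous term by a further ln X) follows the (3, 3) example on the slide.

```python
import math

# Standard FP power set from the slide; power 0 denotes log X.
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_basis(x, powers):
    """Return the FP basis terms for a single positive value x.

    Hypothetical helper. Repeated powers must be adjacent in `powers`;
    each repeat multiplies the previous term by a further log(x).
    """
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    terms = []
    previous = None
    for p in powers:
        if p == previous:
            term = terms[-1] * math.log(x)      # e.g. X^3 * ln X for (3, 3)
        else:
            term = math.log(x) if p == 0 else x ** p
        terms.append(term)
        previous = p
    return terms


def fp_value(x, powers, betas):
    """Evaluate beta_1 * term_1 + beta_2 * term_2 + ... at x."""
    return sum(b * t for b, t in zip(betas, fp_basis(x, powers)))


print(fp_basis(4.0, (0.5, 3)))   # FP2 with powers (0.5, 3)
print(fp_basis(2.0, (3, 3)))     # repeated powers: X^3 and X^3 ln X
```

The same basis terms are what the Stata code later in the talk obtains from fracgen before passing them to a regression command.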
Some examples of fractional polynomial (FP2) curves
[Figure: four panels of FP2 curves with powers (-2, 1), (-2, 2), (-2, -2) and (-2, -1)]
Royston P, Altman DG (1994) Applied Statistics 43: 429-467.
FP analysis for the prognostic effect of age in breast cancer
FP function selection procedure
Simple functions are preferred. More complicated functions are accepted only if the fit is much better
Effect of age significant at 5% level?
Question                    Comparison                   χ2      df   P-value
Any effect?                 Best FP2 versus null         17.61   4    0.0015
Linear function suitable?   Best FP2 versus linear       17.03   3    0.0007
FP1 sufficient?             Best FP2 versus best FP1     11.20   2    0.0037
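As a check on the table, the P-values can be reproduced from the χ2 statistics and degrees of freedom. The sketch below is not from the talk: `chi2_sf` and `select_fp_function` are hypothetical names, the upper-tail probability is computed with a standard recurrence for integer degrees of freedom, and the sequential rule is the usual reading of the three questions (stop at the simplest function that the best FP2 does not beat).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half = x / 2.0
    if df % 2 == 0:
        k, q = 2, math.exp(-half)                # closed form for df = 2
    else:
        k, q = 1, math.erfc(math.sqrt(half))     # closed form for df = 1
    while k < df:
        # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def select_fp_function(p_null, p_linear, p_fp1, alpha=0.05):
    """Sequential selection: prefer the simplest function that is adequate."""
    if p_null >= alpha:
        return "omit x"
    if p_linear >= alpha:
        return "linear"
    if p_fp1 >= alpha:
        return "FP1"
    return "FP2"


p_null = chi2_sf(17.61, 4)     # best FP2 versus null
p_linear = chi2_sf(17.03, 3)   # best FP2 versus linear
p_fp1 = chi2_sf(11.20, 2)      # best FP2 versus best FP1
print(f"{p_null:.4f} {p_linear:.4f} {p_fp1:.4f}")   # 0.0015 0.0007 0.0037
print(select_fp_function(p_null, p_linear, p_fp1))  # FP2
```

All three tests are significant at the 5% level, so the FP2 function is retained for age.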
Fractional polynomials in Stata
• fracpoly command
• Basic syntax:
. fracpoly [, fp_options]: regn_cmd [yvar] xvar1 [xvars] …
• xvar1 is a continuous predictor which may have a curved relationship with yvar
• xvars are other predictors, all modelled as linear
• Can use the fp_option compare to compare the fit of different FP models
  • uses the FP function selection procedure
Example (auto data)
• fracpoly, compare: regress mpg displacement
Fractional polynomial model comparisons:
--------------------------------------------------------------------------
displacement     df    Deviance    Res. SD    Dev. dif.   P (*)   Powers
--------------------------------------------------------------------------
Not in model      0     468.789    5.7855      70.818     0.000
Linear            1     417.801    4.12779     19.830     0.000   1
m = 1             2     400.592    3.67467      2.621     0.284   -2
m = 2             4     397.971    3.6355        --        --     -2 3
--------------------------------------------------------------------------
(*) P-value from deviance difference comparing reported model with m = 2 model
• Show FP1 and FP2 models in Stata (+ fracplot)
But what if fracpoly can’t fit my model … ?
• fracpoly supports only some of Stata’s rich set of regression-type commands
• Provided we know what the command we want to fit looks like with a transformed covariate, we can fit an FP model to the data
• We just create the necessary transformed covariate values, fit the model using them, and assess the fit
• A new, simple command fracpoly_powers helps by generating strings (local macros) with the required powers:
. fracpoly_powers [, degree(#) s(list_of_powers) ]
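To make concrete which power combinations such a helper has to supply, here is a minimal Python sketch that enumerates them. It illustrates the idea only: `fp_power_sets` is a hypothetical name, and the assumption is that an FP model of degree m uses every unordered selection of m powers from S, repeats included. How fracpoly_powers itself orders or formats its output is not shown here.

```python
from itertools import combinations_with_replacement

# Default FP power set; power 0 denotes log X.
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_power_sets(degree, powers=POWERS):
    """All unordered selections of `degree` powers, repeats allowed."""
    return list(combinations_with_replacement(powers, degree))


fp1 = fp_power_sets(1)
fp2 = fp_power_sets(2)
print(len(fp1), len(fp2))   # 8 36
print(fp2[:3])              # [(-2, -2), (-2, -1), (-2, -0.5)]
```

With the default set there are 8 FP1 models and 36 FP2 models (28 pairs of distinct powers plus 8 repeated-power pairs), so a full FP2 search means 36 model fits.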
Fitting an FP2 model in the auto example
// Store FP2 powers in local macros
fracpoly_powers, degree(2)
local np = r(np)
forvalues j = 1 / `np' {
    local p`j' `r(p`j')'
}
// Compute deviance for each model with covariate displacement
local x displacement
local y mpg
local devmin 1e30
quietly forvalues j = 1 / `np' {
    fracgen `x' `p`j'', replace
    regress `y' `r(names)'
    local dev = -2 * e(ll)
    if `dev' < `devmin' {
        local pbest `p`j''
        local devmin `dev'
    }
}
di "Best model has powers `pbest', deviance = " `devmin'
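The search loop above follows a simple pattern: transform the covariate, fit the model, compute the deviance, keep the minimum. A language-neutral sketch of that pattern in Python is given below. It rests on several assumptions: it searches FP1 powers only (for brevity), fits by ordinary least squares written out by hand, and uses made-up synthetic data rather than the auto data. All function names are hypothetical.

```python
import math

POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)   # power 0 denotes log X


def fp1_transform(x, p):
    """FP1 transformation of a single positive value."""
    return math.log(x) if p == 0 else x ** p


def normal_deviance(z, y):
    """-2 * maximised Gaussian log-likelihood from regressing y on z."""
    n = len(z)
    zbar, ybar = sum(z) / n, sum(y) / n
    szz = sum((zi - zbar) ** 2 for zi in z)
    szy = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    slope = szy / szz
    intercept = ybar - slope * zbar
    rss = sum((yi - intercept - slope * zi) ** 2 for zi, yi in zip(z, y))
    return n * (math.log(2.0 * math.pi * rss / n) + 1.0)


def best_fp1(x, y, powers=POWERS):
    """Grid search: return (best power, minimum deviance)."""
    best_power, dev_min = None, float("inf")
    for p in powers:
        z = [fp1_transform(xi, p) for xi in x]
        dev = normal_deviance(z, y)
        if dev < dev_min:
            best_power, dev_min = p, dev
    return best_power, dev_min


# Synthetic data (an assumption): y is linear in log x plus a small
# deterministic wobble, so the search should recover power 0.
x = [float(i) for i in range(1, 21)]
y = [40.0 - 5.0 * math.log(xi) + 0.1 * (-1) ** i for i, xi in enumerate(x)]

power, deviance = best_fp1(x, y)
print("Best FP1 power:", power)
```

Replacing `normal_deviance` with the deviance of any other fitted model gives the same DIY recipe for commands that fracpoly does not support.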
A real example: modelling fetal growth
• Prospective longitudinal study of n = 50 pregnant women
• There are about 6 repeated measurements on each fetus at different gestational ages (gawks)
  • gawks = gestational age in weeks
• Wish to model how y = log fetal abdominal circumference changes with gestational age
• There is considerable curvature!
The raw data
[Figure: scatter plot of log AC against gestational age (wk)]
A mixed model for fetal growth
Multilevel (mixed) model to fit this relationship:
. xtmixed y FP(gawks) || id: FP(gawks), covariance(unstructured)
But how do we implement “FP(gawks)” here?
We want the best-fitting FP function of gawks, with random effects for the parameters (β’s) of the FP model
Fitting an FP2 mixed model to the fetal AC data
[First run fracpoly_powers to create local macros with powers]
// Compute deviance for each FP model with covariate gawks
gen x = gawks
gen y = ln(ac)
local devmin 1e30
forvalues j = 1 / `np' {
    qui fracgen x `p`j'', replace adjust(mean)
    qui xtmixed y `r(names)' || id: `r(names)', ///
        nostderr covariance(unstructured)
    local dev = -2 * e(ll)
    if `dev' < `devmin' {
        local p `p`j''
        local devmin `dev'
    }
    di "powers = `p`j''" _col(20) " deviance = " %9.3f `dev'
}
di _n "Best model has powers `p', deviance = " `devmin'
Plots of some results
[Figure: fitted curves at the individual level; log AC against gestational age (wk)]
[Figure: residuals at the individual level, against gestational age (wk)]
[Figure: residuals and fitted residuals, against gestational age (wk)]
An “ignorant” example!
• I know almost nothing about “seemingly unrelated regression” (Stata’s sureg command)
• It fits a set of linear regression models which have correlated error terms
• The syntax therefore has a set of “equations”
. sureg (depvar1 varlist1) (depvar2 varlist2) ... (depvarN varlistN)
• There may be non-linearities lurking in these “equations”
• How can we fit FP models to varlist1, varlist2, … ?
Example: modelling learning scores
Stata FAQ from UCLA (http://www.ats.ucla.edu/stat/stata/faq/sureg.htm):
What is seemingly unrelated regression and how can I perform it in Stata?
Example: High School and Beyond study
Example: modelling learning scores
Contains data from hsb2.dta
  obs:           200                          highschool and beyond (200 cases)
 vars:            11                          5 Jul 2010 13:23
 size:         9,600 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g
female          float  %9.0g       fl
race            float  %12.0g      rl
ses             float  %9.0g       sl
schtyp          float  %9.0g       scl        type of school
prog            float  %9.0g       sel        type of program
read            float  %9.0g                  reading score
write           float  %9.0g                  writing score
math            float  %9.0g                  math score
science         float  %9.0g                  science score
socst           float  %9.0g                  social studies score
-------------------------------------------------------------------------------
• [It is unclear to me what “ses” (low, middle, high) is]
Example (ctd.)
• As an example, suppose we wish to model 2 outcomes (read, math) as predicted by “socst female ses” and “science female ses” using sureg as follows:
. sureg (read socst female ses) (math science female ses)
• Are there non-linearities in read as a function of socst? In math as a function of science?
• For simplicity here, we will restrict ourselves to FP1 functions of socst and science
  • not necessary in principle
• We fit the 8 × 8 = 64 FP1 models and look for the best-fitting combination
Stata
gen x1 = socst
gen x2 = science
gen y1 = read
gen y2 = math
local devmin 1e30
forvalues j = 1 / `np' {
    qui fracgen x1 `p`j'', replace adjust(mean)
    local x1vars `r(names)'
    forvalues k = 1 / `np' {
        qui fracgen x2 `p`k'', replace adjust(mean)
        local x2vars `r(names)'
        qui sureg (y1 `x1vars' female ses) (y2 `x2vars' female ses)
        local dev = -2 * e(ll)
        if `dev' < `devmin' {
            local px1 `p`j''
            local px2 `p`k''
            local devmin `dev'
        }
    }
}
[Run fpexample3.do in Stata]
Comments
• The results suggest that there is indeed curvature in both relationships
• Can reject the null hypothesis of linearity at the 1% significance level
  • FP1 vs linear: χ2 = 10.08 (2 d.f.), P = 0.0065
• Shows the importance of considering non-linearity
read as a function of socst (adjusted for female, ses)
[Figure: partial predictor + residual of read against social studies score; fractional polynomial (3), adjusted for covariates]
math as a function of science (adjusted for female, ses)
[Figure: partial predictor + residual of math against science score; fractional polynomial (2), adjusted for covariates]
Conclusions
• Fractional polynomial models are a simple yet very useful extension of linear functions and ordinary polynomials
• If you are willing to do some straightforward do-file programming, you can apply them in a bespoke manner to a wide range of Stata regression-type commands and get useful results
• For (much) more, see Royston & Sauerbrei (2008) book