university of zurich - uzh · 2010. 11. 29. · genome-wide association study identifies five...
TRANSCRIPT
University of ZurichZurich Open Repository and Archive
Winterthurerstr. 190
CH-8057 Zurich
http://www.zora.uzh.ch
Year: 2009
Genome-wide association study identifies five susceptibility locifor glioma
Shete, S; Hosking, F J; Robertson, L B; Dobbins, S E; Sanson, M; Malmer, B; Simon,M; Marie, Y; Boisselier, B; Delattre, J Y; Hoang-Xuan, K; El Hallani, S; Idbaih, A;
Zelenika, D; Andersson, U; Henriksson, R; Bergenheim, A T; Feychting, M; Lönn, S;Ahlbom, A; Schramm, J; Linnebank, M; Hemminki, K; Kumar, R; Hepworth, S J;Price, A; Armstrong, G; Liu, Y; Gu, X; Yu, R; Lau, C; Schoemaker, M; Muir, K;
Swerdlow, A; Lathrop, M; Bondy, M; Houlston, R S
Shete, S; Hosking, F J; Robertson, L B; Dobbins, S E; Sanson, M; Malmer, B; Simon, M; Marie, Y; Boisselier, B;Delattre, J Y; Hoang-Xuan, K; El Hallani, S; Idbaih, A; Zelenika, D; Andersson, U; Henriksson, R; Bergenheim, AT; Feychting, M; Lönn, S; Ahlbom, A; Schramm, J; Linnebank, M; Hemminki, K; Kumar, R; Hepworth, S J; Price,A; Armstrong, G; Liu, Y; Gu, X; Yu, R; Lau, C; Schoemaker, M; Muir, K; Swerdlow, A; Lathrop, M; Bondy, M;Houlston, R S (2009). Genome-wide association study identifies five susceptibility loci for glioma. Nature Genetics,41(8):899-904.Postprint available at:http://www.zora.uzh.ch
Posted at the Zurich Open Repository and Archive, University of Zurich.http://www.zora.uzh.ch
Originally published at:Nature Genetics 2009, 41(8):899-904.
Shete, S; Hosking, F J; Robertson, L B; Dobbins, S E; Sanson, M; Malmer, B; Simon, M; Marie, Y; Boisselier, B;Delattre, J Y; Hoang-Xuan, K; El Hallani, S; Idbaih, A; Zelenika, D; Andersson, U; Henriksson, R; Bergenheim, AT; Feychting, M; Lönn, S; Ahlbom, A; Schramm, J; Linnebank, M; Hemminki, K; Kumar, R; Hepworth, S J; Price,A; Armstrong, G; Liu, Y; Gu, X; Yu, R; Lau, C; Schoemaker, M; Muir, K; Swerdlow, A; Lathrop, M; Bondy, M;Houlston, R S (2009). Genome-wide association study identifies five susceptibility loci for glioma. Nature Genetics,41(8):899-904.Postprint available at:http://www.zora.uzh.ch
Posted at the Zurich Open Repository and Archive, University of Zurich.http://www.zora.uzh.ch
Originally published at:Nature Genetics 2009, 41(8):899-904.
Genome-wide association study identifies five susceptibility locifor glioma
Abstract
To identify risk variants for glioma, we conducted a meta-analysis of two genome-wide associationstudies by genotyping 550K tagging SNPs in a total of 1,878 cases and 3,670 controls, with validation inthree additional independent series totaling 2,545 cases and 2,953 controls. We identified five risk locifor glioma at 5p15.33 (rs2736100, TERT; P = 1.50 x 10(-17)), 8q24.21 (rs4295627, CCDC26; P = 2.34x 10(-18)), 9p21.3 (rs4977756, CDKN2A-CDKN2B; P = 7.24 x 10(-15)), 20q13.33 (rs6010620,RTEL1; P = 2.52 x 10(-12)) and 11q23.3 (rs498872, PHLDB1; P = 1.07 x 10(-8)). These data show thatcommon low-penetrance susceptibility alleles contribute to the risk of developing glioma and provideinsight into disease causation of this primary brain tumor.
Genome-wide association study identifies six susceptibility loci for glioma
Sanjay Shete1*, Fay J Hosking2*, Lindsay B Robertson2, Sara E Dobbins2, Marc Sanson3,
Beatrice Malmer4, Matthias Simon5, Yannick Marie3, Blandine Boisselier3, Jean-Yves Delattre3,
Khe Hoang-Xuan3, Soufiane El Hallani3, Ahmed Idbaih3, Diana Zelenika6, Ulrika Andersson7,
Roger Henriksson7, A Tommy Bergenheim8, Maria Feychting8, Stefan Lönn9, Anders Ahlbom7,
Johannes Schramm9, Michael Linnebank10, Kari Hemminki11, Rajiv Kumar11, Sarah J
Hepworth12, Amy Price2, Georgina Armstrong1, Yanhong Liu1, Xiangjun Gu1, Robert Yu1, Minouk
Schoemaker , 14 Kenneth Muir13, Anthony Swerdlow14, Mark Lathrop4,15, Melissa Bondy1¥, Richard
S Houlston2¥
1. Department of Epidemiology, The University of Texas M.D. Anderson Cancer Center, Houston,
Texas 77030, USA.
2. Section of Cancer Genetics, Institute of Cancer Research, Sutton, Surrey. UK.
3. Service de Neurologie Mazarin et INSERM U 711, Hôpital de la Salpêtrière 47, Bd de l'Hôpital
75 013 Paris. France.
4. Department of Radiation Sciences, Oncology, Umeå University, Sweden.
5. Neurochirurgische Universitätsklinik, Sigmund-Freud-Str. 25, 53105 Bonn, Germany.
6. Centre National de Génotypage, IG/CEA, 2 rue Gaston Crémieux CP 5721, 91057 Evry
Cedex, France.
7. Department of Clinical Neuroscience, Umeå University, Sweden.
8. Institute of Environmental Medicine, Karolinska Institutet, Sweden.
9. Department of Medical Epidemiology and Biostatistics Karolinska Institutet, Sweden
10. Department of Neurology, University Hospital Zurich, Frauenklinikstrasse 26, CH-8091
Zurich, Switzerland.
11. Division of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), Im
Neuenheimer Feld 580, D-69120 Heidelberg, Germany.
12. Centre for Epidemiology and Biostatistics, Faculty of Medicine and Health, University of
Leeds, UK.
13. Division of Epidemiology and Public Health, University of Nottingham Medical School,
Queen's Medical Centre, Nottingham, UK.
14. Section of Epidemiology, Institute of Cancer Research, Sutton, UK.
15. Foundation Jean Dausett-CEPH, 27 Rue Julliette Dodu, 75010 Paris, France
*equal status at this position
¥Corresponding authors
1
Richard S Houlston
e-mail: [email protected]
Tel: +44 (0) 208 722 4175
Fax: +44 (0) 208 722 4365
Melissa Bondy
e-mail: [email protected]
Tel: 001 713-794-5264
Fax: 001 713-792-9568
2
To identify risk variants for glioma we conducted a meta-analysis of 2 genome-wide
association studies, based on genotyping 550K tagging SNPs in a total of 1,878 cases
and 3,670 controls, with validation in 3 additional independent series totaling 2,545 cases
and 2,953 controls. We identified 6 risk loci for glioma at 5p15.33 (rs2736100, TERT; P =
1.50 x 10-17), 8q24.21 (rs4295627, CCDC26; P = 2.34 x 10-18), 9p21.3 (rs4977756,
CDKN2A/p14(ARF)/CDKN2B; P = 7.24 x 10-15), 20q13.33 (rs6010620, RTEL1; P = 2.52 x 10-
12), 11q13.4 (rs7124728; P = 8.18 x 10-9) and 11q23.3 (rs498872, PHLDB1; P = 1.07 x 10-8).
These data show that common low-penetrance susceptibility alleles contribute to the risk
of developing glioma and provide insight into disease causation of this primary brain
tumor.
Gliomas account for ~80% of primary malignant brain tumors with ~21,000 individuals being
diagnosed with glioma each year in the US1. Most gliomas are associated with a dismal
prognosis despite surgery, radiotherapy and chemotherapy1. To date no lifestyle exposures have
been consistently linked to an increased risk of glioma except for ionizing radiation which is only
responsible for a very small number of cases1. Evidence for an inherited predisposition to glioma
is provided by the single gene disorders neurofibromatosis, tuberous sclerosis, retinoblastoma,
Li-Fraumeni and Turcot’s syndromes .1 These syndromes are, however, rare and collectively only
make a minor contribution to the 2-fold increased risk seen in first-degree relatives of patients
with primary brain tumors (PBT)2. It is therefore likely that part of the familial risk is a
consequence of the co-inheritance of multiple low-risk variants, some of which may be common
and hence detectable through association analyses. Predicated on this hypothesis we have
conducted 2 genome-wide association (GWA) studies to identify common variants influencing
the risk of developing glioma. Pooling data from both GWA scans and following up the best-
supported associations in 3 independent case-control series has enabled us to identify 6 novel
predisposition loci for glioma.
The two GWA studies were conducted by centers in the UK and US; henceforth referred to as
the UK-GWA and US-GWA studies respectively (Figure 1). In both GWA studies genotyping of
cases was conducted using Illumina Infinium HD Human610-Quad BeadChips.
The UK-GWA study was based on genotyping 636 glioma cases ascertained through the
INTERPHONE Study3. After applying strict quality control criteria, genotype data were available
for 631 cases (Figure 1). For controls we made use of publicly accessible Illumina Hap550K
BeadChip genotype data generated on 1,438 individuals from the British 1958 Birth Cohort (58C,
also known as the National Child Development Study)4, which included all live births in England,
Wales and Scotland during a single week in 1958.
3
The US-GWA study was based on genotyping 1,281 glioma cases ascertained through an
ongoing collection being made by the MD Anderson Cancer Center. After applying strict quality
control criteria, genotype data were available for 1,247 cases (Figure 1). For controls we made
use of publicly accessible Illumina Hap550K BeadChip genotype data generated on 2,243
individuals from the Cancer Genetic Markers of Susceptibility (CGEMS) studies of breast and
prostate cancer5,6.
Across both case series total of 572,571 SNPs were satisfactorily genotyped (99.4%), with mean
individual sample call rates (the percentage of samples for which a genotype was obtained for
each SNP) of 99.8%. We excluded 32 individuals because of non-Western European Ancestry
and 2 because of cryptic relatedness (Figure 1; Supplementary Figure 1). 521,318 SNP
genotypes were available on all 1,878 cases and 3,670 controls in the combined data. Prior to
undertaking the meta-analysis we searched for potential errors and biases in the 2 GWA studies
and imposed a high stringency for quality control for SNPs taken forward. Specifically, we
considered only the 454,576 autosomal SNPs which had call rates >95% in all case and control
series, which showed no extreme departure from Hardy-Weinberg equilibrium (HWE; P>10-5 in
controls) and had minor allele frequencies (MAF) exceeding 1% in cases and controls (Figure 1).
Comparison of the observed and expected distributions showed little evidence for an inflation of
the test statistics in the 2 datasets (inflation factor7 = 1.02 and 1.05 for the UK-GWA and US-
GWA studies respectively based on the 90% least significant SNPs; Supplementary Figure 2),
thereby excluding the possibility of significant hidden population substructure or differential
genotype calling between cases and controls. Using data from both GWA studies we derived
joint odds ratios (ORs) and confidence intervals under a fixed effects model for each SNP and
associated P values from the standard normal distribution. The distribution of the association P
values was significantly skewed from the null distribution; 34 of the SNPs having a P value<10-5,
greater than the 4.5 conservatively expected under the null hypothesis (P <10-15, binomial test).
To identify true risk alleles amongst these 34 SNPs, we conducted replication in 3 independent
case-control series involving a total of 5,498 individuals that passed quality control filtering
(French series, 1,392 cases and 1,602 controls; German series 504 cases and 573 controls;
Swedish series, 649 cases and 778 controls). 31 of the 34 SNPs were satisfactorily genotyped in
all 3 case-control series. On the basis of the combined analyses we found that signals from 15 of
the 31 SNPs, representing 6 genomic regions, reached strong levels of evidence satisfying the
generally accepted threshold for genome-wide statistical significance (i.e. P < 10-7; Table 1,
Supplementary Table 1). For 3 SNPs, genotyping was not possible in at least one of the 3 series
using the analytical platforms available within each study centre. In the French case-control
4
series SNPs rs9656979 and rs7257116 were not genotyped. An association between these 2
SNPs and glioma risk was however not supported in either German or Swedish series
(Supplementary Table 1). While rs7300686 was not typed in either the German or Swedish
series an association with glioma risk was not supported in the French case-control series
(Supplementary Table 1). Hence failure to genotype all additional case-control series for the
three SNPs has not impacted on the study findings.
Under a fixed-effects model the strongest signal was attained at rs4295627 (combined OR =
1.36, 95% confidence interval [CI]: 1.29 - 1.43; P = 2.34 x 10-18) which localizes to 8q24.21
(130754639 bps; Figure 2, Supplementary Figure 3, Table 1). There was, however evidence of
between-study heterogeneity (Phet = 0.01) ascribable to the association being modest in the
Swedish series. Under a random effects model, the combined OR was 1.36 (95% CI: 1.19-1.58;
P = 2.01 x 10-5) using all case-control series data and 1.45 (95% CI: 1.32-1.57; P = 1.02 x 10-8)
excluding the Swedish study. rs4295627 maps to intron 3 of CCDC26 (coiled-coil domain
containing 26) a retinoic acid (RA) modulator of myeloid differentiation and death8. Retinoic acid
induces caspase-8 transcription through phosphorylation of CREB and increases apoptotic
responses to death stimuli in neuroblastoma cells9. Together with IFN-, RA induces
differentiation, apoptotic death in glioblastoma cells with down regulation of telomerase activity10
and it has been trialed as a chemotherapeutic agent11. There is some evidence to suggest a
second 8q24.21 locus defined by rs891835, which maps to intron 1 of CCDC26 207kb telomeric
to rs4295627 (130,560,934 bps; combined ORtrend = 1.24, 95% CI: 1.17–1.3; P = 7.54 x 10-11)
and displays low LD between with rs4295627 (D’ and r2 = 0.59, 0.19 and 0.58, 0.26 in HapMap
and GWA scan respectively). The OR for rs4295627 with adjustment for rs891835 was 1.30
(95% CI: 1.20-1.41; P = 2.80 x 10-10). Similarly, adjusting rs891835 for rs4295627 still provided
evidence of an association, albeit only at a nominal level of significance (OR= 1.08, 95% CI:
1.00-1.17; P = 0.03). There was also an increasing trend in OR with an increasing number of risk
alleles (P = 2.67 x 10-13) and comparison of haplotype frequencies indicates that the prevalence
of two distinct haplotypes differed between cases and controls (Supplementary Table 2). When
we imputed 8q24.21 genotypes using genome-wide data rs1077236, mapping to 130709683bps,
provided a marginally better association signal than rs4295627 in this dataset (P-values 2.75 x
10-9 and 1.60 x 10-8 respectively; Supplementary Figure 4). Hence, the possibility remains that
rs4295627 and rs891835, although only weakly associated with each other, are in LD with one
or more causal variants in this region.
The second strongest statistical evidence for an association was for rs2736100 (combined OR =
0.79, 95% CI: 0.73 – 0.84; P = 1.50 x 10-17) mapping to 5p15.33 which localizes to intron 2 of
TERT (1339516 bps; Figure 2, Supplementary Figure 3, Table 1). While this SNP is unlikely to
5
be directly causal, current knowledge implicates variation in TERT as the probable basis for the
association signal. TERT is the reverse transcriptase component of telomerase, essential for
telomerase activity in maintaining telomeres and is a critical to cell immortalization. While
5p15.33 copy number change is not common in glioblastoma multiforme (GBM; Supplementary
Figure 5) TERT expression has been shown to correlate with glioma grade and prognosis12.
Several lines of evidence raise the possibility of a second locus defined by rs2853676 which also
maps to intron 2 of TERT, 2kb telomeric to rs2736100 (1341547 bps; combined OR = 0.79, 95%
CI: 0.76-0.83; P = 4.21 x 10-14) and is in weak LD with rs2736100 (D’ and r2 = 0.67, 0.15 and
0.81, 0.24 in HapMap and GWA scan respectively). The OR for rs2736100 with adjustment for
rs2853676 was 0.84 (95% CI: 0.79 - 0.89; P = 5.33 x 10-8) and similarly, adjusting rs2853676 for
rs2736100 also provided evidence of an association, with little change in the risk estimate (OR=
0.87, 95% CI: 0.81 – 0.93; P = 6.90 x 10-5). There was an increasing trend in OR with an
increasing number of risk alleles (P = 1.19 x 10-19) and comparison of haplotype frequencies
provided evidence of two distinct haplotypes differing between cases and controls
(Supplementary Table 2). Recent data has implicated variation at 5p15.33 (TERT-CLMPTM1L)
defined by rs2736098 with risk of multiple tumor types including lung and bladder cancer13. As
LD between rs2736098 and both rs2736100 and rs2853676 is poor (D’ and r2 0.48, 0.11 and
0.28, 0.02 in HapMap and GWA scan respectively) whether the 5p15.33 association with glioma
has the same causal basis remains to be established.
The third strongest statistical evidence for an association was for rs4977756 (combined OR =
1.24, 95% CI: 1.19 - 1.30; P = 7.24 x 10-15), which maps 59kb telomeric to CDKN2B
(22058652bps; Figure 2, Supplementary Figure 3, Table 1) within a 122kb region of LD at
9p21.3. This region encompasses the CDKN2A/p14(ARF)/CDKN2B tumor suppressor genes.
CDKN2A encodes p16(INK4A) a negative regulator of cyclin-dependant kinases and p14 (ARF1)
an activator of p53. CDKN2A/p14(ARF)/CDKN2B has an established role in glioma with
mutations in CDKN2A reported to be detectable in ~50% of tumors14,15 and loss of expression
linked to poor prognosis12. Furthermore, germline mutation of the locus, specifically p14ARF has
been reported to cause the melanoma-astrocytoma syndrome16,17. Regulation of p16/p14ARF
has been observed to be important for sensitivity to ionizing radiation, the only environmental
factor strongly linked to gliomagenesis18. While rs4977756 is unlikely to be directly causal these
data provide evidence that common variation in addition to rare mutations in this region is a
basis of glioma predisposition.
The fourth strongest statistical evidence for an association was for rs6010620 (combined OR =
1.28, 95% CI: 1.21 – 1.35; P = 2.52 x 10-12) which localizes to intron 12 (61780283 bps) of the
RAD3-like helicase RTEL1 (regulator of telomere elongation helicase 1)19 and maps within a
65kb region of LD at 20q13.33 (Figure 2, Supplementary Figure 3, Table 1). Amplification of
6
20q13.33 encompassing RTEL1 is seen in ~30% of glioma. While copy number change
correlates with RTEL1 mRNA expression levels a similar relationship is seen with other genes
mapping to the region (Supplementary Figure 5). RTEL1 has recently been shown to maintain
genomic stability directly by suppressing homologous recombination19. These data coupled with
the established role of TERT in glioma biology makes RTEL1 an attractive candidate for the
association signal at 20q13.33. Alternatively, the 20q13.33 association may be mediated through
variation in other genes mapping to the region of LD, such as TNFRSF6B (alias DcR3) whose
expression in gliomas has been correlated with tumor grade20,21. TNFRSF6B is thought to play a
regulatory role in suppressing FasL- and LIGHT-mediated cell death and acts as a decoy
receptor competing with death receptors for ligand binding and hence immune evasion of
gliomas20,21.
The fifth strongest statistical evidence for an association was for rs7124728 (combined OR =
1.18, 95% CI: 1.13–1.24; P = 8.18 x 10-9). rs7124728 maps within a 92kb region of LD at
11q13.4 (70554083 bps; Figure 2, Supplementary Figure 3, Table 1), a region bereft of genes or
predicted transcripts. Furthermore, there are no protein-coding transcripts or predicted
genes/microRNAs in the vicinity of the marker (i.e. 300kb flanking sequence).
The sixth strongest signal was at rs498872 (combined OR = 1.18, 95% CI: 1.13 – 1.24; P = 1.07
x 10-8) which maps to the 5’ UTR of PHLDB1 (pleckstrin homology-like domain, family B,
member 1 protein; alias LL5) within a large LD block of 101kb on 11q23.3 (117982577bps;
Figure 2, Supplementary Figure 3, Table 1). There was some evidence of between-study
heterogeneity (Phet = 0.04) ascribable no association detectable in the German series. Under a
random effects model, the combined OR was 1.16 (95% CI: 1.05-1.28; P = 2.24 x 10-3) for all
studies and 1.22 (95% CI: 1.16-1.28; P = 2.56 x 10-10, Phet= 0.40) excluding the German series.
Although there is no direct evidence for a role of PHLDB1 or other genes mapping to the region
of LD at 11q23.3 (e.g. MLL, ARCN1, TREH) this genomic region is deleted in ~14% of gliomas
(Supplementary Figure 5) and commonly deleted in neuroblastoma22.
Given biological differences between the histological forms of glioma23, we have sought to
analyze the association between SNP genotypes and risk of tumors of glial origin, GBM and
astrocytic linage through case-only logistic regression. Accepting the limitations of pooling data
without central histopathological review, there was evidence that GBMs may have a different risk
profile notably with respect to rs4295627 (CCDC26), rs498872 (PHLDB1) and rs2736100
(TERT) (Supplementary Table 3). Due to significant between study heterogeneity we cannot
however exclude the possibility that the impact of variants on risk is generic in nature and is not
directed to influencing the risk of developing a specific histological form of glioma.
7
When we modeled pairwise combinations of all of the SNPs we only found evidence of
interactive effects between 8q24.21 SNPs (Supplementary Table 4). For all other pairwise
combinations each locus had an independent role in glioma development (P interaction >0.10). The
risk of glioma increases with increasing numbers of variant alleles for the 6 loci (ORper-allele=1.24,
95% CI: 1.21-1.28; P = 1.30 x 10-61; Supplementary Table 5). The proportion of case and control
subjects grouped according to the number of risk alleles is shown in Figure 3. The distribution of
risk alleles follows a normal distribution in both case and controls, with a shift towards a higher
number of risk alleles in cases. Figure 3 also shows the ORs relative to the median number of 5
risk alleles. Individuals with 10+ risk alleles have at least a 3-fold increase in risk of developing
glioma compared to those with a median number of risk alleles. These ORs may be slight
underestimates because the additive model imposed for allele counting assumes equal
weighting across all SNPs. We estimate the 6 loci we have identified to date through our GWA
account for ~10% of the excess familial risk of glioma (assuming both naïve multiplicative or
additive models). It is acknowledged, however, that the present data provide only crude
estimates of the overall effect on susceptibility attributable to variation at the 6 loci as the effect
of the actual common causal variant responsible for the association, once identified, will typically
be larger. Furthermore, many of the loci may carry additional causal variants, potentially
including low-frequency variants with larger influence on glioma risk.
While fine-mapping and resequencing is required to identify the specific variants underlying each
of the 6 associations so far identified, we conducted 2 analyses to investigate the potential basis
of causality. We interrogated HapMap to identify non-synonymous SNPs (nsSNPs) highly
correlated with the most strongly associated SNP within each region of association. The
strongest LD was between rs6010620 and nsSNP rs3848668 (D’ = 1.0, r2 = 0.05; Supplementary
Table 6). Accepting the caveat that HapMap is not comprehensive these data suggest the
associations identified so far are most probably mediated through LD with sequence changes
that influence gene expression rather than protein sequence or through LD with low frequency
variants (i.e. variants with MAF of 0.001-0.01) that are not catalogued by HapMap. To explore
whether any of the associations reflect cis-acting regulatory effects on a nearby gene, we
searched for genotype–expression correlations in lymphoblastoid cell lines24,25, and in normal
brain tissue26 using publicly available data. We did not, however, find any significant relationship
between any of the SNP genotypes and gene expression (Supplementary Figure 6). These
findings do not exclude subtle effects on gene expression or possibly that the variants exert
effects through untested transactivation of genes.
These results provide the first unambiguous evidence that common genetic variation influences
the risk of developing glioma. Our findings also provide insight into the biological basis of
predisposition. They highlight the importance of variation in genes encoding components of
8
CDKN2A-CDK4 signaling pathway in the development of glioma. Moreover this pathway,
elucidated through the extended interaction network of CDKN2A, incorporates TERT (through
mutual interaction with HSP90) and other genes (including CCDC26) we have identified as risk
factors for glioma. The identification of additional variants associated with glioma predisposition
should refine the allelic architecture of disease susceptibility and potentially provide further
insights into the biology of this PBT.
9
METHODS
Subjects
UK- GWA study
The UK study was based on 636 cases (401 male, 285 female; mean age at diagnosis 46 years;
SD 12) ascertained through the INTERPHONE Study3. Briefly, the INTERPHONE Study was an
international multi-centre case-control study of primary brain tumors coordinated by the
International Agency for Research on Cancer (IARC), with material collected between
September 2000 and February 2004. UK patients with PBT were collected through
neurosurgery, neuropathology, oncology, and neurology centers in the Thames regions of
Southeast England and the Northern UK including central Scotland, the West Midlands, West
Yorkshire, and the Trent area. Cases were patients with glioma [International Classification of
Diseases (ICD), 10th revision, code C71; International Classification of Diseases for oncology
(ICD-O), 2nd ed., codes 9380-9384, 9390-9411, 9420-9451, and 9505]. Cases with previous
brain tumors were excluded. To minimize population stratification cases with self reported non-
UK ancestry were excluded from the present study. Individuals from the 1958 Birth Cohort
served as source of controls4.
US-GWA study
The US study was based on 1,247 cases (768 male, 479 female; mean age at diagnosis 47
years; SD 13) ascertained through the MD Anderson Cancer Center, Texas, between 1990 and
2008. Cases were patients with glioma (ICD10 code C71; ICD-O codes 9380-9384, 9390-9411,
9420-9451, and 9505). Individuals from CGEMS5,6 served as controls.
Replication series
French Glioma Collection (FGC): systematic series of 1,439 patients with histologically proven
glioma (WHO classification AIII, AI, AIII, OII, OIII, OAII, OAII, OAIII, GBM-IV) ascertained
through the Service de Neurologie Mazarin, Groupe Hospitalier Pitié-Salpêtrière Paris. The
controls are taken from the SU.VI.MAX (SUpplementation en VItamines et
MinrauxAntioXydants) study of 12,735 healthy subjects (women aged 35 to 60 years; men aged
45 to 60 years) recruited from across France in 199427.
Swedish Glioma Collection (SGCCS): 215 glioma cases ascertained as part of the
INTERPHONE Study3 conducted between 2000 and 2002 in Sweden, 134 cases from Umeå
University Hospital, 122 from the Northern Sweden Health and Disease Study (NSHDS)28 and
203 from neurosurgery university clinics in Sweden. 383 controls for the INTERPHONE study
10
were randomly selected from the population register, frequency matched to brain tumor cases
on age, sex, and geographical region. In addition 400 controls were ascertained from NSHDS
that were age, gender and geographically representative of the cohort.
German Glioma Collection (GGC): 564 patients who underwent surgery for a glioma at the
Department of Neurosurgery, University of Bonn Medical Center, between 1996 and 2008. All
histological diagnoses were made at the Institute for Neuropathology/German Brain Tumor
Reference Center, University of Bonn Medical Center. Control subjects were 576 healthy
individuals (288 men, 288 women; mean age 31, range 18-45) of German origin recruited in
2004 by the Institute of Transfusion Medicine and Immunology, Mannheim.
Ethics
Collection of blood samples and clinico-pathological information from patients and controls was
undertaken with informed consent and relevant ethical review board approval in accordance with
the tenets of the Declaration of Helsinki.
Genotyping
DNA was extracted from samples using conventional methodologies and quantified using
PicoGreen (Invitrogen, Carlsbad, USA). A genome-wide scan of tag SNPs was conducted using
the Illumina Infinium HD Human610-Quad BeadChips according to the manufacturer's protocols
(Illumina, San Diego, USA; Supplementary Methods). DNA samples with GenCall scores <0.25
at any locus were considered “no calls”. A DNA sample was deemed to have failed if it
generated genotypes at <95% of loci. A SNP was deemed to have failed if fewer than 95% of
DNA samples generated a genotype at the locus. To ensure quality of genotyping, a series of
duplicate samples were genotyped in the same batches.
Subsequent genotyping of SNPs was conducted using either competitive allele-specific PCR
KASPar chemistry (KBiosciences Ltd, Hertfordshire, UK) or single-base primer extension
chemistry MALDI-TOF MS detection (Sequenom, San Diego, USA). All primers and probes used
are available on request. Genotyping quality control was further evaluated through inclusion of
duplicate DNA samples in SNP assays. For all SNP assays >99% concordant results were
obtained. Samples having SNP call rates <90% were excluded from the analysis.
Statistical analysis
Statistical analyses were undertaken using R (v2.6), STATA (version 8; State College, Texas,
US) and PLINK (version 1.05)29 software. Genotype data were used to search for duplicates and
11
closely related individuals amongst all samples in each of the GWA studies. Identity by state
(IBS) values were calculated for each pair of individuals, and for any pair with allele sharing of
>80%, the sample generating the lowest call rate was removed from further analysis. To identify
individuals who might have non-Western European ancestry, we merged our case and control
data with the 60 western European (CEU), 60 Nigerian (YRI), and 90 Japanese (JPT) and 90
Han Chinese (CHB) individuals from the International HapMap Project. For each pair of
individuals we calculated genome-wide IBS distances, on markers shared between HapMap and
our SNP panel, and used these as dissimilarity measures upon which to perform principal
component analysis. The first two principal components for each individual were plotted and any
individual not present in the main CEU cluster (i.e. outside 5% from cluster centroids) was
excluded from subsequent analyses.
In the UK-GWA study, genotyped samples were excluded from analyses for the following
reasons: relatedness (n=0) and non-CEU ancestry (n=3). In the US-GWA study, genotyped
samples were excluded from analyses for the following reasons: relatedness (n=2) and non-CEU
ancestry (n=23).
The adequacy of the case-control matching and possibility of differential genotyping of cases
and controls were formally evaluated using Q-Q plots of test statistics. Deviation of the genotype
frequencies in the controls from those expected under Hardy-Weinberg Equilibrium (HWE) was
assessed by a 2 test, or Fisher’s exact test where an expected cell count was <5.
The association between each SNP and risk of glioma was assessed by the Cochran-Armitage
trend test. Odds ratios and associated 95% CIs were calculated by unconditional logistic
regression. Relationships between multiple SNPs showing association with glioma risk in the
same region were investigated using logistic regression analysis, and the impact of additional
SNPs from the same region was assessed by a likelihood-ratio test.
The combined effect of each pair of risk locus was investigated by logistic regression modeling
with evidence for interactive effects between SNPs assessed by a likelihood ratio test. The OR
and trend test for increasing numbers of deleterious alleles was estimated by counting two for a
homozygote and one for a heterozygote.
Meta-analysis was conducted using standard methods based on weighted average of study-
specific estimates of the ORs, using inverse variance weights. Cochran’s Q statistic to test for
heterogeneity and the I2 statistic to quantify the proportion of the total variation due to
heterogeneity were calculated. The sibling relative risk attributable to a given SNP was
calculated using the formula:
12
where p is the population frequency of the minor allele, q=1-p, and r and r are the relative risks
(estimated as OR) for heterozygotes and rare homozygotes, relative to common homozygotes.
Assuming a multiplicative interaction the proportion of the familial risk attributable to a SNP was
calculated as log( )/log( ), where is the overall familial relative risk estimated from
epidemiological studies, assumed to be 2.1 .
1 2
*0 0
2 A naïve estimation of the contribution of all of the
loci identified to the excess familial risk of glioma under an additive model was calculated using
the formula:
Bioinformatics
LD metrics between SNPs reported in HapMap were based on Data Release 2/phase III Feb09
on NCBI B35 assembly, dbSNPb125, except for between rs2736098 and rs2736100 and
rs2853676 which were only available in Data Release 23a/phase II March08.
We used Haploview software (v3.2) to infer the LD structure of the genome in the regions
containing loci associated with disease risk. Prediction of the un-typed SNP in the case–control
data sets of GWA data was carried out using IMPUTE and SNPTEST on HapMap (HapMap
Data Rel 2/phase III Feb09 on NCBI B35 assembly, dbSNPb125).
To examine for a relationship between SNP genotype and mRNA expression in lymphocytes
and normal human cortex we used publicly available data. 90 Caucasian derived Epstein-Barr
virus–transformed lymphoblastoid cell lines were analyzed using Sentrix Human-6 Expression
BeadChips (Illumina, San Diego, USA). Online recovery of data was performed using
WGAViewer Version 1.25 Software24,25. 193 normal human cortex cells were analyzed using
Illumina HumanRef-seq-8 Expression BeadChip arrays26. Where SNPs genotyped in our study
had not been typed HAPMAP data were used to identify correlated SNPs in high LD (r2 >0.8).
Differences in the distribution of levels of mRNA expression between SNP genotypes were
compared using a Wilcoxon-type test for trend. Relationship between copy number change and
mRNA expression at loci in glioma was investigated using data from The Cancer Genome Atlas
(TCGA) 14. The Bioconductor module CGHcall was used to assign copy number status.
We searched for interactions between proteins encoded by genes mapping to association
signals using the program PINA30.
13
URLs
The R suite can be found at http://www.r-project.org/
Bioconductor: http://www.bioconductor.org/
Detailed information on the tag SNP panel can be found at http://www.illumina.com/
dbSNP: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=snp
HAPMAP: http://www.hapmap.org/
http://pipeline.lbl.gov/cgi-bin/gateway2
KBioscience: http://kbioscience.co.uk/
WGAViewer: http://www.genome.duke.edu/centers/pg2/downloads/wgaviewer.php
1958 Birth Cohort: http://www.cls.ioe.ac.uk/studies.asp?section=000100020003
PLINK: http://pngu.mgh.harvard.edu/~purcell/plink/
Cancer Genetic Markers of Susceptibility (CGEMS): http://cgems.cancer.gov/
The Cancer Genomics Data Portal: http://cbio.mskcc.org/cancergenomics-dataportal
The Cancer Genome Atlas (TCGA) data portal: http://tcga-data.nci.nih.gov/tcga/homepage.htm
A Survey of Genetic Human Cortical Gene Expression: http://labs.med.miami.edu/myers/
IMPUTE: http://www.stats.ox.ac.uk/~marchini/software/gwas/impute.html
SNPTEST: http://www.stats.ox.ac.uk/~marchini/software/gwas/snptest.html
PINA (Protein Interaction Network Analysis platform): http://csbi.ltdk.helsinki.fi/pina/
Note: Supplementary information is available on the Nature Genetics website
ACKNOWLEDGEMENTS
The Wellcome Trust provided principal funding for the study. In the UK additional funding was
provided by Cancer Research UK (C1298/A8362 supported by the Bobby Moore Fund) and the
European Union (CPRB LSHC-CT-2004-503465). The UK and Swedish INTERPHONE studies
were supported by the European Union Fifth Framework Program “Quality of life and
Management of Living Resources” (contract number QLK4-CT-1999-01563), and the
International Union against Cancer (UICC). The UICC received funds for this purpose from the
Mobile Manufacturers' Forum and GSM Association. Provision of funds via the UICC was
governed by agreements that guaranteed INTERPHONE's complete scientific independence.
These agreements are publicly available at http://www.iarc.fr/ENG/Units/RCAd.html. The
Swedish centre was also supported by the Swedish Research Council, the Cancer Foundation
of Northern Sweden, the Swedish Cancer Society, and the Nordic Cancer Union and the UK
centre by the Mobile Telecommunications and Health Research (MTHR) Programme and the
14
Health and Safety Executive, Department of Health and Safety Executive and the UK Network
Operators (O2, Orange, T-Mobile, Vodafone, and ‘3’). The views expressed in the publication are
those of the authors and not necessarily those of the funding bodies. In the US funding was
provided by NIH grants 5R01 CA119215 and 5R01 CA070917. Additional support was obtained
from the American Brain Tumor Association and the National Brain Tumor Society. MD
Anderson acknowledges the work of Phyllis Adatto, Fabian Morice, Hui Zhang, Victor Levin,
Alfred Yung, Mark Gilbert, Raymond Sawaya, Vinay Puduvalli, Charles Conrad, Fredrick Lang,
and Jeffrey Weinberg from the Brain and Spine Center. In France funding was provided by the
Délégation à la Recherche Clinique (MUL03012), the Association pour la Recherche sur les
Tumeurs Cérébrales (ARTC), the Institut National du Cancer (INCa; PL 046) and the French
Ministry of Higher Education and Research. In Germany funding was provided to MS, JS and ML
by the Deutsche Forschungsgemeinschaft (Si 552, Schr 285), the Deutsche Krebshilfe (70-
2385-Wi2, 70-3163-Wi3, 10-6262) and BONFOR. In Sweden the collection of samples from the
Northern Sweden Health of disease study were collected with support from Umeå University
Hospital and NIH funded part of GLIOGENE collection (R01 CA11915). BM was supported by
Acta Oncologica foundation as a research fellow at Royal Swedish Academy of Science. The
UK-GWA study made use of genotyping data on the 1958 Birth Cohort. Genotyping data on
controls was generated and generously supplied to us by Panagiotis Deloukas of the Wellcome
Trust Sanger Institute. A full list of the investigators who contributed to the generation of the data
is available from www.wtccc.org.uk. The US-GWA study made use of control genotypes from the
CGEMS prostate and breast cancer studies. A full list of the investigators who contributed to the
generation of the data is available from http://cgems.cancer.gov/. The results published here are
in whole or part based upon data generated by The Cancer Genome Atlas pilot project
established by the NCI and NHGRI. Information about TCGA and the investigators and
institutions that constitute the TCGA research network can be found at
http://cancergenome.nih.gov. Finally, we are grateful to all the patients and individuals for their
participation. We would also like to thank the clinicians and other hospital staff, cancer registries
and study staff who contributed to the blood sample and data collection for this study.
AUTHOR CONTRIBUTIONS
RSH and MB designed the study. RSH drafted the manuscript, with extensive contributions from
FH and SS. FH and SS performed statistical analyses. SED, FH and SS performed
bioinformatics analyses. LBR performed laboratory management and oversaw genotyping of UK
cases and performed genotyping of German and Swedish case-control series. In the UK: AS,
MS, KM, JSH developed patient recruitment, sample acquisition and performed sample
collection of cases. In the US: MB and GA developed patient recruitment; GA performed curation
15
and organization of samples and data; YL performed management of genotyping. In Germany:
MS and JS developed patient recruitment and blood sample collection. MS oversaw DNA
isolation and storage. KH and ML collected control samples. MS and ML case ascertainment
and supervision of DNA extractions. KH and RK procurement of German control samples. In
France: MS, J-Y D, K H-X, AI developed patient recruitment; YM, BB and S El-H developed
sample acquisition and performed sample collection of cases. In Sweden: For the Swedish
INTERPHONE Study MF, SL, and AA developed study design and conducted patient
recruitment and control selection, MF, SL, AA, BM, and RH organized sample acquisition and
performed sample collection of case and controls. BM and RH performed laboratory
management and oversaw DNA extraction. The NSHDS samples were collected by Umeå
University, (Principal Investigator Göran Hallmans), and the additional samples collected at the
neurosurgery department in Umeå from 2005 and onwards (TB, RH) and through the national
GLIOGENE study (Principal Investigator BM). All authors contributed to the final paper.
COMPETING INTERESTS STATEMENT
The authors declare no competing financial interests
16
REFERENCES 1. Bondy, M.L. et al. Brain tumor epidemiology: consensus from the Brain Tumor
Epidemiology Consortium. Cancer 113, 1953-68 (2008). 2. Hemminki, K. & Li, X. Familial risks in nervous system tumors. Cancer Epidemiol
Biomarkers Prev 12, 1137-42 (2003). 3. Cardis, E. et al. The INTERPHONE study: design, epidemiological methods, and
description of the study population. Eur J Epidemiol 22, 647-64 (2007). 4. Power, C. & Elliott, J. Cohort profile: 1958 British birth cohort (National Child
Development Study). Int J Epidemiol 35, 34-41 (2006). 5. Hunter, D.J. et al. A genome-wide association study identifies alleles in FGFR2
associated with risk of sporadic postmenopausal breast cancer. Nat Genet 39, 870-4 (2007).
6. Yeager, M. et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat Genet 39, 645-9 (2007).
7. Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997-1004 (1999).
8. Yin, W., Rossin, A., Clifford, J.L. & Gronemeyer, H. Co-resistance to retinoic acid and TRAIL by insertion mutagenesis into RAM. Oncogene 25, 3735-44 (2006).
9. Jiang, M., Zhu, K., Grenet, J. & Lahti, J.M. Retinoic acid induces caspase-8 transcription via phospho-CREB and increases apoptotic responses to death stimuli in neuroblastoma cells. Biochim Biophys Acta 1783, 1055-67 (2008).
10. Das, A., Banik, N.L. & Ray, S.K. Differentiation decreased telomerase activity in rat glioblastoma C6 cells and increased sensitivity to IFN-gamma and taxol for apoptosis. Neurochem Res 32, 2167-83 (2007).
11. Jaeckle, K.A. et al. Phase II evaluation of temozolomide and 13-cis-retinoic acid for the treatment of recurrent and progressive malignant glioma: a North American Brain Tumor Consortium study. J Clin Oncol 21, 2305-11 (2003).
12. Wager, M. et al. Prognostic molecular markers with no impact on decision-making: the paradox of gliomas based on a prospective study. Br J Cancer 98, 1830-8 (2008).
13. Rafnar, T. et al. Sequence variants at the TERT-CLPTM1L locus associate with many cancer types. Nat Genet 41, 221-7 (2009).
14. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061-8 (2008).
15. Parsons, D.W. et al. An integrated genomic analysis of human glioblastoma multiforme. Science 321, 1807-12 (2008).
16. Randerson-Moor, J.A. et al. A germline deletion of p14(ARF) but not CDKN2A in a melanoma-neural system tumour syndrome family. Hum Mol Genet 10, 55-62 (2001).
17. Bahuau, M. et al. Germ-line deletion involving the INK4 locus in familial proneness to melanoma and nervous system tumors. Cancer Res 58, 2298-303 (1998).
18. Simon, M., Voss, D., Park-Simon, T.W., Mahlberg, R. & Koster, G. Role of p16 and p14ARF in radio- and chemosensitivity of malignant gliomas. Oncol Rep 16, 127-32 (2006).
19. Barber, L.J. et al. RTEL1 maintains genomic stability by suppressing homologous recombination. Cell 135, 261-71 (2008).
20. Roth, W. et al. Soluble decoy receptor 3 is expressed by malignant gliomas and suppresses CD95 ligand-induced apoptosis and chemotaxis. Cancer Res 61, 2759-65 (2001).
21. Arakawa, Y. et al. Frequent gene amplification and overexpression of decoy receptor 3 in glioblastoma. Acta Neuropathol 109, 294-8 (2005).
22. Guo, C. et al. Allelic deletion at 11q23 is common in MYCN single copy neuroblastomas. Oncogene 18, 4948-57 (1999).
23. Endersby, R. & Baker, S.J. PTEN signaling in brain: neuropathology and tumorigenesis. Oncogene 27, 5416-30 (2008).
24. Stranger, B.E. et al. Genome-wide associations of gene expression variation in humans. PLoS Genet 1, e78 (2005).
17
18
25. Stranger, B.E. et al. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315, 848-53 (2007).
26. Myers, A.J. et al. A survey of genetic human cortical gene expression. Nat Genet 39, 1494-9 (2007).
27. Hercberg, S. et al. The SU.VI.MAX Study: a randomized, placebo-controlled trial of the health effects of antioxidant vitamins and minerals. Arch Intern Med 164, 2335-42 (2004).
28. Hallmans, G. et al. Cardiovascular disease and diabetes in the Northern Sweden Health and Disease Study Cohort - evaluation of risk factors and their interactions. Scand J Public Health Suppl 61, 18-24 (2003).
29. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81, 559-75 (2007).
30. Wu, J. et al. Integrated network analysis platform for protein-protein interactions. Nat Methods 6, 75-7 (2009).
Table 1: Summary of results for glioma SNPs replicated. Gene(s) mapping within 20kb of each SNP are detailed
GWA studies Replication studies Combined
SNP Chr Gene Location
(bp)
Ancestral allele
frequency Ancestral
allele
Non-ancestral
allele OR
(95% CI) P OR
(95% CI) P OR
(95% CI) P P het
rs2736100 5 TERT 1,339,516 0.51 T G 0.83
(0.75-0.91) 2.21E-06 0.75
(0.67-0.83) 2.87E-13 0.79
(0.73-0.84) 1.50E-17 0.18
rs2853676 5 TERT 1,341,547 0.27 G A 1.22
(1.14-1.31) 5.30E-06 1.3 (1.21-
1.38) 1.06E-09 1.26
(1.2-1.32) 4.21E-14 0.67
rs10464870 8 CCDC26 130,547,005 0.21 T C 1.24
(1.15-1.34) 3.90E-06 1.22
(1.13-1.31) 1.77E-05 1.23
(1.17-1.3) 3.04E-10 0.05
rs891835 8 CCDC26 130,560,934 0.22 T G 1.24
(1.15-1.33) 3.92E-06 1.24
(1.15-1.33) 4.43E-06 1.24
(1.17-1.3) 7.54E-11 0.01
rs6470745 8 CCDC26 130,711,103 0.20 A G 1.3
(1.2-1.39) 5.79E-08 1.31
(1.22-1.41) 9.09E-09 1.3
(1.24-1.37) 2.77E-15 0.01
rs16904140 8 CCDC26 130,734,825 0.21 G A 1.25
(1.16-1.35) 1.41E-06 1.28
(1.19-1.37) 1.14E-07 1.27
(1.2-1.33) 7.88E-13 0.01
rs4295627 8 CCDC26 130,754,639 0.17 T G 1.33
(1.23-1.42) 1.47E-08 1.39
(1.3-1.49) 2.20E-11 1.36
(1.29-1.43) 2.34E-18 0.01
rs1063192 9 CDKN2A/B 21,993,367 0.44 T C 1.21
(1.13-1.29) 1.44E-06 1.21
(1.14-1.29) 6.97E-07 1.21
(1.16-1.27) 4.61E-12 0.81
rs2157719 9 CDKN2A/B 22,023,366 0.57 G A 0.82
(0.74-0.9) 6.80E-07 0.82
(0.75-0.9) 4.42E-07 0.82
(0.77-0.88) 1.41E-12 0.68
rs1412829 9 CDKN2A/B 22,033,926 0.42 T C 1.22
(1.14-1.3) 7.23E-07 1.23
(1.15-1.3) 1.80E-07 1.22
(1.17-1.28) 6.23E-13 0.67
rs4977756 9 CDKN2A/B 22,058,652 0.40 A G 1.25
(1.17-1.32) 2.39E-08 1.24
(1.16-1.31) 5.90E-08 1.24
(1.19-1.3) 7.24E-15 0.94
rs7124728 11 70,554,083 0.44 C T 1.38
(1.3-1.47) 2.28E-13 1.05
(0.97-1.13) 2.24E-01 1.18
(1.13-1.24) 8.18E-09 0.01
rs498872 11 PHLDB1 117,982,577 0.31 C T 1.26
(1.17-1.34) 1.03E-07 1.12
(1.04-1.2) 4.56E-03 1.18
(1.13-1.24) 1.07E-08 0.04
rs6010620 20 RTEL1 61,780,283 0.23 G A 1.28
(1.18-1.38) 8.38E-07 1.28
(1.18-1.38) 6.49E-07 1.28
(1.21-1.35) 2.52E-12 0.38
rs2297440 20 RTEL1 61,782,743 0.22 C T 1.28
(1.18-1.38) 1.01E-06 1.26
(1.16-1.35) 4.44E-06 1.27
(1.2-1.34) 2.06E-11 0.40
UK Casesn = 636HumanHap610Quad. 575,837 SNPs
1958 Birth Cohort Controlsn = 1,438 HumanHap550541,327 SNPs
US Casesn = 1,281HumanHap610 Quad. 575,837 SNPs
521,318 SNPs common across all case and control series
Meta-analysis of 454,576 SNPs
UK-GWA study: 631 cases and 1434 controls
US-GWA study: 1247 cases and 2236 controls
66,742 SNPs removed due to:
• Low MAF (<1% in controls)
• Call rate (< 95% in cases/controls)
• Failed HWE (10-5 level) in controls
Samples removed:
• 4 non-Western Europeans
Samples removed:
• 2 failed genotyping
• 3 non-Western Europeans
Samples removed:
• 9 failed genotyping
• 23 non-Western Europeans
• 2 closely related (IBS > 80%)
Samples removed:
• 7 low call rates
• 2 non-Western Europeans
UK-GWA study US-GWA study
CGEMs Breast Controls n = 1,142HumanHap500 533,334 SNPs
CGEMs Prostate Controlsn = 1,101HumanHap300 + HumanHap240534,379 SNPs
Figure 1: Patient and single SNP exclusion schema for genome-wide studies
CDKN2A CDKN2B
30
15
cM/M
b
rs1063192rs2157719 rs1412829
rs4977756
0
2
4
6
8
10
12
14
16
18
61500 61600 61700 61800 61900 62000 62100
0
2
4
6
8
10
12
14
16
18
70400 70420 70440 70460 70480 70500 70520 70540 70560 70580 70600
cM/M
b
Kb
CR600271
rs10464870
rs6470745
rs4295627
rs16904140
CCDC26 MLZE
-log 1
0P
NKD2 SLC12A7
SLC6A19 SLC6A18 TERT CLPTM1L SLC6A3 LPCAT4
rs2736100
rs2853676
Kb
cM/M
b
Kb
-log 1
0P
20
cM/M
b
Kb
-log 1
0P
rs7124728
90
45
KbKb-lo
g 10P
10
Kb
30
15
cM/M
b
rs6010620rs2297440
KCNQ2 PRPF6
ARFRP1
RTEL1
TNFRSF6B
EEF1A2 PTK6 TPD52L2 UCKL1ZGPATSTMN3
LIME1
(a) 8q24.21 (b) 5p15.33
(c) 9p21.3 (d) 20q13.33
-log 1
0P
rs891835
(e) 11q23.3 (f) 11q13.4
50rs498872
UBE4A
ATP5L ARCN1 TREH BCL9L
MLL PHLDB1 CXCR5DDX6
-log 1
0P
Kb
cM/M
b
25
30
60
0
2
4
6
8
10
12
14
16
18
117500 117600 117700 117800 117900 118000 118100 118200 118300 118400 118500
0
2
4
6
8
10
12
14
16
18
130300 130400 130500 130600 130700 130800 130900 1310000
2
4
6
8
10
12
14
16
18
1000 1100 1200 1300 1400 1500 1600 1700
0
2
4
6
8
10
12
14
16
18
21800 21850 21900 21950 22000 22050 22100 22150 22200
Figure 2: LD structure and association results for the 6 confirmed glioma disease-associated regions: (a) 8q24.21; (b) 5p15.33; (c) 9p21.3; (d) 20q13.33; (e) 11q23.3; and (f) 11q13.4. Chromosomal positions based on NCBI build 36 coordinates, showing Ensemble (release 48) genes. Armitage trend test P values (as –log10 values; left y axis) are shown for SNPs analysed in the GWA-studies in blue and for the combined analysis including the replication series in red. Recombination rates in HapMap CEU across the region are shown in black (right y axis). Also shown are the relative positions of genes mapping to each region of association. Exons of genes have been redrawn to show the relative positions in the gene, therefore maps are not to physical scale.
0
5
10
15
20
25
30
1 2 3 4 5 6 7 8 9 10 11 12Number of risk alleles
Fre
qu
ency
(%
)
ControlsCases
0
0.5
1
1.5
2
2.5
3
3.5
4
<=1 2 3 4 5 6 7 8 9 >=10Number of risk alleles
Od
ds
Rat
io
(a) (b)
Figure 3: (a) Distribution of risk alleles in glioma cases (blue bars) and controls (green bars); (b) Plot of the increasing ORs for glioma with increasing number of risk alleles. The ORs are relative to the median number of five risk alleles; Vertical bars correspond to 95% confidence intervals.
OR 95% CI P
rs2736100 5 230 TT 645 TG 372 GG 546 TT 1103 TG 584 GG 0.82 0.72-2.78 1.01E-04rs2853676 5 109 AA 568 AG 570 GG 149 AA 894 AG 1191 GG 1.28 1.17-3.24 1.35E-05rs10464870 8 683 TT 488 TC 76 CC 1370 TT 761 TC 101 CC 1.26 1.14-3.22 9.62E-05rs891835 8 661 TT 498 TG 88 GG 1344 TT 773 TG 115 GG 1.28 1.16-3.24 2.16E-05rs6470745 8 699 AA 469 AG 79 GG 1415 AA 730 AG 90 GG 1.32 1.2-3.28 4.20E-06rs16904140 8 85 AA 476 AG 686 GG 110 AA 773 AG 1352 GG 1.22 1.11-3.18 5.29E-04rs4295627 8 735 TT 451 TG 60 GG 1496 TT 667 TG 72 GG 1.35 1.22-3.31 1.68E-06rs1063192 9 326 TT 602 TC 319 CC 700 TT 1092 TC 436 CC 1.25 1.15-3.21 7.62E-06rs2157719 9 335 AA 601 AG 311 GG 726 AA 1078 AG 428 GG 0.80 0.7-2.76 6.15E-06rs1412829 9 347 TT 596 TC 302 CC 749 TT 1072 TC 412 CC 1.25 1.16-3.21 5.59E-06rs4977756 9 377 AA 594 AG 276 GG 782 AA 1083 AG 370 GG 1.23 1.13-3.19 3.46E-05rs7124728 11 76 TT 740 TC 431 CC 424 TT 1145 TC 663 CC 1.53 1.42-3.49 2.56E-14rs498872 11 156 TT 589 TC 502 CC 211 TT 954 TC 1070 CC 1.27 1.17-3.23 5.07E-06rs6010620 20 46 AA 405 AG 796 GG 123 AA 785 AG 1327 GG 1.20 1.08-3.16 0.0025rs2297440 20 45 TT 397 TC 804 CC 119 TT 777 TC 1330 CC 1.22 1.09-3.18 0.0016rs7300686 12 307 TT 454 TC 466 CC 586 TT 1074 TC 508 CC 1.36 1.26-3.32 4.72E-10rs17748 11 84 TT 487 TC 675 CC 119 TT 753 TC 1363 CC 1.25 1.14-3.21 1.23E-04rs9656979 8 351 TT 630 TC 266 CC 742 TT 1098 TC 393 CC 1.20 1.1-3.16 3.65E-04rs9369226 6 57 AA 428 AG 762 GG 73 AA 659 AG 1502 GG 0.79 0.67-2.75 1.73E-04rs171125 16 19 AA 249 AG 978 GG 6 AA 359 AG 1838 GG 1.43 1.26-3.39 2.33E-05rs6931798 6 56 TT 426 TC 765 CC 71 TT 656 TC 1500 CC 1.26 1.14-3.22 2.11E-04rs6869535 5 28 AA 307 AG 912 GG 69 AA 682 AG 1483 GG 1.32 1.19-3.28 5.17E-05rs2110922 2 188 TT 579 TG 478 GG 273 TT 980 TG 957 GG 0.85 0.75-2.81 0.0016rs2072532 2 390 TT 606 TC 251 CC 572 TT 1128 TC 533 CC 1.21 1.11-3.17 1.71E-04rs7257116 19 148 TT 581 TC 518 CC 339 TT 1095 TC 796 CC 1.22 1.12-3.18 1.36E-04rs1509937 10 517 AA 552 AG 178 GG 997 AA 995 AG 243 GG 1.15 1.05-3.11 0.0062rs2206920 20 18 AA 239 AG 990 GG 18 AA 342 AG 1850 GG 1.32 1.16-3.28 8.44E-04rs12531711 7 966 AA 262 AG 19 GG 1807 AA 407 AG 20 GG 1.23 1.08-3.19 0.0089rs10488631 7 965 TT 263 TC 19 CC 1810 TT 405 TC 20 CC 1.24 1.09-3.2 0.0063rs10924559 1 29 AA 324 AG 872 GG 52 AA 460 AG 1699 GG 0.80 0.66-2.76 0.0014rs1384847 4 0 AA 170 AG 1077 GG 28 AA 360 AG 1847 GG 1.40 1.21-3.36 3.85E-04rs1941114 18 18 AA 268 AG 961 GG 60 AA 602 AG 1573 GG 0.72 0.58-2.68 8.90E-06rs7325443 13 50 TT 427 TC 770 CC 73 TT 633 TC 1529 CC 1.27 1.15-3.23 1.54E-04rs7325927 13 129 TT 456 TC 659 CC 147 TT 807 TC 1281 CC 0.82 0.71-2.78 3.27E-04
Supplementary Table 1: Genotypes for the 34 SNPs in cases and controls in each of the case-control series. Also shown are Odds ratios andassociated 95% confidence intervals, for each of the SNPs genotyped.
ChrSNP
ResultsCase genotypes Control genotypes
rs2736100 rs2853676 Cases Controls OR 95% CI P
TT GG 678 1'414 1.00 ReferenceTT GA 91 164 1.13 (0.86 - 1.49) 0.37TT AA 3 8 0.93 (0.24 - 3.53) 0.91TG GG 975 1'521 1.33 (1.18 - 1.5) 5.36E-06TG GA 1'158 1'569 1.53 (1.36 - 1.73) 3.17E-12TG AA 60 73 1.67 (1.17 - 2.39) 0.005GG GG 310 449 1.45 (1.22 - 1.72) 2.67E-05GG GA 654 842 1.64 (1.43 - 1.88) 2.75E-12GG AA 372 390 1.97 (1.66 - 2.33) 7.77E-15
Per risk allele 1.08 (1.06-1.09) 1.19E-19P interaction 0.308
rs4295627 rs891835 Cases Controls OR 95% CI P
TT TT 3'349 2'044 1.00 ReferenceTT TG 951 573 1.02 (0.90 - 1.15) 0.76TT GG 82 44 0.90 (0.62 - 1.30) 0.58TG TT 538 384 1.19 (1.03 - 1.38) 0.02TG TG 1'144 946 1.40 (1.25 - 1.54) 4.66E-10TG GG 141 145 1.73 (1.36 - 2.20) 7.82E-06GG TT 16 17 1.89 (0.95 - 3.78) 0.07GG TG 91 82 1.48 (1.09 - 2.01) 0.01GG GG 84 116 2.35 (1.76 - 3.14) 7.04E-09
Per risk allele 1.08 (1.07-1.11) 2.47E-19P interaction 0.03
Supplementary Table 2: Risk of glioma by combined genotypes for rs2736100-rs285
Haplotype*Case
FrequencyControl
FrequencyP
TG 0.409 0.467 4.23E-17GA 0.296 0.247 5.1E-16GG 0.269 0.26 0.1439TA 0.026 0.026 0.733
*Haplotype consists of rs2736100 and rs2853676
Haplotype*Case
FrequencyControl
FrequencyP
TTATGT 0.424 0.462 2.00E-04TTACGT 0.202 0.198 0.6561CGGCAG 0.143 0.099 7.46E-12CGATGT 0.087 0.094 0.2041TTGCAG 0.062 0.059 0.6354TGGCAG 0.02 0.02 0.9459TTGCAT 0.016 0.02 0.1366TTACAT 0.015 0.017 0.4222CTATGT 0.011 0.011 0.9447
*Haplotype consists of rs10464870, rs891835, rs6470745,rs9656979, rs16904140 and rs4295627
53676 and rs4295627-rs891835.
SNP Comparison P Between study P
rs4295627 Glial vs GBM 2.52E-09 0.111Astrocytic vs GBM 5.12E-05 0.013Glial vs astrocytic 0.026 0.016Glial & astrocytic vs GBM 3.80E-08 0.014
rs498872 Glial vs GBM 0.009 0.748Astrocytic vs GBM 0.030 0.145Glial vs astrocytic 0.158 0.406Glial & astrocytic vs GBM 0.005 0.999
rs4977756 Glial vs GBM 0.230 9.60E-05Astrocytic vs GBM 0.0008 0.003Glial vs astrocytic 0.411 0.121Glial & astrocytic vs GBM 0.010 0.001
rs2736100 Glial vs GBM 3.30E-04 0.069Astrocytic vs GBM 1.91E-04 0.068Glial vs astrocytic 0.548 0.172Glial & astrocytic vs GBM 2.55E-05 0.047
rs6010620 Glial vs GBM 0.003 0.429Astrocytic vs GBM 0.138 0.126Glial vs astrocytic 0.070 0.003Glial & astrocytic vs GBM 0.016 0.043
rs7124728 Glial vs GBM 0.030 3.60E-06Astrocytic vs GBM 0.411 5.00E-08Glial vs astrocytic 0.203 1.00E-06Glial & astrocytic vs GBM 0.153 3.60E-10
Supplementary Table 3: Clinico-pathological association testing.Tumors of glial origin defined by ICD-O codes 93803, 94423, 94503, 94513, GBM by ICD-O codes 94401-13, and astrocytic tumors by ICD-O codes 94003-245.
TERT 11q13.4 PHLDB1
rs2853676 rs10464870 rs891835 rs6470745 rs16904140 rs4295627 rs1063192 rs2157719 rs1412829 rs4977756 rs7124728 rs498872 rs6010620 rs2297440
rs27361000.31
(10,731)0.68
(10,800)0.73
(10,667)0.74
(10,850)0.30
(10,836)0.30
(10,846)0.34
(10,774)0.20
(10,825)0.41
(10,810)0.27
(10,811)0.19
(10.725)0.55
(10,825)0.60
(10,763)0.63
(10,623)
rs28536760.26
(10,758)0.78
(10,629)0.38
(10,808)0.41
(10.798)0.41
(10,803)0.74
(10,739)0.85
(10,782)0.97
(10,777)0.40
(10,765)0.56
(10,703)0.20
(10,783)0.66
(10,728)0.56
(10,598)
rs104648700.62
(10,705)5.57x10-5
(10,887)3.92x10-12
(10,880)0.02
(10,880)0.86
(10,813)0.80
(10,865)0.55
(10,848)0.73
(10,848)0.73
(10,765)0.18
(10,860)0.94
(10,797)0.86
(10,661)
rs8918351.56x10-4
(10,753)1.45x10-4
(10.740)0.03
(10,747)0.77
(10,677)0.71
(10,728)0.53
(10,713)0.67
(10,711)0.55
(10,625)0.24
(10,729)0.33
(10,668)0.22
(10,572)
rs64707450.33
(10,926)0.48
(10,933)0.17
(10,861)0.19
(10,911)0.51
(10,900)0.33
(10,898)0.36
(10,806)0.39
(10,912)0.73
(10,846)0.61
(10,711)
rs169041400.26
(10,923)0.41
(10,852)0.49
(10,898)0.96
(10,888)0.50
(10,884)0.41
(10,798)0.41
(10,903)0.89
(10,836)0.78
(10,703)
rs42956270.38
(10,859)0.39
(10,911)0.82
(10,896)0.49
(10,894)0.55
(10,806)0.44
(10,909)0.75
(10,846)0.73
(10,708)
rs10631920.60
(10,838)0.58
(10,827)0.76
(10,831)0.14
(10,736)0.73
(10,839)0.87
(10,772)0.97
(10,636)
rs21577190.54
(10,874)0.54
(10,874)0.23
(10,789)0.70
(10,889)0.80
(10,823)0.87
(10,687)
rs14128290.59
(10,857)0.26
(10,771)0.44
(10,873)0.60
(10,814)0.65
(10,674)
rs49777560.12
(10,777)0.29
(10,869)0.70
(10,808)0.84
(10,670)
11q13.4 rs71247280.29
(10,869)0.25
(10,720)0.22
(10,588)
PHLDB1 rs4988720.19
(10,820)0.12
(10,690)
RTEL1 rs60106200.28
(10,632)
Supplementary Table 4: Pairwise analysis of all SNPs associated with glioma risk. For each row-column combination, numbers show the P value forinclusion of an interaction term between the two SNPs. Numbers in parentheses are the number of samples from which statistics are calculated.
CDKN2A/B
RTEL1CCDC26
TERT
CCDC26
CDKN2A/B
Number of risk alleles Contols (%) Cases (%) OR (95% CI)
0-1 33 (0.52) 8 (0.19) 0.45 (0.21-0.99)2 150 (2.38) 32 (0.78) 0.4 (0.27-0.59)3 483 (7.67) 184 (4.48) 0.71 (0.59-0.86)4 1016 (16.13) 489 (11.9) 0.9 (0.78-1.03)5 1542 (24.48) 825 (20.08) 1.0 (0.89-1.13) Reference6 1520 (24.13) 1010 (24.59) 1.24 (1.11-1.39)7 954 (15.15) 902 (21.96) 1.77 (1.56-2)8 417 (6.62) 434 (10.56) 1.95 (1.66-2.28)9 155 (2.46) 174 (4.24) 2.1 (1.66-2.65)
10+ 29 (0.46) 50 (1.22) 3.22 (2.02-5.13)Total 6299 (100) 4108 (100) 1.24 (1.21-1.28) per allele
P trend = 1.30 x 10-61
Supplementary Table 5: Odds ratios corresponding to increasing numbers of risk alleles in rs4295627 (CCDC26 ), rs498872 (PHLDB1 ), rs497756 (CDKN2A, CDKN2B ), rs2736100 (TERT ), rs6010620 (RTEL1 ) and rs7124728 (11q13.4). The median number of risk alleles, 5, is used as the reference group for the odds ratios.
Region Gene nsSNP Change5p15.33; rs2736100 TERT D' v rs2736100 r2 v rs2736100
rs35719940 T1062A na nars34062885 R948S na nars34094720 Y412H na nars11952056 T191S na na
SLC8A1 rs5557 V692E na na
9p21.3; rs4977756 CDKN2A D' v rs4977756 r2 v rs4977756rs45476696 S153G na nars3731249 T148A low MAF low MAFrs6413464 S127A na na
rs34170727 C124R na nars6413463 Q123H na nars35741010 T102A na nars34886500 W99R na nars11552822 Y84D na nars34968276 Q83H na nars11552823 L81P na nars36204594 V60A na nars36204273 Q58R na na
11q23.3; rs498872 PHLDB1 D' v rs498872 r2 v rs498872rs35842472 S347T na nars35219502 I477T na nars11216935 F687L na nars2298484 H868A na nars12787512 C985S na nars497554 W1192R na na
D' v rs498872 r2 v rs498872TREH rs6589669 S578* na na
rs6589670 A561P na nars6589671 A558P low MAF low MAFrs2276064 W486R low MAF low MAFrs11827611 H449Y na nars2276065 A389T na nars34978247 R140K na na
20q13.33; rs6010620 RTEL1 D' v rs6010620 r2 v rs6010620rs3848668 S124N 1 0.054rs16983886 N659K na nars16983889 I660M na nars35640778 Q684R na nars12480603 D870E na nars6089959 S951G na nars3208008 C1042Q na na
ARFRP1 rs35037612 E135K na nars35892492 K105E na nars11547196 C74W na na
ZGPAT rs1291212 R61S 0.758 0.031rs6089764 L105P na na
rs17855481 R413G na na
LIME1 rs1151625 L211P 0.758 0.031
Supplementary Table 6: Correlations between the SNPs and nsSNPs mapping withgenes.
i
in nearby
(a) (b)
(c) (d)
Supplementary Figure 1: Identification of individuals in the GWA scan of non-European ancestry. The first two principal components of the analysis were plotted. HapMap CEU individuals are plotted in red; CHB+JPT are plotted in blue; YRI individuals are plotted in black; individuals in the GWA-UK and GWA-US are plotted in grey. GWA-US controls are plotted in (a), cases in (b); GWA-UK controls are plotted in (c) and cases in (d).
(a) (b)
Supplementary Figure 2: Quantile-quantile (Q-Q) plot of Cochran-Armitage test for trend for the two GWA studies. Results from UK-GWA are plotted in (a) and those from the US-GWA in (b). The black line represents the null hypothesis of no true association.
Kb
30
15
cM/M
b
rs6010620rs2297440
KCNQ2
PRPF6
ARFRP1
(b) 5p15.33
0
2
4
6
8
10
12
14
16
18
1000 1100 1200 1300 1400 1500 1600 1700
0
2
4
6
8
10
12
14
16
18
130300 130400 130500 130600 130700 130800 130900 131000
0
2
4
6
8
10
12
14
16
18
21800 21850 21900 21950 22000 22050 22100 22150 22200
cM/M
b
Kb
CR600271
rs10464870
rs6470745
rs4295627
rs16904140
CCDC26 MLZE
-log 1
0P
NKD2 SLC12A7
SLC6A19 SLC6A18 TERT CLPTM1L SLC6A3 LPCAT4
rs2736100
rs2853676
Kb
cM/M
b
Kb
-log 1
0P
90
45
KbKb-lo
g 10P
RTEL1
TNFRSF6B
EEF1A2 PTK6 TPD52L2 UCKL1ZGPATSTMN3
LIME1
(a) 8q24.21
CDKN2A CDKN2B
30
15
cM/M
b
rs1063192rs2157719 rs1412829
rs4977756
(c) 9p21.3 (d) 20q13.33
-log 1
0P
rs891835
30
60
0
2
4
6
8
10
12
14
16
18
61500 61600 61700 61800 61900 62000 62100
cM/M
b
Kb
(e) 11q23.3
50
KbcM
/Mb
25
0
2
4
6
8
10
12
14
16
18
70400 70420 70440 70460 70480 70500 70520 70540 70560 70580 70600
20
-log 1
0P
rs7124728
10
(f) 11q13.4
0
2
4
6
8
10
12
14
16
18
117500 117600 117700 117800 117900 118000 118100 118200 118300 118400 118500
UBE4A
rs498872
ATP5L ARCN1 TREH BCL9L
MLL PHLDB1 CXCR5DDX6
-log 1
0P
Supplementary Figure 3: Regional plots of the 6 confirmed associations (a-f). For each of the regions: (a) 8q24.21; (b) 5p15.33; (c) 9p21.3; (d) 20q13.33; (e) 11q23.3; and (f) 11q13.4 the upper panel shows single marker association statistics (as –log10 values) from the GWA-studies in blue and for the combined analysis including the replication series in red as a function of genomic position (NCBI build 36.1). Also shown are the relative positions of genes mapping to each region of association. Exons of genes have been redrawn to show the relative positions in the gene, therefore maps are not to physical scale. In the lower panel are the estimated statistics of the square of the correlation coefficient (r2), derived from HapMap. The values indicate the LD relationship between each pair of SNPs; the darker the shading, the greater extent of LD.
0
1
2
3
4
5
6
7
8
9
130000 130200 130400 130600 130800 131000 131200
Location (kb)
-log 1
0P
rs1077236
rs6470745
rs4295627
Supplementary Figure 4: Imputation of 8q24.21 SNPs. In total, 894 HapMap SNPs were successfully imputed in the interval between 130,051,729 and 131,225,253bps using available SNP genotype data from GWA scans (225 SNPs). Imputed data integrity was verified where possible, by crosschecking the concordance of imputed genotypes with that of available Illumina SNP genotype data. Directly genotyped SNPs are denoted in blue, those imputed are shown in green.
Supplementary Figure 5: 5p15.33 (TERT), 8q24.21 (CCDC26), 9p21 (CDKN2A, CDKN2B), 11q23.3 (PHLDB1) and 20q13.33 (RTEL1, ARFRP1, STMN3, SLC2A4RG, ZGPAT) copy number variation (CNV) and mRNA expression in glioma. Plots show mRNA expression stratified by CNV status based on analysis of 188 glioblastoma multiforme (GBM) tumors. Expression data was derived from the MSKCC Cancer Genomics Data Portal (http://cbio.mskcc.org/cancergenomics-dataportal/). MSKCC expression data was generated by combining GBM expression data from the The Cancer Genome Atlas. The bioconductor module CGHcall was used to determine CNV status from Agilent human genome CGH microarray data taken from TCGA data portal. It should be noted that rates of amplification and deletion for a set of randomly selected probes are 9.9% and 13.4% respectively in the tumors. Differences in the distribution of expression by SNP genotype were compared using a Wilcoxon-type test.
Le
vel
of
gen
e ex
pre
ssio
n
rs2736100 genotype
TERT
rs2297440 genotype
Le
vel
of
gen
e ex
pre
ssio
n
Le
vel
of
gen
e ex
pre
ssi
on
Lev
el o
f g
ene
exp
ress
ion
Lev
el o
f g
ene
exp
ress
ion
Lev
el o
f g
ene
exp
ress
ion
Lev
el o
f g
ene
exp
ress
ion
rs6010620 genotype
rs6010620 genotype
rs6010620 genotype rs2297440 genotype
rs2297440 genotype
STMN3
SLC2A4RGSLC2A4RG
ARFRP2 ARFRP2
STMN3
P = 0.53
P = 0.25
P = 0.25
P = 0.90 P = 0.90
P = 0.60 P = 0.60
26 46 18
49 356 6
35 49
635
494935
6
635
4949 35
6
(a)
Le
vel
of
gen
e ex
pre
ssi
on
Lev
el o
f g
ene
exp
ress
ion
Lev
el o
f g
ene
exp
res
sio
n
Lev
el o
f g
ene
exp
res
sio
n
Lev
el o
f g
ene
exp
ress
ion
Lev
el o
f g
ene
exp
ress
ion
Lev
el o
f g
ene
exp
ress
ion
Lev
el o
f g
ene
exp
ress
ion
rs2157719 genotype rs1412829 genotype
rs4977756 genotype rs1063192 genotype rs2157719 genotype
rs1412829 genotype rs4977756 genotype
rs1063192 genotype
CDKN2A CDKN2ACDKN2A
CDKN2A CDKN2B CDKN2A
CDKN2ACDKN2A
P = 0.91
P = 0.74
P = 0.53
P = 0.85
P = 0.39
P = 0.46
P = 0.90
P = 0.59
12 45 33 35 45 10 1042 38
39 438
12 45 33 35 45 10
1042 38 39 43
8
56 293
29
6 5 4734
3355
5811 2
P = 0.89
P = 0.72 P = 0.28
P = 0.26P = 0.79
Le
vel
of
gen
e ex
pre
ssio
n
Lev
el o
f g
ene
exp
res
sio
n
Lev
el o
f g
ene
exp
res
sio
n
Lev
el o
f g
ene
exp
ress
ion
Lev
el o
f g
ene
exp
ress
ion
PHLDB1
CDKN2ACDKN2B
CDKN2B CDKN2A
rs17748 genotype
rs10965224 genotype rs10965224 genotype
rs564398 genotype rs564398 genotype
(b)
107 78
822 14
6
107 78 8107
758
107 78 8
P = 0.25P = 0.98
P = 0.008
P = 0.10
P = 0.32
Lev
el o
f g
ene
exp
res
sio
n
Lev
el o
f g
ene
exp
res
sio
n
Le
vel
of
gen
e ex
pre
ssi
on
Lev
el o
f g
ene
exp
ress
ion
Lev
el o
f g
ene
exp
ress
ion
ARFRP1
STMN3 SLC2A4RG
TNFRSF6B – 1st probe TNFRSF6B – 2nd probe
rs2297441 genotype
rs2297441 genotype rs2297441 genotype
rs2297441 genotype rs2297441 genotype
711
18
TNFRSF6B – 2nd probe
Lev
el o
f g
ene
exp
ress
ion
rs2738758 genotype
14 57 95
14 57 95 1454 95
P = 0.12
P = 0.362
P = 0.06
P = 0.308
Le
vel
of
gen
e ex
pre
ssi
on
Lev
el o
f g
ene
exp
ress
ion
Lev
el o
f g
ene
exp
ress
ion
ARFRP1 STMN3 SLC2A4RG
rs2738758 genotype
1457 95
P = 0.40
TNFRSF6B – 1st probe
rs2738758 genotype
rs2738758 genotype rs2738758 genotype
Supplementary Figure 6: Relationship between (a) lymphocyte and (b) normal cortical mRNA expression levels of PHLDB1 and rs17748 genotype, CDKN2A/CDKN2B and rs10965224/rs564398/rs1063192/rs2157719/rs1412829/rs49777656 genotype, ARFRP1/STMN3/SLC2A4RG/TNFRSF6B and rs2297441/rs2738758/rs2297440/rs6010620 genotype and TERT and rs2736100 genotype. Expression of PHLDB1, CDKN2A, CDKN2B, ARFRP1, STMN3, SLC2A4RG and TNFRSF6B in lymphocytes based on expression data from analysis of 90 Epstein-Barr virus–transformed lymphoblastoid cell lines using Sentrix Human-6 Expression BeadChip (Illumina, San Diego, USA)24,25. Expression of PHLDB1, CDKN2A, CDKN2B, ARFRP1, STMN3, SLC2A4RG and TNFRSF6B in normal human cortex based on expression data from analysis of 193 normal human brain samples using Affymetrix GeneChip Human Mapping 500K Arrays and Illumina HumanRefseq-8 Expression BeadChip platforms26. Expression and/or genotype information was not available for every sample and thus the total number of samples may be <193. Where SNPs genotyped in our study were not typed HAPMAP data was used to identify correlated SNPs in high LD (r2>0.8). The number of samples with each genotype is given for each plot. Combining the lowest frequency homozygote with the heterozygote and re-performing the statistical test produced no further significant results. Differences in the distribution of expression by SNP genotype were compared using a Wilcoxon-type test for trend.
Lev
el o
f g
ene
exp
ress
ion