conrad et al 2007 gene duplication
8/13/2019 Conrad Et Al 2007 Gene Duplication
Gene Duplication: A Drive for Phenotypic Diversity and Cause of Human Disease
Bernard Conrad1,2 and Stylianos E. Antonarakis1
1Department of Genetic Medicine & Development, University of Geneva Medical School and Geneva University Hospitals, CH-1211 Geneva 4, Switzerland
2Division of Human Genetics, Bern University Children's Hospital, CH-3010 Bern, Switzerland; email: [email protected]
Annu. Rev. Genomics Hum. Genet. 2007. 8:17-35
First published online as a Review in Advance on March 26, 2007.
The Annual Review of Genomics and Human Genetics is online at genom.annualreviews.org
This article's doi: 10.1146/annurev.genom.8.021307.110233
Copyright © 2007 by Annual Reviews. All rights reserved
1527-8204/07/0922-0017$20.00
Key Words
gene duplication, copy number variant, haploinsufficiency, gene
balance hypothesis, insufficient amount hypothesis
Abstract
Gene duplication is one of the key factors driving genetic innovation, i.e., producing novel genetic variants. Although the contribution of whole-genome and segmental duplications to phenotypic diversity across species is widely appreciated, the phenotypic spectrum and potential pathogenicity of small-scale duplications in individual genomes are less well explored. This review discusses the nature of small-scale duplications and the phenotypes produced by such duplications. Phenotypic variation and disease phenotypes induced by duplications are more diverse and widespread than previously anticipated, and duplications are a major class of disease-related genomic variation. Pathogenic duplications particularly involve dosage-sensitive genes with both similar and dissimilar over- and underexpression phenotypes, and genes encoding proteins with a propensity to aggregate. Phenotypes related to human-specific copy number variation in genes regulating environmental response and immunity are increasingly recognized. Small genomic duplications containing defense-related genes also contribute to complex common phenotypes.
INTRODUCTION
Ever since Susumu Ohno's insightful suggestion 35 years ago that gene duplication is a key factor shaping evolution, the model and its general predictions continue to attract much attention (70). On an evolutionary scale, gene duplication may result in new functions via different scenarios (Figure 1). Although the most likely outcome is loss of function in one of the two gene copies (Figure 1a, nonfunctionalization), in rare instances one gene copy may retain the original function while the other acquires a novel, evolutionarily advantageous (adaptive) function (Figure 1b, neofunctionalization) (30). Alternatively, after duplication, mutations may occur in both genes that specialize to perform complementary functions (Figure 1b, subfunctionalization) (56, 57).

[Figure 1: schematic with panels (a) gene loss = nonfunctionalization; (b) functional divergence: neofunctionalization, subfunctionalization, and duplication by retrotransposition (male germline function versus somatic function); (c) no functional divergence = genetic robustness; (d, e) duplication of gene families: (d) concerted evolution (gene conversion) and (e) birth-and-death evolution (silent nucleotide substitutions versus inactivating mutations).]

Figure 1
Evolutionary fate of single gene duplications (a-c), and duplication of multigene families (d-e). Single gene duplication most often results in a nonfunctional duplicate gene copy (a, nonfunctionalization). (b) In rare instances, the functional duplicate gene copy and the ancestral gene diverge in function; neofunctionalization means that one of the two genes retains the original function, while the other evolves a new, often beneficial function. Subfunctionalization implies that both the original and the duplicate genes mutate and evolve to fulfill complementary functions already present in the original gene. Duplication via retrotransposition represents a particular case of sub- or neofunctionalization. Multigene families evolve in a coordinated fashion, such that the DNA coding sequences and function of the single members of a family remain close to that of the ancestral gene (d-e). (d) Concerted evolution: After multiple rounds of duplication, gene conversion homogenizes the DNA sequences of the individual members. (e) Birth-and-death evolution invokes a process of equilibrium between inactivating mutations and ongoing duplication of functional gene copies.

The question of how duplicate genes are retained in a population remains controversial.
Classical duplication-degeneration-complementation/subfunctionalization models do not invoke positive selection, but stipulate a higher retention rate of duplicate genes in small rather than larger populations. Considerably more retentions and fewer losses of duplicate genes in rodents as compared with humans indicate that positive selection may play a more important role than originally anticipated (91). If two redundant gene copies were retained in the genome without significant functional divergence, the organism may acquire increased genetic robustness against harmful mutations (Figure 1c). In multigene families descended from a common ancestor, individual genes in the group exert similar functions and have similar DNA sequences (67, 68). One concept, concerted evolution, applies particularly to localized and typically tandem copies of a gene. The concept posits that all genes in a given group evolve coordinately, and that homogenization is the result of gene conversion (Figure 1d). For most multigene families, the currently favored model is birth-and-death evolution, according to which similarity in protein sequence among the members of a family is assured by strong purifying selection, such that individual genes evolve essentially via silent synonymous nucleotide substitutions (67, 68) (Figure 1e). Inactivating mutations in a single member of a multigene family do not necessarily imply an evolutionary dead end, as illustrated by the "more is less" hypothesis (72). For instance, the human CASP12 pseudogene located in a cluster of functional Caspase genes shows that a protein-truncating mutation can be positively selected, probably because the variant allele confers resistance to severe sepsis (105, 108). A recently recognized primate-specific subgroup of duplications generated by retrotransposition was recruited to enhance male germline function (Figure 1b) (103). This mechanism thus represents a variation of neo- or subfunctionalization. Irrespective of these alternatives, gene duplication is, along with alternative splicing, a major mechanism of gene diversification (48, 99). For instance, along the lineage leading to humans, 689 genes were gained and 86 genes lost since the split from chimpanzees, contributing to a 6% difference in the complement of genes between humans and chimpanzees (1418 of 22,000 genes), largely outnumbering the 1.5% nucleotide difference between orthologous sequences of the two species (23).

Whole-genome duplication and 2R hypothesis: increased complexity and genome size of vertebrates resulted from two rounds (2R) of whole-genome duplication during early vertebrate evolution
Small-scale duplications: a recent gene family expansion by segmental or tandem duplication 50-150 Mya that may have followed on one single precedent round (1R) of WGD
Gene balance hypothesis: posits that an imbalance in the concentration of individual components of a multiprotein complex is deleterious
Whole-Genome versus Small-Scale Duplications
Recent analysis of eukaryotic genomes shows that gene duplication is widespread (50, 56). A series of arguments suggest that current vertebrate genomes have been shaped by two rounds of whole-genome duplication (22, 100). An alternative hypothesis favors one single round of whole-genome duplication plus continuous small-scale duplications (35). The fate of small-scale versus whole-genome duplications was addressed using Arabidopsis, a model system that underwent several rounds of complete genome duplication. This analysis not only revealed the dominant evolutionary impact of whole-genome duplication, but also the important role of small-scale duplications for gene categories related to metabolism, stress responses, and cell death (58). The functional gene categories that have been selectively maintained by the two mechanisms differ somewhat (20). For instance, a group of genes also referred to as connected genes, encoding, among others, transcriptional regulators with limiting downstream partners that participate in protein-protein interactions, and interacting proteins that are part of signal transduction pathways (31), are consistently underrepresented among small-scale duplications, and overrepresented after whole-genome duplication (76). Duplication of such dosage-sensitive genes, required at stoichiometrically precise levels, may be tolerated in whole-genome, but not in small-scale, duplications (20). These findings comply with the gene balance hypothesis (9, 10, 31, 76, 102), which posits that connected genes are particularly dosage sensitive, and are expected to be overretained relative to nondosage-sensitive genes after whole-genome and larger-scale duplications, but not after local duplications.

Recent segmental duplications: sequences that are >1 kb in length and that show >90% sequence identity
Breakpoint in conserved synteny: syntenic segments or blocks 100 kb with breakpoints identified as change in orientation or chromosome location based on unique regions of the genome
Copy number variant (CNV) or copy number polymorphism (CNP): a DNA segment ≥1 kb present at variable copy number compared with a reference genome
Copy number variable region (CNVR): CNVs identified in individuals of the HapMap collection, called in more than one individual, and replicated by a second independent platform
Recent Segmental Duplications
Recent segmental duplications comprise about 5% of the euchromatic portion of the human genome (26). Although one current definition includes sequences that are >1 kb in length and that show >90% sequence identity (6, 16), other authors use the term variably, giving rise to different numbers of duplicated segments in the genome. Segmental duplications are particularly enriched at pericentromeric and subtelomeric regions; subtelomeric duplications account for 40% (51), and pericentromeric duplications for 33% of the total (90). A considerable fraction (25%) of the nonpericentromeric and nonsubtelomeric duplications are associated with syntenic breaks (4, 5), and 98% of primate-specific breakpoints contain segmental duplications (66). This was recently confirmed for the duplication-rich human chromosomes 15 (113) and 17 (114) (HSA15, HSA17). On these chromosomes, the human-specific breakpoints of conserved synteny (breakpoint identified as change in orientation or chromosome location based on unique regions of the genome, and for syntenic segments 100 kb) (5) occur mostly in regions containing duplications; 13 out of 15 breakpoints on HSA15 contain duplications, and 74% of duplicated bases on HSA17 reside in breakpoints. At least 50% of duplications are strictly intrachromosomal [50% for HSA15 (110) and 62% for HSA17 (111)]. Between 3.5% and 11% of all duplications contain complete genes (110), and nearly one third of duplicated genes are arrayed in tandem (92). An unexpected feature of areas with tandem duplications is their asynchronous replication, suggesting that duplicate structures alter the epigenetic state of a given locus (33). Ultraconserved elements (UCEs) are rare in segmental duplications and copy number variants (CNVs), except for the UCE-subclass that overlaps exons (24). The mechanistic basis for this exquisite intolerance to copy number changes of UCEs awaits further characterization.

At least 30% of the pericentromeric duplications were duplicatively transposed from euchromatic regions, generating a minimum of 28 new transcripts that are expressed primarily in the testis (90). The overall proportion of duplications is highest close to centromeres, and gradually diminishes within a distance of 5 Mb from centromeres. Conversely, gene density increases with distance from centromeric α-satellite repeats. The subtelomeres contain 25 small gene families organized in tandemly repeated blocks sharing extensive sequence similarity, and 18 of these have at least one functional member (51). Gene products within these blocks exert highly varied functions, but are predominantly odorant and cytokine receptors, tubulins, and transcription factors.

Several recent studies revealed the
widespread presence of large-scale CNVs in the normal human population (85, 88). Deletions and insertions were equally represented, often occurring around regions of chromosomal instability (88). In one study, 70 different genes were shown to vary in copy number within CNVs (85). Similarly, extensive CNVs in subtelomeric blocks were documented (51). Approximately 1450 copy number variable regions (CNVRs) encompassing 360 Mb, or 12% of the genome, were mapped through the study of 270 individuals from the HapMap collection (80), and 5150 CNVs genome-wide are currently recorded in the available databases [http://paralogy.gs.washington.edu/structuralvariation/ (85), http://projects.tcag.ca/variation/ (43), http://www.som.soton.ac.uk/research/geneticsdiv/anomaly%20register/]. Within human CNVs, a significant overrepresentation of genes associated with environmentally regulated functions and immunity was found, suggesting an adaptive advantage of dosage imbalance in these regions (69). Mechanistically, duplications that share extensive sequence similarity, called low copy repeats (LCRs), occur frequently in regions with large duplicated stretches. One common definition of LCRs is sequences that are ≥10 kb in length, show ≥95% sequence identity, and are separated by 50 kb-10 Mb of intervening sequence (97). Such LCRs serve as substrates for nonallelic homologous recombination (NAHR) that can result in duplications, reciprocal deletions, inversions, and reciprocal translocations (83, 89). The primate-specific burst of Alu-repeats 35-40 million years ago (Mya) could have been one critical event initiating segmental gene duplications (7). Consistent with this, the generation and structure of LCRs appear to be associated with Alu-elements (89), and 492 human-specific deletions can be attributed to this process (87). A detailed analysis of the proximal short arm of HSA17 indicated that NAHR between LCRs is a major mechanism of recurrent rearrangements, whereas nonhomologous end-joining (NHEJ) can, in many instances, be responsible for nonrecurrent rearrangements (54). The prevailing molecular mechanism also depends on the chromosomal position, because subtelomeric duplications were generated almost exclusively via NHEJ of double-strand breaks (51).

Functional, duplicated genes comprise an important, rapidly evolving euchromatic fraction of our genome that displays extensive polymorphism; it is therefore important to examine their contribution to phenotypic variation and disease.
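
The size and identity definitions quoted in this section can be written as simple predicates. The sketch below is illustrative only: the `DuplicatePair` record and both function names are hypothetical, and the low copy repeat thresholds are interpreted as inclusive minimums (10 kb, 95% identity), which is an assumption about the definition rather than something the text spells out.

```python
from dataclasses import dataclass

KB = 1_000
MB = 1_000_000


@dataclass(frozen=True)
class DuplicatePair:
    """Hypothetical record describing two copies of a duplicated segment."""
    length_bp: int        # aligned length of the duplicated segment
    identity_pct: float   # percent sequence identity between the two copies
    separation_bp: int    # intervening sequence between the two copies


def is_recent_segmental_duplication(pair: DuplicatePair) -> bool:
    # Definition quoted in the text: >1 kb in length and >90% sequence identity.
    return pair.length_bp > 1 * KB and pair.identity_pct > 90.0


def is_low_copy_repeat(pair: DuplicatePair) -> bool:
    # Assumption: 10 kb and 95% are inclusive lower bounds; the copies must be
    # separated by 50 kb to 10 Mb of intervening sequence.
    return (
        pair.length_bp >= 10 * KB
        and pair.identity_pct >= 95.0
        and 50 * KB <= pair.separation_bp <= 10 * MB
    )


# Made-up example values, not data from the article.
example = DuplicatePair(length_bp=24_000, identity_pct=98.7, separation_bp=1_400_000)
print(is_recent_segmental_duplication(example), is_low_copy_repeat(example))
```

Because other authors apply the term "segmental duplication" variably, as noted above, the thresholds are best treated as parameters of a particular study and not as fixed constants.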
Evolutionary Forces and Duplicated Genes
The evolutionary forces acting on duplicated genes are diverse. A number of interdependent variables determine whether a gene will be retained after duplication; these include its functional category (46, 61, 76), degree of conservation (18, 19, 45), sensitivity to dosage effects (46), as well as its regulatory and architectural complexity (39). In general, genes encoding proteins that interact with the environment tend to be more frequently retained after duplication than those that interact with intracellular compartments (50, 61). Genes with permanent duplicates in the Caenorhabditis elegans and Saccharomyces cerevisiae genomes are more constrained before duplication than genes that never duplicated (19). However, human CNVs are consistently enriched in genes with increased rates of synonymous and nonsynonymous codon substitutions (69). Haploinsufficient genes, i.e., genes experiencing a loss of fitness when present in a single copy in diploid species, have more paralogs than haplosufficient genes, supporting the concept that gene dosage may be critical in fixing duplications (gene balance hypothesis) (46). In general, duplicated genes encode longer proteins with more domains and have more cis-regulatory elements than singleton genes (39). These observations indicate that natural selection created a preferential association of duplications with certain gene categories.

Low copy repeats (LCRs): sequences that are ≥10 kb in length, that show ≥95% sequence identity, and are separated by 50 kb-10 Mb of intervening sequence
Haploinsufficient genes: subset of genes experiencing a loss of fitness when present in a single copy in diploid species

Experiments performed on multiple
species, from unicellular organisms to mammals, indicate that genes rapidly diverge after duplication (Figure 2) (17, 47). Although divergence of duplicated genes was traditionally assessed by comparing the rate of amino acid changes, more recently the focus was on divergence of gene expression, of transcriptional networks, and of protein-protein interactions. It has been shown that the protein-coding sequence and cis-regulatory regions of duplicated genes evolve independently (Figure 2a) (104). Shortly after duplication, expression and regulatory divergence exceed changes in the protein sequence (36). In humans and yeast, protein sequence evolution and gene expression diversity significantly correlate and increase with evolutionary time (37, 60). In addition, the mode of duplication and the functions of the genes involved also play important roles in expression divergence (13). Divergence of cis-regulatory motifs in the promoter-proximal region (Figure 2c) is probably not the only substrate for expression divergence, suggesting that trans-acting factors could also be important (111). In humans, genes diverge asymmetrically after duplication; they rapidly lose their original coexpressed partners and acquire new ones (Figure 2c) (17). Expression levels of duplicate genes diverge significantly during development within and between species, as compared to single-copy genes (37). Paralogous genes in humans and mice tend to become more specialized in their expression patterns (42).

[Figure 2: schematic with panels (a) cis-regulatory region and protein-coding region, with cis-regulatory divergence plotted against evolutionary time t; (b1) protein-sequence divergence; (b2) protein-network divergence; (c) regulatory divergence.]

Figure 2
Functional divergence of duplicated genes. (a) The cis-regulatory and the protein-coding regions evolve independently after duplication. Divergence increases with evolutionary distance. (b) Schematic representation of (b1) protein sequence divergence after duplication (green sequence diverges into blue or orange); (b2) protein network divergence, where the protein interaction domains of the original green sequence evolve by maintenance, gain, or loss of interacting partners. (c) Schematic representation of DNA sequence regulatory divergence after duplication (certain regulatory motifs are lost in one copy of the duplicated gene sequence).

The protein interaction partners (Figure 2b) change at a slower rate than the transcription factors shared by duplicated genes (62). The divergence of protein-protein interactions after duplication depends on the connectivity [i.e., number of binding partners of a protein (8)] of the ancestral gene (112). Proteins with a higher ancient connectivity tend to display an asymmetrical evolution of the duplicates, whereas duplicates with a lower connectivity tend to gain and lose interacting partners at about the same rate. In conclusion, gene expression divergence is a key substrate for functional divergence of duplicate genes, and gene regulatory networks evolve at a higher rate after duplication than protein-protein interactions.
The Phenotypic Spectrum of Duplicated Genes
Gene dosage effects. The phenotypic consequence of duplicating large regions or even whole chromosomes is at least in part determined by the extent of regulatory imbalances. It is a priori expected that duplicate genes (three copies per diploid genome) will exhibit a 1.5-fold increase in mRNA expression. Consistent with the prediction, 50% of trisomic genes are overexpressed at the expected 1.5-fold level or higher; this is true for genes overexpressed as a consequence of an additional whole chromosome copy in human trisomy 21 and in the respective trisomic mouse models (81). Similarly, a recent comparison of chimpanzee and human segmental duplications revealed that among the human-specific duplicates (causing trisomies), 56% show a significant difference in gene expression between the two species, mostly (83%) overexpression in human as compared to chimpanzee (15).

For genes that fail to show such a dose-dependent increase, compensatory mechanisms are usually involved (i.e., dosage compensation). For instance, this can happen
when the regulator controlling transcription and the target gene reside together on an aneuploid segment, canceling dosage imbalances (9). Conversely, duplicated regions containing positive or negative regulators can produce negative effects on target genes located outside of the aneuploid area (9, 10). The main conclusion from this is that the most significant alterations in gene expression caused by aneuploidy will be observed in target genes that lie outside the aneuploid region.

In aneuploidy, it is generally assumed that only a restricted set of dosage-sensitive genes is responsible for the phenotype. In a systematic analysis of overexpression phenotypes in yeast, 15% of overexpressed genes reduced cell growth, and among those, cell cycle-regulated genes, signaling molecules, and transcription factors were enriched (96). Interestingly, the overexpression phenotypes differed from the deletion mutant phenotypes, indicating that the underlying mechanisms are specific for under- and overexpression, respectively, rather than resulting from a common disruption of protein complex stoichiometry. These results contradict the gene balance hypothesis and are consistent with the insufficient amounts hypothesis (25), in which haploinsufficient genes are needed at abnormally high levels, and therefore are more sensitive to a reduction in dose (see sidebar on Under- and Overexpression Phenotypes: The Gene Balance Hypothesis versus the Insufficient Amount Hypothesis for more information).

UNDER- AND OVEREXPRESSION PHENOTYPES: THE GENE BALANCE HYPOTHESIS VERSUS THE INSUFFICIENT AMOUNT HYPOTHESIS
Another extrapolation of the gene balance hypothesis is that under- and overexpression phenotypes are identical, or at least similar, as they both act via the disruption of an identical multimeric, regulatory protein complex (disrupted stoichiometry of protein subunits encoded by connected genes).
The primary mechanism of haploinsufficiency in yeast is an insufficient protein production, which led to the formulation of the insufficient amount hypothesis, contradicting the predictions of the balance hypothesis. In further support of the insufficient amount hypothesis is the fact that under- and overexpression phenotypes of bona fide connected genes in yeast differ, indicating specific regulatory imbalances, rather than disrupted stoichiometry (see References 25 and 96).
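
The a priori expectation discussed under gene dosage effects (three copies per diploid genome giving 1.5-fold expression) follows from assuming that mRNA level scales linearly with copy number. The sketch below makes that assumption explicit; the function names, the 10% tolerance, and the two labels are hypothetical choices for illustration and are not taken from the article.

```python
def expected_fold_change(copies: int, baseline_copies: int = 2) -> float:
    """Naive expectation: expression scales linearly with gene copy number."""
    if copies < 0 or baseline_copies <= 0:
        raise ValueError("copy numbers must be non-negative, baseline positive")
    return copies / baseline_copies


def classify_expression(observed_fold: float, copies: int,
                        tolerance: float = 0.1) -> str:
    """Label a gene by comparing observed and expected fold change.

    The tolerance (fraction of the expected value) is an arbitrary
    illustrative cutoff, not a threshold used in the cited studies.
    """
    expected = expected_fold_change(copies)
    if observed_fold >= expected * (1 - tolerance):
        return "dose-dependent"   # at or above the expected level
    return "compensated"          # below expectation, e.g. dosage compensation


print(expected_fold_change(3))       # trisomy or heterozygous duplication
print(expected_fold_change(1))       # monosomy or heterozygous deletion
print(classify_expression(1.6, 3))
print(classify_expression(1.0, 3))
```

A gene whose regulator sits on the same aneuploid segment could fall into the "compensated" class in this scheme even though its copy number has changed.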
Relationship of gene dosage and fitness (phenotype). The relationship between gene dosage and fitness (phenotype) is complex. Three alternatives have been proposed, each one characteristic of certain gene categories (Figure 3) (46). The first describes a linear relationship that is often found for structural and regulatory proteins (Figure 3a); the second fits a diminishing returns principle typical of enzymes that function at limiting concentrations (Figure 3b). Consistent with this, disorders caused by genes encoding enzymes are primarily recessive, indicating that these genes are not dosage sensitive (haplosufficient) (44). The third alternative corresponds to a diminished fitness for both increased and decreased gene dosage, indicating either multisubunit complexes with a single component that has a tight stoichiometry (46) (gene balance hypothesis), or specific regulatory imbalances as a consequence of under- (insufficient amount hypothesis) and overexpression (25) (Figure 3c). Transcriptional regulators can cause phenotypes both when under- and overexpressed (84), a mechanism responsible for many developmental/malformation syndromes (Figure 3c) (Table 1). A fourth alternative concerns proteins with a propensity to aggregate (Figure 3d). These proteins display a dual behavior. The wild-type protein aggregates in a dose-dependent manner once the diploid threshold dose is exceeded (e.g., gene duplication). Alternatively, aggregation occurs in the diploid state in the presence of a dominant mutation (Figure 3d) (49). Genes encoding such proteins include α-synuclein (SNCA), responsible for one rare familial form of early-onset Parkinson disease, and the amyloid precursor protein (APP) that causes one form of dominant early-onset Alzheimer disease (38, 93).

[Figure 3: four panels, each plotting fitness (phenotype) against protein concentration (genotype) at doses 0 (aa), 0.5 (aA), 1 (AA), and 2 (AA,AA). (a) Linear function; examples: structural and regulatory proteins. (b) Diminishing returns function; examples: enzymes (see Table 1). (c) Stoichiometric titration; examples: dosage-dependent transcription factors (see Table 1). (d) Aggregation; examples: proteins with propensity to aggregate (Parkinson/Alzheimer disease) (see Table 1).]

Figure 3
Schematic representations of gene dosage and fitness (phenotype). Four different relationships are shown. (a) A linear relationship is typically found for structural and regulatory proteins; (b) a diminishing returns function classically involves enzymes that show little variation in function over large dose ranges. (c) Certain functional gene classes enriched for transcriptional regulators and signaling molecules cause phenotypes both when under- and overexpressed (haploinsufficiency and pathogenic gene duplication). (d) Disease phenotypes caused by protein aggregation. The wild-type protein aggregates at doses beyond the threshold level of two gene copies. Heterozygous mutations can also cause protein aggregation at one gene copy. Uppercase A depicts the normal allele; lowercase a represents a mutant loss of function allele.

Haploinsufficiency: a dominant phenotype in a diploid organism that is heterozygous for a loss of function allele
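
The four dosage-fitness relationships described for Figure 3 can be illustrated with toy functions evaluated at the doses on the figure's x-axis (0, 0.5, 1, and 2, where 1 is the normal diploid dose). The functional forms and parameters below are illustrative assumptions chosen only to reproduce the qualitative shapes; they are not taken from the article or from the models it cites. In particular, the aggregation curve models only the excess-dose side and assigns full fitness at or below the threshold as a simplification.

```python
def linear(dose: float) -> float:
    """(a) Output proportional to dose (structural and regulatory proteins)."""
    return dose


def diminishing_returns(dose: float, k: float = 0.1) -> float:
    """(b) Saturating curve, normalized to 1 at the diploid dose (enzymes).

    With a small k, halving the dose costs little fitness, which mirrors
    haplosufficiency and recessive inheritance of enzyme disorders.
    """
    return dose * (1 + k) / (dose + k)


def stoichiometric_titration(dose: float) -> float:
    """(c) Fitness peaks at the diploid dose and falls on either side."""
    if dose <= 0:
        return 0.0
    return min(dose, 1.0 / dose)


def aggregation(dose: float, threshold: float = 1.0, slope: float = 0.5) -> float:
    """(d) Full fitness up to a threshold dose, declining beyond it."""
    if dose <= threshold:
        return 1.0
    return max(0.0, 1.0 - slope * (dose - threshold))


models = {
    "linear": linear,
    "diminishing returns": diminishing_returns,
    "stoichiometric titration": stoichiometric_titration,
    "aggregation": aggregation,
}

# Relative doses matching the genotypes aa, aA, AA, and (AA,AA) in Figure 3.
doses = [0.0, 0.5, 1.0, 2.0]
for name, model in models.items():
    values = ", ".join(f"{model(d):.2f}" for d in doses)
    print(f"{name:>25}: {values}")
```

Under these toy forms, only the titration and aggregation curves penalize a doubling of dose, which corresponds to the gene classes the text associates with pathogenic duplications.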
Synopsis of diseases and mechanisms associated with duplicated genes. In Table 1, we summarize the main features of phenotypes and diseases caused by duplicated genes.
Examples of pathogenic duplications involving dosage-sensitive genes. The general models discussed predict that among the genes overexpressed as a result of duplication, a minor fraction of dosage-sensitive genes enriched for transcription factors, signaling molecules, and cell cycle-regulated genes will be critical for the phenotypic features (96). Competing models stipulate that over- and underexpression phenotypes of such genes are either similar (gene balance hypothesis), or alternatively distinct (insufficient amounts hypothesis). In Charcot-Marie-Tooth (CMT) disease, the most frequent inherited disorder of the peripheral nervous system (55, 65), a 1.4-Mb tandem duplication is found in 70% of autosomal dominant CMT1A patients. Although the 1.4-Mb interval contains 30 genes, PMP22, encoding the major myelin protein, is responsible for the disorder through a gene dosage effect. In the heterozygous duplication state, PMP22 trisomy causes a nerve conduction deficit due to demyelination (CMT1A), whereas PMP22 monosomy (deletion) causes a nerve conduction block [hereditary neuropathy with liability to pressure palsies (HNPP)] in its haploinsufficient state. Because many central and peripheral demyelinating disorders, such as CMT1A, Pelizaeus-Merzbacher disease (PMD) (27), and autosomal dominant leukodystrophy (ADLD) (75), can be caused by gene duplication, it was suggested that myelin formation is particularly susceptible to gene dosage effects (75). The CMT/HNPP-causing region shows how two distinct phenotypes can result from over- and underexpression of a critical dosage-sensitive gene. Similarly, deletion and duplication of the Sotos syndrome region and NSD1 gene dosage effects cause contrasting phenotypes (14).
Over- and underexpression phenotypes can also be similar, as shown by the methyl-CpG-binding protein 2 (MECP2), a chromatin architectural protein linked to transcriptional repression (52). The progressive neurodegenerative disorder Rett syndrome, which in its classical form affects females almost exclusively, is due to MECP2 haploinsufficiency. Severe mental retardation and neurological symptoms with features of Rett syndrome in males can also be caused by MECP2 duplications (101). Likewise, abnormal neurodevelopmental phenotypes linked to both MECP2 under- and overexpression were described in human and transgenic mouse models (53). Similarly, X-linked hypopituitarism can be caused by inactivating heterozygous SOX3 mutations, or by duplication of a region containing the developmental transcription factor SOX3 (Table 1) (107).
Pathogenic duplications also occur in regions of genomic microdeletions, such as the velocardiofacial, the Williams-Beuren, the Alagille, and the Smith-Magenis syndrome regions (Table 1) (64, 79, 95, 109). These examples partially fit the model that duplication of one or a few dosage-sensitive genes causes overexpression phenotypes that resemble the underexpression phenotype (Figure 3c). On the other hand, the duplication at 1p36 matches an incremental/linear gene-phenotype model for closure of cranial sutures (Figure 3a,b) (32). Haploinsufficiency of genes in this region results in delayed closure of cranial sutures, whereas increased gene dosage via duplication, for instance, results in craniosynostosis. A heterogeneous group of duplications, spanning 0.5-1-Mb regions, causes phenotypes (a) for which the corresponding monosomy has either not yet been reported (split hand/split foot syndrome, SHFM3) (21); (b) for which duplication and haploinsufficiency of as-yet-unidentified genes have been implicated in a similar/identical phenotype
Table 1 Duplication phenotypes
| Species | Gene | Gene category | Genomic alteration | Disease | Phenotype | Mechanism | References |
|---|---|---|---|---|---|---|---|
| Similar under- and overexpression phenotypes | | | | | | | |
| Hs/Mm | MECP2 | Methylated DNA binding | Duplication MECP2 | Progressive neurodevelopmental syndrome in males | Mental retardation, epilepsy | Loss of gene function due to under- and overexpression | (52, 98) |
| Hs | SOX3 | Developmental regulator (TF) | Duplication SOX3 | X-linked hypopituitarism (XLHP) | X-linked hypopituitarism and infundibular hypoplasia | Loss of gene function due to under- and overexpression | (104) |
| Hs | ND (TBX1) | ND | Duplication 22q11.2 | Velocardiofacial syndrome (VCFS) | Variable: normal to developmental delay and malformations | Loss of gene function due to under- and overexpression | (106) |
| Hs | ND (ELN) | ND | Duplication 7q11.23 | Williams-Beuren syndrome (WBS) | Delayed expressive language | Loss of gene function due to under- and overexpression | (92) |
| Hs | ND (JAGGED1) | Developmental regulator (TF) | Duplication 20p11 | Alagille syndrome (AS) | Cardiovascular, ocular, bile duct, and skeletal anomalies | Loss of gene function due to under- and overexpression | (63) |
| Hs | RAI1 | Developmental regulator (TF) | Duplication 17p11.2 | Smith-Magenis syndrome (SMS) | Mild mental retardation and dental abnormalities | Loss of gene function due to under- and overexpression | (77) |
| Hs | PLP1 | Proteolipid protein | Duplication PLP1 | Pelizaeus-Merzbacher (PM) | Demyelination disorder CNS | Loss of gene function due to under- and overexpression | (26) |
| Dissimilar under- and overexpression phenotypes | | | | | | | |
| Hs | PMP22 | Myelin protein | Duplication PMP22 | Charcot-Marie-Tooth 1A (CMT1A) | Peripheral myelin neuropathy | Loss of gene function due to under- and overexpression | (54, 64) |
| Hs | NSD1 | Histone methyltransferase | Duplication NSD1 | Growth retardation syndrome | Growth retardation | Loss of gene function due to under- and overexpression | (63) |
| Hs | ND (MMP23) | ND | Duplication 1p36 | Premature closure of cranial sutures | Craniosynostosis | Incremental gene function | (31) |
| Complex, yet unresolved expression phenotypes | | | | | | | |
| Hs | LMB1 | Laminar nuclear envelope protein | Duplication LMB1 | Autosomal dominant leukodystrophy (ADLD) | Demyelination disorder CNS | ND | (73) |
| Hs | ND | ND | Duplication 10q24 | Split hand/split foot malformation 3 (SHFM3) | Split hand/split foot | ND | (21) |
| Hs | ND | ND | Duplication 2q13 | Orofacial clefting/cleft palate only | Mental retardation and orofacial clefting | ND | (72) |
| Hs | ND | ND | Duplication 16p13 | ATR-X-like | X-linked α-thalassemia/mental retardation | ND | (2) |
| Protein aggregation due to overexpression | | | | | | | |
| Hs | SNCA | Molecular chaperone | Duplication SNCA | Parkinson disease | Nigrostriatal neuron degeneration | Protein aggregation and underlying gene mutations | (91) |
| Hs | APP | Amyloid precursor protein | Duplication APP | Alzheimer disease | Parenchymal/vascular amyloid deposition | Protein aggregation and underlying gene mutations | (37, 79) |
| Phenotypes of CNVs related to environment and immunity | | | | | | | |
| Hs | CYP2D6 | P-450 isoenzyme | P-450 CNV | Altered drug metabolism | Adverse drug effects | Incremental/linear gene function model | (12, 40, 75) |
| Hs | CCL3L1 | Chemokine | CCL3L1 CNV | Altered HIV susceptibility | Enhanced HIV/AIDS susceptibility | Incremental/linear gene function model | (33) |
| Common complex phenotypes of CNVs in defense-related genes | | | | | | | |
| Hs/Rn | FCGR3B/Fcgr3-rs | Fc receptor for IgG | FCGR3 CNV | Glomerulonephritis/systemic lupus erythematosus | Susceptibility to glomerulonephritis | Pathogenic underexpression phenotype | (1) |
| Mm | TLR7 | Toll-like receptor | TLR7 CNV | Systemic lupus erythematosus-like disease | Autoantibody-elicited autoimmunity | Pathogenic overexpression phenotype | (76) |
| Hs | hBD2 | Antimicrobial peptides | hBD2 CNV | Crohn's disease of the colon | Inflammatory bowel disease | Pathogenic underexpression phenotype | (28) |
(mental retardation/orofacial clefting syndrome) (74); and (c) for which haploinsufficiency of a transcription factor (ATRX at Xq13.3) and duplication of a second region containing candidate ATRX target genes (region at 16p13.11-16p13.3 comprising the α-globin gene) result in a similar clinical outcome (X-linked α-thalassemia/mental retardation syndrome) (2). Although these examples are more complex than those previously discussed, the last two cases may be compatible with similar under- and overexpression phenotypes.
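The dosage models referred to in this section (Figure 3a-c) can be made concrete with a small numerical sketch. The Python snippet below is purely illustrative: it assumes that expression scales proportionally with copy number, and the function names and functional forms (a straight line for the incremental/linear model, absolute deviation from normal dosage for the model with similar under- and overexpression phenotypes) are hypothetical choices that are not taken from the review or from any data set.

```python
# Illustrative sketch only. The functional forms below are assumptions chosen
# for clarity; they are not taken from the review, Figure 3, or any data set.

NORMAL_COPIES = 2  # diploid autosomal gene


def relative_dosage(copies: int, normal_copies: int = NORMAL_COPIES) -> float:
    """Dosage relative to the normal state, assuming expression is
    proportional to copy number (an idealization)."""
    if copies < 0 or normal_copies <= 0:
        raise ValueError("copies must be >= 0 and normal_copies must be > 0")
    return copies / normal_copies


def linear_model(dosage: float) -> float:
    """Incremental/linear model: the trait value follows dosage directly."""
    return dosage


def deviation_model(dosage: float) -> float:
    """Similar under- and overexpression phenotypes: severity depends only on
    the distance from normal dosage (1.0), not on its direction."""
    return abs(dosage - 1.0)


if __name__ == "__main__":
    states = [("deletion (1 copy)", 1),
              ("normal (2 copies)", 2),
              ("duplication (3 copies)", 3)]
    for label, copies in states:
        d = relative_dosage(copies)
        print(f"{label:24s} dosage={d:.1f} "
              f"linear={linear_model(d):.1f} deviation={deviation_model(d):.1f}")
```

Under these assumptions, one copy (haploinsufficiency) and three copies (heterozygous duplication) sit at relative dosages of 0.5 and 1.5. The linear model places them on opposite sides of the normal state, as with delayed versus premature suture closure at 1p36, whereas the deviation model assigns them the same severity.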
Diseases caused by protein aggregation. Protein aggregation is particularly associated with disease genes (106). Rare forms of two neurodegenerative disorders, Parkinson and Alzheimer disease, provide instructive examples of pathogenic gene dosage effects mediated via protein aggregation (Figure 3d) (38, 93). Alpha-synuclein (SNCA) duplication and triplication lead to increased expression of α-synuclein, a small protein thought to play a role as a molecular chaperone in vesicular transport and/or turnover of synaptic vesicles. The pathology induced by this protein and the severity of disease both depend on the gene expression level. Although the mechanism is not yet fully understood, α-synuclein aggregation, controlled by mutations in at least three genes including SNCA itself, is thought to promote degeneration of nigrostriatal neurons (94).
The amyloid precursor protein (APP) can be cleaved into either of two smaller
peptides, Aβ40 and Aβ42 (38). Accumulation of the latter, which causes parenchymal and vascular deposition, is probably responsible for Alzheimer pathogenesis. The disease phenotype can be caused by a variety of mechanisms that lead to an increase in APP protein aggregates, namely duplications of APP on chromosome 21 (82), Down syndrome (trisomy 21) with three APP copies, or, alternatively, mutations in APP itself or in the genes encoding the proteases responsible for APP cleavage (38). Similar mechanisms may account for other neurological diseases that are based on protein deposition (38).
Modified responses to drugs. Genes encoding enzymes are not usually dosage sensitive, yet there are examples of such gene duplications that cause variable phenotypes, particularly those encoding protein complexes that regulate drug metabolism (12). The three cytochrome P-450 isoforms CYP2C9, CYP2C19, and CYP2D6, which all vary in copy number, are responsible for the biotransformation of about 40% of all drugs that are metabolized by P-450. Copy number variation among the 80 distinct alleles of the CYP2D6 gene partially defines the individual capacity to metabolize drugs. Four major phenotypic categories of drug oxidation have been recognized to date, namely poor, intermediate, extensive, and ultrarapid metabolizers. Because these phenotypic differences have clinical consequences, such as adverse drug reactions or therapeutic failure (77), it will be important to develop accurately predictive genotyping for these and other enzymes that appear to display clinically relevant CNVs (41). CNVs are currently under investigation for their pharmacogenetic relevance (73).
Altered responses to pathogens. Because genes encoding proteins in metabolic pathways in unicellular organisms have a higher duplicability rate than other gene categories (61), the same could hold more generally for genes responding to and/or regulated by environmental stimuli. Intriguingly, copy number variation of the gene encoding CCL3L1, a potent human immunodeficiency virus-1 (HIV-1)-suppressive chemokine, influences the susceptibility to HIV-1 infection (34). One could speculate that CNVs containing genes related to immune responses may be involved in the control of infections. Of relevance in this regard are CNVs of the novel defensin gene family, whose members function as chemotactic activators of the immune response (3, 86). As discussed below, defensin CNVs may cause inflammatory bowel disease by modulating the host response to pathogens (29). Gene families that constitute the core elements controlling infections, namely the T-cell receptor and immunoglobulin genes, evolved via species-specific sequential rounds of duplications (98). Copy number variation in these gene families may also play a role in complex phenotypes such as autoimmune diseases.
Role of copy number variants in complex diseases. An example of the possible role of CNVs in common complex phenotypes is the demonstration that low copy numbers of the FCGR3B gene encoding the activatory Fc receptor for IgG predispose patients with systemic lupus erythematosus to an inflammatory disease of the kidneys (glomerulonephritis) (1). Analogous findings in rats corroborate this observation. Additional examples include duplication of TLR7, an innate immune receptor, which predisposes mice carrying it to autoreactive B-cell responses (78), and low copy number of the human beta-defensin 2 (hBD2) gene, which predisposes to Crohn's disease of the colon (29). The importance of CNVs in genes related to immune defense for complex phenotypes and diseases should therefore be explored more generally.
Disease-causing gene conversion mediated by duplicated pseudogenes. Among the 1945 nonprocessed pseudogenes located near the ancestral gene (out of a total of 3426 nonprocessed pseudogenes), 11 cases of deleterious gene conversion induced by
the pseudogene were recognized (11). Among those is the retinitis pigmentosa 9 pseudogene (RP9P) carrying a mutation that produces a nonsynonymous substitution in the wild-type gene associated with the RP9 form of autosomal dominant retinitis pigmentosa (ADRP). Two other processed pseudogenes also contain mutations associated with diseases: an inosine monophosphate dehydrogenase 1 pseudogene (IMPDH1P1) that causes the RP10 form of ADRP, and a phosphoglycerate kinase 1 pseudogene (PGK1P1) associated with phosphoglycerate kinase deficiency. These observations highlight the pathogenic potential of duplicate gene copies that acquired inactivating mutations (pseudogenes) for the wild-type progenitor gene located nearby.
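The counts quoted above translate into simple proportions. The following Python sketch uses only the three numbers given in the text (3426 nonprocessed pseudogenes, 1945 of them near the ancestral gene, 11 recognized deleterious gene conversions); the variable names are illustrative, and reading the 11 cases as a fraction of the 1945 nearby pseudogenes is an interpretation of the sentence rather than a figure reported by the authors.

```python
# Counts as quoted in the text; variable names are illustrative.
TOTAL_NONPROCESSED = 3426      # all nonprocessed pseudogenes
NEAR_ANCESTRAL_GENE = 1945     # those located near the ancestral gene
DELETERIOUS_CONVERSIONS = 11   # recognized cases of deleterious gene conversion

# Fraction of nonprocessed pseudogenes that lie near their ancestral gene.
share_near = NEAR_ANCESTRAL_GENE / TOTAL_NONPROCESSED

# Fraction of those nearby pseudogenes with a recognized deleterious conversion
# (assumes the 11 cases are counted among the 1945 nearby pseudogenes).
share_converting = DELETERIOUS_CONVERSIONS / NEAR_ANCESTRAL_GENE

print(f"Near ancestral gene: {share_near:.1%}")                  # about 56.8%
print(f"With deleterious conversion: {share_converting:.2%}")    # about 0.57%
```

On this reading, a little over half of nonprocessed pseudogenes lie near their ancestral gene, while fewer than 1% of those have so far been implicated in deleterious gene conversion.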
Partial phenotype rescue by a duplicated gene. The second most frequent autosomal recessive disease in Europeans, spinal muscular atrophy (SMA), adds an important lesson about the phenotypic consequences of gene duplication. The disease, which is usually due to homozygous mutations in the survival motor neuron 1 gene (SMN1), can be partially rescued by increasing copy numbers of its nearly identical and partially functional duplicate SMN2 (28). This example emphasizes the partial functional redundancy of some duplicate genes (40), and supports the notion that deletion of a duplicate gene results in a less severe phenotype than deletion of a singleton gene.
CONCLUSIONS AND PERSPECTIVES
The phenotypes induced by gene duplication are frequent and diverse, and have not yet been fully appreciated. Several fields deserve more investigation, notably the role of gene duplication in (a) monogenic phenotypes [particularly disorders implicating haploinsufficient dosage-sensitive genes, for instance congenital heart disease (71)]; (b) polygenic complex multifactorial phenotypes, as exemplified by the implication of the FCGR3 and hBD2 copy number in glomerulonephritis and Crohn's disease, and by the role of TLR7 duplications in autoimmunity elicited by autoantibodies; (c) drug, host, pathogen, and metabolic responses; (d) diseases due to protein aggregation; and (e) gene regulation, as exemplified by the contribution of microRNA gene duplication to complex patterns of gene regulation (59), and by duplication of conserved noncoding regions that helped diversify expression of developmental regulators (63). Finally, because increased dosage of genes related to olfaction, immunity, and protein secretion may have been positively selected in humans (69), a possibly more general contribution of CNVs in genes responding to environmental stimuli to phenotypic variation deserves more thorough exploration in the near future. In summary, this review emphasizes that gene duplication plays an important role in genome evolution and phenotypic variability, and can cause disease phenotypes.
ACKNOWLEDGMENTS
We thank Drs. Jacques Beckmann, Samuel Deutsch, and Henrik Kaessmann for reading the manuscript. B.C. is supported by the SNSF and Helmut Horten Foundation; S.E.A. is supported by the SNSF, EU, NIH, and Childcare Foundation, and by funds from the University of Geneva.
LITERATURE CITED
1. Aitman TJ, Dong R, Vyse TJ, Norsworthy PJ, Johnson MD, et al. 2006. Copy number polymorphism in Fcgr3 predisposes to glomerulonephritis in rats and humans. Nature 439:851-55
2. Akahoshi K, Ohashi H, Hattori Y, Saitoh S, Fukushima Y, Wada T. 2005. A woman with 46,XX,dup(16)(p13.11 p13.3) and the ATR-X phenotype. Am. J. Med. Genet. A 132:414-18
3. Aldred PM, Hollox EJ, Armour JA. 2005. Copy number polymorphism and expression level variation of the human alpha-defensin genes DEFA1 and DEFA3. Hum. Mol. Genet. 14:2045-52
4. Armengol L, Pujana MA, Cheung J, Scherer SW, Estivill X. 2003. Enrichment of segmental duplications in regions of breaks of synteny between the human and mouse genomes suggest their involvement in evolutionary rearrangements. Hum. Mol. Genet. 12:2201-8
5. Bailey JA, Baertsch R, Kent WJ, Haussler D, Eichler EE. 2004. Hotspots of mammalian chromosomal evolution. Genome Biol. 5:R23
6. Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, et al. 2002. Recent segmental duplications in the human genome. Science 297:1003-7
7. Bailey JA, Liu G, Eichler EE. 2003. An Alu transposition model for the origin and expansion of human segmental duplications. Am. J. Hum. Genet. 73:823-34
8. Berg J, Lassig M, Wagner A. 2004. Structure and evolution of protein interaction networks: a statistical model for link dynamics and gene duplications. BMC Evol. Biol. 4:51
9. Birchler JA, Bhadra U, Bhadra MP, Auger DL. 2001. Dosage-dependent gene regulation in multicellular eukaryotes: implications for dosage compensation, aneuploid syndromes and quantitative traits. Dev. Biol. 234:275-88
10. Birchler JA, Riddle NC, Auger DL, Veitia RA. 2005. Dosage balance in gene regulation: biological implications. Trends Genet. 21:219-26
11. Bischof JM, Chiang AP, Scheetz TE, Stone EM, Casavant TL, et al. 2006. Genome-wide identification of pseudogenes capable of disease-causing gene conversion. Hum. Mutat. 27:545-52
12. Caraco Y. 2004. Genes and the response to drugs. N. Engl. J. Med. 351:2867-69
13. Casneuf T, De Bodt S, Raes J, Maere S, Van de Peer Y. 2006. Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana. Genome Biol. 7:R13
14. Chen CP, Lin SP, Lin CC, Chen YJ, Chern SR, et al. 2006. Molecular cytogenetic analysis of de novo dup(5)(q35.2q35.3) and review of the literature of pure partial trisomy 5q. Am. J. Med. Genet. A 140:1594-600
15. Cheng Z, Ventura M, She X, Khaitovich P, Graves T, et al. 2005. A genome-wide comparison of recent chimpanzee and human segmental duplications. Nature 437:88-93
16. Cheung VG, Nowak N, Jang W, Kirsch IR, Zhao S, et al. 2001. Integration of cytogenetic landmarks into the draft sequence of the human genome. Nature 409:953-58
17. Chung WY, Albert R, Albert I, Nekrutenko A, Makova KD. 2006. Rapid and asymmetric divergence of duplicate genes in the human gene coexpression network. BMC Bioinformatics 7:46
18. Conant GC, Wagner A. 2002. GenomeHistory: a software tool and its application to fully sequenced genomes. Nucleic Acids Res. 30:3378-86
19. Davis JC, Petrov DA. 2004. Preferential duplication of conserved proteins in eukaryotic genomes. PLOS Biol. 2:E55
20. Davis JC, Petrov DA. 2005. Do disparate mechanisms of duplication add similar genes to the genome? Trends Genet. 21:548-51
21. de Mollerat X, Gurrieri F, Morgan CT, Sangiorgi E, Everman DB, et al. 2003. A genomic rearrangement resulting in a tandem duplication is associated with split hand-split foot malformation 3 (SHFM3) at 10q24. Hum. Mol. Genet. 12:1959-71
22. Dehal P, Boore JL. 2005. Two rounds of whole genome duplication in the ancestral vertebrate. PLOS Biol. 3:e314
23. Dermuth JP, De Bie T, Stjich JE, Cristianini N, Hahn MW. 2006. PLoS 1:e85
24. Derti A, Roth FP, Church GM, Wu CT. 2006. Mammalian ultraconserved elements are strongly depleted among segmental duplications and copy number variants. Nat. Genet. 38:1216-20
25. Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, et al. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169:1915-25
26. Eichler EE. 2001. Recent duplication, domain accretion and the dynamic mutation of the human genome. Trends Genet. 17:661-69
27. Ellis D, Malcolm S. 1994. Proteolipid protein gene dosage effect in Pelizaeus-Merzbacher disease. Nat. Genet. 6:333-34
28. Feldkotter M, Schwarzer V, Wirth R, Wienker TF, Wirth B. 2002. Quantitative analyses of SMN1 and SMN2 based on real-time lightCycler PCR: fast and highly reliable carrier testing and prediction of severity of spinal muscular atrophy. Am. J. Hum. Genet. 70:358-68
29. Fellermann K, Stange DE, Schaeffeler E, Schmalzl H, Wehkamp J, et al. 2006. A chromosome 8 gene-cluster polymorphism with low human beta-defensin 2 gene copy number predisposes to Crohn disease of the colon. Am. J. Hum. Genet. 79:439-48
30. Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-45
31. Freeling M, Thomas BC. 2006. Gene-balanced duplications, like tetraploidy, provide predictable drive to increase morphological complexity. Genome Res. 16:805-14
32. Gajecka M, Yu W, Ballif BC, Glotzbach CD, Bailey KA, et al. 2005. Delineation of mechanisms and regions of dosage imbalance in complex rearrangements of 1p36 leads to a putative gene for regulation of cranial suture closure. Eur. J. Hum. Genet. 13:139-49
33. Gimelbrant AA, Chess A. 2006. An epigenetic state associated with areas of gene duplication. Genome Res. 16:723-29
34. Gonzalez E, Kulkarni H, Bolivar H, Mangano A, Sanchez R, et al. 2005. The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 307:1434-40
35. Gu X, Wang Y, Gu J. 2002. Age distribution of human gene families shows significant roles of both large- and small-scale duplications in vertebrate evolution. Nat. Genet. 31:205-9
36. Gu X, Zhang Z, Huang W. 2005. Rapid evolution of expression and regulatory divergences after yeast gene duplication. Proc. Natl. Acad. Sci. USA 102:707-12
37. Gu Z, Rifkin SA, White KP, Li WH. 2004. Duplicate genes increase gene expression diversity within and between species. Nat. Genet. 36:577-79
38. Hardy J. 2006. Amyloid double trouble. Nat. Genet. 38:11-12
39. He X, Zhang J. 2005. Gene complexity and gene duplicability. Curr. Biol. 15:1016-21
40. He X, Zhang J. 2005. Transcriptional reprogramming and backup between duplicate genes: Is it a genome-wide phenomenon? Genetics 172:1363-67
41. Hildebrandt MA, Salavaggione OE, Martin YN, Flynn HC, Jalal S, et al. 2004. Human SULT1A3 pharmacogenetics: gene duplication and functional genomic studies. Biochem. Biophys. Res. Commun. 321:870-78
42. Huminiecki L, Wolfe KH. 2004. Divergence of spatial gene expression profiles following species-specific gene duplications in human and mouse. Genome Res. 14:1870-79
43. Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, et al. 2004. Detection of large-scale variation in the human genome. Nat. Genet. 36:949-51
44. Jimenez-Sanchez G, Childs B, Valle D. 2001. Human disease genes. Nature 409:853-55
45. Jordan IK, Wolf YI, Koonin EV. 2004. Duplicated genes evolve slower than singletons despite the initial rate increase. BMC Evol. Biol. 4:22
46. Kondrashov FA, Koonin EV. 2004. A common framework for understanding the origin of genetic dominance and evolutionary fates of gene duplications. Trends Genet. 20:287-90
47. Kondrashov FA, Rogozin IB, Wolf YI, Koonin EV. 2002. Selection in the evolution of gene duplications. Genome Biol. 3:RESEARCH0008
48. Kopelman NM, Lancet D, Yanai I. 2005. Alternative splicing and gene duplication are inversely correlated evolutionary mechanisms. Nat. Genet. 37:588-89
49. Li J, Uversky VN, Fink AL. 2001. Effect of familial Parkinson's disease point mutations A30P and A53T on the structural properties, aggregation, and fibrillation of human alpha-synuclein. Biochemistry 40:11604-13
50. Li WH, Gu Z, Cavalcanti AR, Nekrutenko A. 2003. Detection of gene duplications and block duplications in eukaryotic genomes. J. Struct. Funct. Genom. 3:27-34
51. Linardopoulou EV, Williams EM, Fan Y, Friedman C, Young JM, Trask BJ. 2005. Human subtelomeres are hot spots of interchromosomal recombination and segmental duplication. Nature 437:94-100
52. Luger K, Hansen JC. 2005. Nucleosome and chromatin fiber dynamics. Curr. Opin. Struct. Biol. 15:188-96
53. Luikenhuis S, Giacometti E, Beard CF, Jaenisch R. 2004. Expression of MeCP2 in postmitotic neurons rescues Rett syndrome in mice. Proc. Natl. Acad. Sci. USA 101:6033-38
54. Lupski JR, Stankiewicz P. 2005. Genomic disorders: molecular mechanisms for rearrangements and conveyed phenotypes. PLOS Genet. 1:e49
55. Lupski JR, Wise CA, Kuwano A, Pentao L, Parke JT, et al. 1992. Gene dosage is a mechanism for Charcot-Marie-Tooth disease type 1A. Nat. Genet. 1:29-33
56. Lynch M, Conery JS. 2000. The evolutionary fate and consequences of duplicate genes. Science 290:1151-55
57. Lynch M, Force A. 2000. The probability of duplicate gene preservation by subfunctionalization. Genetics 154:459-73
58. Maere S, De Bodt S, Raes J, Casneuf T, Van Montagu M, et al. 2005. Modeling gene and genome duplications in eukaryotes. Proc. Natl. Acad. Sci. USA 102:5454-59
59. Maher C, Stein L, Ware D. 2006. Evolution of Arabidopsis microRNA families through duplication events. Genome Res. 16:510-19
60. Makova KD, Li WH. 2003. Divergence in the spatial pattern of gene expression between human duplicate genes. Genome Res. 13:1638-45
61. Marland E, Prachumwat A, Maltsev N, Gu Z, Li WH. 2004. Higher gene duplicabilities for metabolic proteins than for nonmetabolic proteins in yeast and E. coli. J. Mol. Evol. 59:806-14
62. Maslov S, Sneppen K, Eriksen KA, Yan KK. 2004. Upstream plasticity and downstream robustness in evolution of molecular networks. BMC Evol. Biol. 4:9
63. McEwen GK, Woolfe A, Goode D, Vavouri T, Callaway H, Elgar G. 2006. Ancient duplicated conserved noncoding elements in vertebrates: a genomic and functional analysis. Genome Res. 16:451-65
64. Moog U, Engelen J, Albrechts J, Hoorntje T, Hendrikse F, Schrander-Stumpel C. 1996. Alagille syndrome in a family with duplication 20p11. Clin. Dysmorphol. 5:279-88
65. Murakami T, Garcia CA, Reiter LT, Lupski JR. 1996. Charcot-Marie-Tooth disease and related inherited neuropathies. Medicine (Baltimore) 75:233-50
66. Murphy WJ, Larkin DM, Everts-van der Wind A, Bourque G, Tesler G, et al. 2005. Dynamics of mammalian chromosome evolution inferred from multispecies comparative maps. Science 309:613-17
67. Nei M, Rogozin IB, Piontkivska H. 2000. Purifying selection and birth-and-death evolution in the ubiquitin gene family. Proc. Natl. Acad. Sci. USA 97:10866-71
68. Nei M, Rooney AP. 2005. Concerted and birth-and-death evolution of multigene families. Annu. Rev. Genet. 39:121-52
69. Nguyen DQ, Webber C, Ponting CP. 2006. Bias of selection on human copy-number variants. PLOS Genet. 2:e20
70. Ohno S. 1970. Evolution by Gene Duplication. New York: Springer Verlag
71. Olson EN. 2006. Gene regulatory networks in the evolution and development of the heart. Science 313:1922-27
72. Olson MV. 1999. When less is more: gene loss as an engine of evolutionary change. Am. J. Hum. Genet. 64:18-23
73. Ouahchi K, Lindeman N, Lee C. 2006. Copy number variants and pharmacogenomics. Pharmacogenomics 7:25-29
74. Ounap K, Ilus T, Laidre P, Uibo O, Tammur P, Bartsch O. 2005. A new case of 2q duplication supports either a locus for orofacial clefting between markers D2S1897 and D2S2023 or a locus for cleft palate only on chromosome 2q13-q21. Am. J. Med. Genet. A 137:323-27
75. Padiath QS, Saigoh K, Schiffmann R, Asahara H, Yamada T, et al. 2006. Lamin B1 duplications cause autosomal dominant leukodystrophy. Nat. Genet. 38:1114-23
76. Papp B, Pal C, Hurst LD. 2003. Dosage sensitivity and the evolution of gene families in yeast. Nature 424:194-97
77. Phillips KA, Veenstra DL, Oren E, Lee JK, Sadee W. 2001. Potential role of pharmacogenomics in reducing adverse drug reactions: a systematic review. JAMA 286:2270-79
78. Pisitkun P, Deane JA, Difilippantonio MJ, Tarasenko T, Satterthwaite AB, Bolland S. 2006. Autoreactive B cell responses to RNA-related antigens due to TLR7 gene duplication. Science 312:1669-72
79. Potocki L, Chen KS, Park SS, Osterholm DE, Withers MA, et al. 2000. Molecular mechanism for duplication 17p11.2 - the homologous recombination reciprocal of the Smith-Magenis microdeletion. Nat. Genet. 24:84-87
80. Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, et al. 2006. Global variation in copy number in the human genome. Nature 444:444-54
81. Roper RJ, Reeves RH. 2006. Understanding the basis for Down syndrome phenotypes. PLOS Genet. 2:e50
82. Rovelet-Lecrux A, Hannequin D, Raux G, Meur NL, Laquerriere A, et al. 2006. APP locus duplication causes autosomal dominant early-onset Alzheimer disease with cerebral amyloid angiopathy. Nat. Genet. 38:24-26
83. Samonte RV, Eichler EE. 2002. Segmental duplications and the evolution of the primate genome. Nat. Rev. Genet. 3:65-72
84. Schedl A, Ross A, Lee M, Engelkamp D, Rashbass P, et al. 1996. Influence of PAX6 gene dosage on development: overexpression causes severe eye abnormalities. Cell 86:71-82
85. Sebat J, Lakshmi B, Troge J, Alexander J, Young J, et al. 2004. Large-scale copy number polymorphism in the human genome. Science 305:525-28
86. Semple CA, Gautier P, Taylor K, Dorin JR. 2006. The changing of the guard: Molecular diversity and rapid evolution of beta-defensins. Mol. Divers. 10:575-84
87. Sen SK, Han K, Wang J, Lee J, Wang H, et al. 2006. Human genomic deletions mediated by recombination between Alu elements. Am. J. Hum. Genet. 79:41-53
88. Sharp AJ, Locke DP, McGrath SD, Cheng Z, Bailey JA, et al. 2005. Segmental duplications and copy-number variation in the human genome. Am. J. Hum. Genet. 77:78-88
89. Shaw CJ, Lupski JR. 2004. Implications of human genome architecture for rearrangement-based disorders: the genomic basis of disease. Hum. Mol. Genet. 13 Spec No 1:R57-64
90. She X, Horvath JE, Jiang Z, Liu G, Furey TS, et al. 2004. The structure and evolution of centromeric transition regions within the human genome. Nature 430:857-64
91. Shiu SH, Byrnes JK, Pan R, Zhang P, Li WH. 2006. Role of positive selection in the retention of duplicate genes in mammalian genomes. Proc. Natl. Acad. Sci. USA 103:2232-36
92. Shoja V, Zhang L. 2006. A roadmap of tandemly arrayed genes in the genomes of human, mouse, and rat. Mol. Biol. Evol. 23:2134-41
93. Singleton A, Gwinn-Hardy K. 2004. Parkinson's disease and dementia with Lewy bodies: a difference in dose? Lancet 364:1105-7
94. Singleton AB. 2005. Altered alpha-synuclein homeostasis causing Parkinson's disease: the potential roles of dardarin. Trends Neurosci. 28:416-21
95. Somerville MJ, Mervis CB, Young EJ, Seo EJ, del Campo M, et al. 2005. Severe expressive-language delay related to duplication of the Williams-Beuren locus. N. Engl. J. Med. 353:1694-701
96. Sopko R, Huang D, Preston N, Chua G, Papp B, et al. 2006. Mapping pathways and phenotypes by systematic gene overexpression. Mol. Cell 21:319-30
97. Stankiewicz P, Lupski JR. 2002. Genome architecture, rearrangements and genomic disorders. Trends Genet. 18:74-82
98. Su C, Nei M. 2001. Evolutionary dynamics of the T-cell receptor VB gene family as inferred from the human and mouse genomic sequences. Mol. Biol. Evol. 18:503-13
99. Su Z, Wang J, Yu J, Huang X, Gu X. 2006. Evolution of alternative splicing after gene duplication. Genome Res. 16:182-89
100. Van de Peer Y. 2004. Computational approaches to unveiling ancient genome duplications. Nat. Rev. Genet. 5:752-63
101. Van Esch H, Bauters M, Ignatius J, Jansen M, Raynaud M, et al. 2005. Duplication of the MECP2 region is a frequent cause of severe mental retardation and progressive neurological symptoms in males. Am. J. Hum. Genet. 77:442-53
102. Veitia RA. 2002. Exploring the etiology of haploinsufficiency. Bioessays 24:175-84
103. Vinckenbosch N, Dupanloup I, Kaessmann H. 2006. Evolutionary fate of retroposed gene copies in the human genome. Proc. Natl. Acad. Sci. USA 103:3220-25
104. Wagner A. 2000. Decoupled evolution of coding region and mRNA expression patterns after gene duplication: implications for the neutralist-selectionist debate. Proc. Natl. Acad. Sci. USA 97:6579-84
105. Wang X, Grus WE, Zhang J. 2006. Gene losses during human origins. PLOS Biol. 4:e52
106. Wong P, Fritz A, Frishman D. 2005. Designability, aggregation propensity and duplication of disease-associated proteins. Protein Eng. Des. Sel. 18:503-8
107. Woods KS, Cundall M, Turton J, Rizotti K, Mehta A, et al. 2005. Over- and underdosage of SOX3 is associated with infundibular hypoplasia and hypopituitarism. Am. J. Hum. Genet. 76:833-49
108. Xue Y, Daly A, Yngvadottir B, Liu M, Coop G, et al. 2006. Spread of an inactive form of caspase-12 in humans is due to recent positive selection. Am. J. Hum. Genet. 78:659-70
109. Yobb TM, Somerville MJ, Willatt L, Firth HV, Harrison K, et al. 2005. Microduplication and triplication of 22q11.2: a highly variable syndrome. Am. J. Hum. Genet. 76:865-76
110. Zhang L, Lu HH, Chung WY, Yang J, Li WH. 2005. Patterns of segmental duplication in the human genome. Mol. Biol. Evol. 22:135-41
111. Zhang Z, Gu J, Gu X. 2004. How much expression divergence after yeast gene duplication could be explained by regulatory motif evolution? Trends Genet. 20:403-7
112. Zhang Z, Luo ZW, Kishino H, Kearsey MJ. 2005. Divergence pattern of duplicate genes in protein-protein interactions follows the power law. Mol. Biol. Evol. 22:501-5
113. Zody MC, Garber M, Sharpe T, Young SK, Rowen L, et al. 2006. Analysis of the DNA sequence and duplication history of human chromosome 15. Nature 440:671-75
114. Zody MC, Garber M, Adams DJ, Sharpe T, Harrow J, et al. 2006. DNA sequence of human chromosome 17 and analysis of rearrangement in the human lineage. Nature 440:1045-49