Articles
www.thelancet.com Published online February 28, 2013 http://dx.doi.org/10.1016/S0140-6736(12)62129-1 1
Identification of risk loci with shared effects on five major
psychiatric disorders: a genome-wide analysis
Cross-Disorder Group of the Psychiatric Genomics Consortium*
Summary
Background Findings from family and twin studies suggest that genetic contributions to psychiatric disorders do not
in all cases map to present diagnostic categories. We aimed to identify specific variants underlying genetic effects
shared between the five disorders in the Psychiatric Genomics Consortium: autism spectrum disorder, attention
deficit-hyperactivity disorder, bipolar disorder, major depressive disorder, and schizophrenia.
Methods We analysed genome-wide single-nucleotide polymorphism (SNP) data for the five disorders in 33 332 cases
and 27 888 controls of European ancestry. To characterise allelic effects on each disorder, we applied a multinomial
logistic regression procedure with model selection to identify the best-fitting model of relations between genotype
and phenotype. We examined cross-disorder effects of genome-wide significant loci previously identified for bipolar
disorder and schizophrenia, and used polygenic risk-score analysis to examine such effects from a broader set of
common variants. We undertook pathway analyses to establish the biological associations underlying genetic overlap
for the five disorders. We used enrichment analysis of expression quantitative trait loci (eQTL) data to assess whether
SNPs with cross-disorder association were enriched for regulatory SNPs in post-mortem brain-tissue samples.
Findings SNPs at four loci surpassed the cutoff for genome-wide significance (p<5×10⁻⁸) in the primary analysis:
regions on chromosomes 3p21 and 10q24, and SNPs within two L-type voltage-gated calcium channel subunits,
CACNA1C and CACNB2. Model selection analysis supported effects of these loci for several disorders. Loci previously
associated with bipolar disorder or schizophrenia had variable diagnostic specificity. Polygenic risk scores showed
cross-disorder associations, notably between adult-onset disorders. Pathway analysis supported a role for calcium
channel signalling genes for all five disorders. Finally, SNPs with evidence of cross-disorder association were enriched
for brain eQTL markers.
Interpretation Our findings show that specific SNPs are associated with a range of psychiatric disorders of childhood
onset or adult onset. In particular, variation in calcium-channel activity genes seems to have pleiotropic effects on
psychopathology. These results provide evidence relevant to the goal of moving beyond descriptive syndromes in
psychiatry, and towards a nosology informed by disease cause.
Funding National Institute of Mental Health.
Introduction
Psychiatric nosology arose in central Europe towards the
end of the 19th century, in particular with Kraepelin’s
foundational distinction between dementia praecox
(schizophrenia) and manic depressive insanity.1 The
distinction between bipolar illness and unipolar (major)
depression was first proposed in the late 1950s and
became increasingly widely accepted. The major syndromes,
especially schizophrenia, bipolar disorder, and
major depression, were differentiated on the basis of
their symptom patterns and course of illness. At the
same time, clinical features such as psychosis, mood
dysregulation, and cognitive impairments were known
to transcend diagnostic categories. Doubt remains
about the boundaries between the syndromes and the
degree to which they signify entirely distinct entities,
disorders that have overlapping foundations, or different
variants of one underlying disease. Such debates have
intensified with syndromes described subsequently,
including autism spectrum disorders and attention
deficit-hyperactivity disorder.
The pathogenic mechanisms of psychiatric disorders
are largely unknown, so diagnostic boundaries are
difficult to define. Genetic risk factors are important in
the causation of all major psychiatric disorders,2 and
genetic strategies are widely used to assess potential
overlaps. The imminent revision of psychiatric classifications
in the Diagnostic and Statistical Manual of
Mental Disorders (DSM) and the International Classification
of Diseases (ICD) has reinvigorated debate about
the validity of diagnostic boundaries. With increasing
availability of large genome-wide genotype data for
several psychiatric disorders, shared cause can now be
examined at a molecular level.
See Online/Comment
http://dx.doi.org/10.1016/S0140-6736(13)60223-8
*Members listed at end of paper
Correspondence to: Dr Jordan W Smoller, Simches Research Building, Massachusetts General Hospital, Boston, MA 02114, USA
jsmoller@hms.harvard.edu
We formed the Psychiatric Genomics Consortium
(PGC) in 2007 to undertake meta-analyses of genome-wide
association studies (GWAS) for psychiatric disorders
and, so far, the consortium has incorporated
GWAS data from more than 19 countries for schizophrenia,
bipolar disorder, major depressive disorder,
attention deficit-hyperactivity disorder, and autism
spectrum disorders. Previous research has suggested
varying degrees of overlap in familial and genetic liability
for pairs of these disorders. For example, some findings3,4
from family and twin studies support diagnostic boundaries
between schizophrenia and bipolar disorder, and between
bipolar disorder and major depressive disorder, but also
suggest correlations in familial and genetic liabilities.3,5
Several molecular variants confer risk of both schizophrenia
and bipolar disorder.6–8 Autism was once known
as childhood schizophrenia and the two disorders were
not clearly differentiated until the 1970s. Findings from
the past few years have emphasised phenotypic and
genetic overlap between autism spectrum disorders and
schizophrenia,9,10 including identification of copy number
variants conferring risk of both.11 Findings from family,
twin, and molecular studies12–15 suggest some genetic
overlap between autism spectrum disorder and attention
deficit-hyperactivity disorder.
In this first report from the PGC Cross-Disorder Group,
we analyse genome-wide single-nucleotide polymorphism
(SNP) data for the five PGC disorders to answer two
questions. First, what information emerges when all five
disorders are examined in one GWAS? When risk is
correlated across disorders, pooled analyses will be better
powered than individual-disorder analyses to detect risk
loci. Second, what are the cross-disorder effects of variants
already identified as being associated with a specific
psychiatric disorder in previous PGC analyses? We aimed
to examine the genetic relation between the five psychiatric
disorders with the expectation that findings will ultimately
inform psychiatric nosology, identify potential neurobiological
mechanisms predisposing to specific clinical
presentations, and generate new models for prevention
and treatment.
Methods
Samples and genotypes
The sample for these analyses consisted of cases,
controls, and family-based samples assembled for
previous genome-wide PGC mega-analyses of individual-level
data.6,7,16,17 Cases and controls were not related. For
the family-based samples, we matched alleles transmitted
to affected offspring (trio cases) with untransmitted
alleles (pseudocontrols). We estimated the identity-by-descent
relation for all pairs of individuals to identify any
duplicate individuals in the component datasets. When
duplicates were detected, one member of each set was
retained. We then randomly allocated these individuals,
with a random number generator, to a disorder case-control
dataset. Sample sizes differ from previous reports
because of this allocation of overlapping individuals. All
patients were of European ancestry and met criteria
from the DSM third edition revised or fourth edition for
the primary disorder of interest.
To ensure comparability between samples, raw genotype
and phenotype data for each study were uploaded
to a central server and processed through the same
quality control, imputation, and analysis process
(appendix).6,7 We analysed imputed SNP dosages from
1 250 922 autosomal SNPs.
Statistical analysis
In the primary analysis, we combined effects of each
disease analysis by a meta-analytic approach that applied
a weighted Z-score,18 in which weights equalled the
inverse of the regression coefficient’s standard error.
This strategy assumed a fixed-effects model, with weights
indicating the sample size of the disease-specific studies.
In a second analytical approach, we did a five-degree-of-freedom
test by summing the χ² values for each
individual disease meta-analysis. Unlike our primary
analysis, this model did not assume that all diseases had
the same direction of effect and could detect allelic effects
that increase risk for some diseases and decrease risk for
others. The appendix describes statistical methods and
results, including the handling of trios and population
stratification. We also examined loci that previously
achieved genome-wide significance in PGC meta-analyses
of schizophrenia and bipolar disorder.6,7
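As a minimal sketch of the two combination rules described above, the Python below combines per-disorder log odds ratios with a weighted Z-score (weights set to 1/SE, as stated in the text) and, separately, sums the per-disorder χ² values for a five-degree-of-freedom test. The function names are illustrative assumptions and this is not the consortium's code; the appendix remains the authoritative description. The inputs are the per-disorder ln(OR) and SE values shown for rs2535629 in figure 2.

```python
import math

def weighted_z(betas, ses):
    """Combine per-disorder effects with a weighted Z-score (weights = 1/SE)."""
    z = [b / s for b, s in zip(betas, ses)]          # per-disorder Z statistics
    w = [1.0 / s for s in ses]                       # weights as described in the text
    z_comb = sum(wi * zi for wi, zi in zip(w, z)) / math.sqrt(sum(wi * wi for wi in w))
    p_two_sided = math.erfc(abs(z_comb) / math.sqrt(2.0))
    return z_comb, p_two_sided

def chi2_sf_5df(x):
    """Upper-tail probability of a chi-square variable with 5 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0)
            * (math.sqrt(x) + x ** 1.5 / 3.0))

def five_df_test(betas, ses):
    """Sum the per-disorder chi-square values; the sign of each effect is ignored."""
    chi2 = sum((b / s) ** 2 for b, s in zip(betas, ses))
    return chi2, chi2_sf_5df(chi2)

# ln(OR) and SE for ADHD, ASD, BPD, MDD, schizophrenia (rs2535629, figure 2)
betas = [0.0535, 0.0495, 0.139, 0.0913, 0.0993]
ses = [0.0418, 0.0306, 0.0308, 0.0247, 0.0249]

z, p = weighted_z(betas, ses)
chi2, p5 = five_df_test(betas, ses)
print(f"weighted Z = {z:.2f}, p = {p:.2e}")
print(f"5-df chi-square = {chi2:.2f}, p = {p5:.2e}")
```

With these inputs the combined Z is about 7, which falls below the genome-wide cutoff of p<5×10⁻⁸. Because the five-degree-of-freedom statistic squares each Z, it is insensitive to the direction of effect, which is the property the text highlights.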
To characterise the specificity of the allelic effects for
our main findings, we examined the association evidence
in three ways: we generated forest plots of the disorder
beta coefficients with 95% CIs; we calculated a heterogeneity
p value for the disorder-specific effects contributing
to the overall statistics for meta-analytic
association; and we undertook a multinomial logistic
regression procedure with model selection19 for each
main SNP for all five disorders to assess the pattern of
phenotypic effects (appendix pp 8–11). To compare the fit
of various models of genotype–phenotype associations,
we applied established goodness-of-fit metrics (the
Bayesian information criterion and the Akaike information
criterion). We report the best-fitting model by the Bayesian
information criterion and show results of both metrics for a range of
models (appendix pp 38–45, 51–61).
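Two of the quantities named above can be illustrated briefly. The sketch below converts a log odds ratio and its standard error into an odds ratio with a 95% CI, as plotted in a forest plot, and ranks candidate models by the Akaike and Bayesian information criteria. The model names, log-likelihoods, and parameter counts are hypothetical placeholders chosen only to show how the two criteria can disagree; they are not results from the paper.

```python
import math

def or_with_ci(ln_or, se, z_crit=1.96):
    """Odds ratio and 95% CI from a log odds ratio and its standard error."""
    return (math.exp(ln_or),
            math.exp(ln_or - z_crit * se),
            math.exp(ln_or + z_crit * se))

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_lik

# Overall ln(OR) and SE for rs2535629 ("All" row of figure 2, panel A)
odds_ratio, lower, upper = or_with_ci(0.0908, 0.013)
print(f"OR = {odds_ratio:.2f} ({lower:.2f}-{upper:.2f})")

# Hypothetical candidate models: name -> (log-likelihood, number of parameters)
models = {
    "no effect": (-1000.0, 4),
    "BPD, schizophrenia": (-990.0, 5),
    "five disorder": (-985.0, 8),
}
n = 33332 + 27888  # cases plus controls

best_aic = min(models, key=lambda m: aic(*models[m]))
best_bic = min(models, key=lambda m: bic(*models[m], n))
print("best by AIC:", best_aic)
print("best by BIC:", best_bic)
```

The Bayesian criterion penalises each extra parameter by ln(n) rather than 2, so with a sample of this size it favours sparser models than the Akaike criterion does.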
To examine shared polygenic risk at an aggregate level
between pairs of diagnoses, we used risk-score profiling as
previously described.8 For each pair, we selected one
disorder as a discovery dataset and the other as a target
dataset and calculated the proportion of variance in the
target set explained by risk scores from the discovery set
with a range of statistical cutoffs for SNP inclusion in the
score (appendix p 13). To assess the role of specific
biological systems in the pathogenesis of the five disorders,
we did pathway and eQTL analyses. Pathway analysis was
by interval-based enrichment analysis (INRICH) for the
full dataset consisting of linkage disequilibrium segments
containing signals with association p<10⁻³ in the primary
meta-analysis. INRICH accounts for potential genomic
confounding factors, such as variable gene and pathway
sizes, SNP density, linkage disequilibrium, and physical
clustering of biologically related genes (appendix pp 14–16).
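The risk-score profiling described in the paragraph above can be sketched as follows, assuming the usual construction in which a score is the sum of risk-allele dosages weighted by discovery-set log odds ratios, restricted to SNPs that pass a p value cutoff. The SNP identifiers, effect sizes, and dosages are hypothetical, and the final step (estimating the variance in target case status explained by the score) is not shown.

```python
def select_weights(discovery, p_cutoff):
    """Keep discovery-set SNPs with p below the cutoff; return {snp: log_or}."""
    return {snp: log_or for snp, (log_or, p) in discovery.items() if p < p_cutoff}

def risk_score(dosages, weights):
    """Sum of allele dosage x log(OR) over the selected SNPs."""
    return sum(dosages[snp] * w for snp, w in weights.items() if snp in dosages)

# Hypothetical discovery results: snp -> (log odds ratio, p value)
discovery = {
    "snpA": (0.10, 1e-5),
    "snpB": (-0.05, 0.03),
    "snpC": (0.02, 0.4),
}
# Hypothetical imputed dosages (0 to 2) for one individual in the target set
person = {"snpA": 2.0, "snpB": 1.0, "snpC": 0.5}

# One score per inclusion cutoff, as in the range of cutoffs mentioned in the text
scores = {cutoff: risk_score(person, select_weights(discovery, cutoff))
          for cutoff in (1e-4, 0.05, 0.5)}
for cutoff, score in scores.items():
    print(f"p < {cutoff}: score = {score:.3f}")
```

Looser cutoffs admit more SNPs with weaker evidence, which is why the variance explained is usually reported across a range of cutoffs rather than at a single one.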
We did eQTL enrichment analysis20 to assess whether
SNPs associated with five psychiatric disorders were
enriched for regulatory SNPs in post-mortem brain-tissue
samples compared with those with no association.21,22 To
assess the specificity of this finding, we also examined
eQTL datasets from three non-brain-tissue types: liver,23
skin,24 and lymphoblastoid cell lines25 (appendix pp 17–21).
See Online for appendix
For INRICH see http://atgu.mgh.harvard.edu/inrich
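The idea behind an eQTL enrichment test can be illustrated with a generic one-sided hypergeometric test: among all SNPs tested, are eQTL SNPs over-represented in the associated set? This is an assumption made for illustration only; the procedure actually used in the paper is the one in reference 20 and the appendix, and it may differ (for example, by accounting for linkage disequilibrium or allele frequency). The counts below are made up.

```python
from math import comb

def enrichment_p(total, eqtl_total, selected, eqtl_selected):
    """P(X >= eqtl_selected) when `selected` SNPs are drawn without replacement
    from `total` SNPs, of which `eqtl_total` are eQTLs."""
    upper = min(eqtl_total, selected)
    tail = sum(comb(eqtl_total, i) * comb(total - eqtl_total, selected - i)
               for i in range(eqtl_selected, upper + 1))
    return tail / comb(total, selected)

# Toy example: 10 SNPs, 4 are eQTLs, 3 are associated, 2 of those are eQTLs
p_toy = enrichment_p(10, 4, 3, 2)
print(f"one-sided enrichment p = {p_toy:.4f}")
```

Running the same test on eQTL sets from other tissues, as the text describes for liver, skin, and lymphoblastoid cell lines, is one way to check whether any enrichment is specific to brain.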
Role of the funding source
The sponsor of the study had no role in study design,
data collection, data analysis, data interpretation, or
writing of the report. The corresponding author had full
access to all the data and had final responsibility for the
decision to submit for publication.
Results
The final dataset consisted of 33 332 cases and 27 888 controls
(including pseudocontrols formed from non-transmitted
alleles) distributed among the five disorder
groups: autism spectrum disorders (4788 trio cases,
4788 trio pseudocontrols, 161 cases, 526 controls),
attention deficit-hyperactivity disorder (1947 trio cases,
1947 trio pseudocontrols, 840 cases, 688 controls),
bipolar disorder (6990 cases, 4820 controls), major
depressive disorder (9227 cases, 7383 controls), and
schizophrenia (9379 cases, 7736 controls). The results of
the primary fixed-effects meta-analysis for all five
disorders, incorporating seven multidimensional scaling
components as covariates, yielded a genomic control
value of λ=1·167. The λ1000 (λ rescaled to a sample of
1000 cases and 1000 controls) was 1·005 (appendix p 22).
In view of evidence for substantial polygenic contributions
to common psychiatric disorders, this
estimate probably shows the aggregate small effect of a
large number of risk variants, although a moderate
degree of population stratification or technical bias
cannot be excluded.
Figure 1: Manhattan plot of primary fixed-effects meta-analysis
Horizontal line shows threshold for genome-wide significance (p<5×10⁻⁸). x axis: chromosome (1–22); y axis: –log10 (p value). Labelled loci: MIR137 (+1), ITIH3 (+35), HINT1 (+7), MHC (369), ZFPM2, CACNB2, CACNA1C, CPNE7 (+12), TCF4, NEURL (+25), SYNE1, FPR2 (+11).
| SNP | Chromosome | Base-pair position* | Nearest gene | Alleles | Frequency† | Imputation quality score (INFO) | p value | OR (95% CI)‡ | Heterogeneity p value | Best-fit model (BIC)§ |
|---|---|---|---|---|---|---|---|---|---|---|
| rs2535629 | 3 | 52808259 | ITIH3 (+ many) | G/A | 0·651 | 0·942 | 2·54×10⁻¹² | 1·10 (1·07–1·12) | 0·27 | Five disorder¶ |
| rs11191454 | 10 | 104649994 | AS3MT (+ many) | A/G | 0·910 | 1·01 | 1·39×10⁻⁸ | 1·13 (1·08–1·18) | 0·32 | Five disorder¶ |
| rs1024582 | 12 | 2272507 | CACNA1C | A/G | 0·337 | 0·98 | 1·87×10⁻⁸ | 1·07 (1·05–1·10) | 0·0057 | BPD, schizophrenia |
| rs2799573 | 10 | 18641934 | CACNB2 | T/C | 0·715 | 0·825 | 4·29×10⁻⁸ | 1·08 (1·05–1·12) | 0·57 | Five disorder¶ |
Most strongly associated single-nucleotide polymorphisms (SNP) in associated region after clumping (ie, grouping SNPs within 250 kb of the index SNP that have r²>0·2 with the index SNP, as implemented in
PLINK). OR=odds ratio. BIC=Bayesian information criterion. BPD=bipolar disorder. *Detected with University of California Santa Cruz Genome Browser (version hg18). †Risk allele frequency in controls. ‡Estimated
OR from multinomial logistic regression used in the modelling analysis. §Best-fit multinomial logistic model by BIC; appendix pp 38–45 provide a comparison of BIC and Akaike information criterion across
models. ¶Best-fit model supports an effect on all five disorders.
Table 1: Five disorder meta-analysis results for regions with p<5×10⁻⁸
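For readers unfamiliar with λ1000, the sketch below shows the rescaling that is commonly used to express a genomic control value for a standard study of 1000 cases and 1000 controls. That the authors used exactly this formula is an assumption; applied to the reported λ and sample sizes, it gives a value of roughly 1·005. The helper for λ itself (median observed χ² divided by the median of a 1-df χ² distribution) is likewise the conventional definition and is included only for context.

```python
import statistics

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df

def genomic_lambda(chi2_values):
    """Genomic control lambda: median observed chi-square over the expected median."""
    return statistics.median(chi2_values) / CHI2_1DF_MEDIAN

def lambda_1000(lam, n_cases, n_controls):
    """Rescale lambda to an equivalent study of 1000 cases and 1000 controls."""
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lam - 1.0) * scale

# Reported lambda and total sample sizes from the Results
lam_1000 = lambda_1000(1.167, 33332, 27888)
print(f"lambda_1000 = {lam_1000:.4f}")
```

The rescaling matters because, under polygenic inheritance, λ grows with sample size even when there is no confounding, so a raw value of 1·167 in more than 60 000 samples is less alarming than it first appears.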
Figure 2: Association results and forest plots showing effect size for genome-wide significant loci by disorder
Data in parentheses are numbers of cases or controls. Het_p=p value for the heterogeneity test. Het_I=heterogeneity test statistic. IQS=imputation quality score
(INFO). ln(OR)=log of the odds ratio (OR). F=frequency. SE=standard error of the log OR. ADHD=attention deficit-hyperactivity disorder. ASD=autism spectrum
disorders. BPD=bipolar disorder. MDD=major depressive disorder. *Number of studies in which the variant was directly genotyped.
Forest plots are not reproduced; x axis: ln(OR), 95% CI (ticks from –0·05 to 0·25).
(A) rs2535629 G/A 3:52808259 (het_I –15·7, het_p 0·27)
| Disorder | Studies* | ln(OR) | SE | F (cases) | F (controls) | IQS | p value |
|---|---|---|---|---|---|---|---|
| ADHD | 0 | 0·0535 | 0·0418 | 0·339 (2787) | 0·350 (2635) | 0·94 | 0·201 |
| ASD | 0 | 0·0495 | 0·0306 | 0·334 (4949) | 0·347 (5314) | 0·95 | 0·196 |
| BPD | 3 | 0·139 | 0·0308 | 0·325 (6990) | 0·348 (4820) | 0·90 | 6·61×10⁻⁶ |
| MDD | 1 | 0·0913 | 0·0247 | 0·336 (9227) | 0·351 (7383) | 0·97 | 0·000216 |
| Schizophrenia | 6 | 0·0993 | 0·0249 | 0·332 (9379) | 0·350 (7736) | 0·94 | 6·71×10⁻⁵ |
| All | 10 | 0·0908 | 0·013 | 0·333 (33 332) | 0·349 (27 888) | 0·94 | 2·54×10⁻¹² |
(B) rs11191454 A/G 10:104649994 (het_I –27·1, het_p 0·32)
| Disorder | Studies* | ln(OR) | SE | F (cases) | F (controls) | IQS | p value |
|---|---|---|---|---|---|---|---|
| ADHD | 1 | 0·0649 | 0·0698 | 0·918 (2787) | 0·914 (2635) | 0·99 | 0·355 |
| ASD | 1 | 0·0733 | 0·05 | 0·915 (4949) | 0·910 (5314) | 1·01 | 0·143 |
| BPD | 7 | 0·127 | 0·0495 | 0·920 (6990) | 0·912 (4820) | 1·01 | 0·0107 |
| MDD | 1 | 0·098 | 0·0406 | 0·916 (9227) | 0·909 (7383) | 1·00 | 0·0156 |
| Schizophrenia | 12 | 0·19 | 0·0409 | 0·921 (9379) | 0·908 (7736) | 1·03 | 3·48×10⁻⁶ |
| All | 22 | 0·12 | 0·0212 | 0·918 (33 332) | 0·910 (27 888) | 1·01 | 1·39×10⁻⁸ |
(C) rs1024582 A/G 12:2272507 (het_I 58·8, het_p 0·01)
| Disorder | Studies* | ln(OR) | SE | F (cases) | F (controls) | IQS | p value |
|---|---|---|---|---|---|---|---|
| ADHD | 0 | 0·0639 | 0·0418 | 0·342 (2787) | 0·328 (2635) | 0·96 | 0·127 |
| ASD | 2 | 0·00399 | 0·0301 | 0·331 (4949) | 0·333 (5314) | 0·99 | 0·892 |
| BPD | 0 | 0·144 | 0·0296 | 0·362 (6990) | 0·335 (4820) | 0·98 | 1·12×10⁻⁶ |
| MDD | 0 | 0·0383 | 0·0244 | 0·344 (9227) | 0·341 (7383) | 0·98 | 0·12 |
| Schizophrenia | 0 | 0·103 | 0·0244 | 0·357 (9379) | 0·340 (7736) | 0·98 | 2·84×10⁻⁵ |
| All | 2 | 0·0714 | 0·0127 | 0·349 (33 332) | 0·337 (27 888) | 0·98 | 1·87×10⁻⁸ |
(D) rs2799573 T/C 10:18641934 (het_I 0·0, het_p 0·56)
| Disorder | Studies* | ln(OR) | SE | F (cases) | F (controls) | IQS | p value |
|---|---|---|---|---|---|---|---|
| ADHD | 2 | 0·132 | 0·0489 | 0·745 (2787) | 0·726 (2635) | 0·82 | 0·00691 |
| ASD | 6 | 0·0402 | 0·0337 | 0·739 (4949) | 0·734 (5314) | 0·91 | 0·238 |
| BPD | 3 | 0·0667 | 0·0356 | 0·723 (6990) | 0·709 (4820) | 0·74 | 0·0617 |
| MDD | 6 | 0·088 | 0·0268 | 0·725 (9227) | 0·707 (7383) | 0·92 | 0·00108 |
| Schizophrenia | 4 | 0·0935 | 0·0296 | 0·724 (9379) | 0·711 (7736) | 0·73 | 0·00161 |
| All | 21 | 0·0807 | 0·0147 | 0·728 (33 332) | 0·715 (27 888) | 0·82 | 4·29×10⁻⁸ |
Figure 1 shows the Manhattan plot of the primary
results. Four independent regions contained SNPs
with p<5×10⁻⁸ (table 1; appendix pp 34–35, 25–33). The
strongest association signal was on chromosome 3 at an
intronic SNP within ITIH3 (table 1). This SNP is in
linkage disequilibrium with SNPs encompassing several
genes across a 1 Mb region (appendix p 22). The second
strongest signal was in an intron of AS3MT on chromosome
10q24 (table 1). Linkage disequilibrium around this
associated region encompasses several genes including
CNNM2. We also recorded genome-wide significant
association within CACNA1C, and finally detected
significant association with a second locus on chromosome 10
in an intron of CACNB2 (table 1). We undertook
conditional analyses to assess evidence for multiple risk loci
in a region. In these analyses, we included the most
strongly associated or peak SNP plus any SNPs within
1·5 Mb of the peak SNP with association p values less
than 10⁻⁴ and r² less than 0·2 with the peak SNP based on
HapMap 3 CEU data. For the chromosome 3p21 region,
and regions CACNA1C and CACNB2, no additional
independent association signals were detected. For
the chromosome 10q24 region, an additional SNP
(rs11191732), about 600 kb from the peak SNP, showed
association after conditioning on the peak SNP
(rs11191454) with a p value of 6·60×10⁻⁶ before conditioning
and 3·88×10⁻⁵ after conditioning. Several loci
previously implicated in PGC analyses of schizophrenia
and bipolar disorder6,7 showed evidence for association in
the cross-disorder analysis, despite not exceeding the
cutoff for genome-wide significance (appendix pp 23–24).
These loci include one near MIR137, TCF4, the MHC
region on chromosome 6, and SYNE1 (appendix
pp 23–24).
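The SNP selection rule for the conditional analyses described above (within 1·5 Mb of the peak SNP, p below 10⁻⁴, and r² below 0·2 with the peak) can be written as a simple filter. The record layout and function name are hypothetical, and apart from the peak position and the pre-conditioning p value of rs11191732 taken from the text, the positions, p values, and r² values are placeholders for illustration. The conditional regression itself is not shown.

```python
def conditional_candidates(peak_pos, snps, window_bp=1_500_000, p_max=1e-4, r2_max=0.2):
    """Return ids of SNPs near the peak that are associated but in low LD with it."""
    return [s["id"] for s in snps
            if abs(s["pos"] - peak_pos) <= window_bp
            and s["p"] < p_max
            and s["r2_with_peak"] < r2_max]

peak_pos = 104649994  # rs11191454 (table 1, hg18)

snps = [
    # roughly 600 kb from the peak, as in the text; r2 is a placeholder
    {"id": "rs11191732", "pos": peak_pos + 600_000, "p": 6.60e-6, "r2_with_peak": 0.05},
    # placeholder: strongly associated but in high LD with the peak, so excluded
    {"id": "snp_high_ld", "pos": peak_pos + 20_000, "p": 1e-7, "r2_with_peak": 0.85},
    # placeholder: low LD but not associated strongly enough
    {"id": "snp_weak", "pos": peak_pos - 300_000, "p": 0.01, "r2_with_peak": 0.01},
    # placeholder: associated and in low LD, but outside the 1.5 Mb window
    {"id": "snp_far", "pos": peak_pos + 2_000_000, "p": 1e-6, "r2_with_peak": 0.0},
]

candidates = conditional_candidates(peak_pos, snps)
print(candidates)
```

Each candidate would then be tested with the peak SNP included as a covariate; a signal that persists after conditioning, as reported for rs11191732, suggests a second independent association in the region.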