ARTICLE
Received 29 May 2013 | Accepted 19 Sep 2013 | Published 18 Oct 2013
Draft genome of the kiwifruit Actinidia chinensis
Shengxiong Huang1,2, Jian Ding3, Dejing Deng4, Wei Tang2, Honghe Sun5, Dongyuan Liu4, Lei Zhang6,
Xiangli Niu1, Xia Zhang2, Meng Meng2, Jinde Yu2, Jia Liu1, Yi Han1, Wei Shi1, Danfeng Zhang1, Shuqing Cao1,
Zhaojun Wei1, Yongliang Cui3, Yanhua Xia4, Huaping Zeng4, Kan Bao5, Lin Lin2, Ya Min2, Hua Zhang1,
Min Miao1,2, Xiaofeng Tang1,2, Yunye Zhu1, Yuan Sui1, Guangwei Li1, Hanju Sun1, Junyang Yue1, Jiaqi Sun2,
Fangfang Liu2, Liangqiang Zhou3, Lin Lei3, Xiaoqin Zheng3, Ming Liu4, Long Huang4, Jun Song4,
Chunhua Xu4, Jiewei Li7, Kaiyu Ye7, Silin Zhong5,8, Bao-Rong Lu9, Guanghua He10, Fangming Xiao11,
Hui-Li Wang1, Hongkun Zheng4, Zhangjun Fei5,12 & Yongsheng Liu1,2
The kiwifruit (Actinidia chinensis) is an economically and nutritionally important fruit crop
with remarkably high vitamin C content. Here we report the draft genome sequence of
a heterozygous kiwifruit, assembled from ~140-fold next-generation sequencing data. The
assembled genome has a total length of 616.1 Mb and contains 39,040 genes. Comparative
genomic analysis reveals that the kiwifruit has undergone an ancient hexaploidization event
(γ) shared by core eudicots and two more recent whole-genome duplication events. Both
recent duplication events occurred after the divergence of kiwifruit from tomato and potato
and have contributed to the neofunctionalization of genes involved in regulating important
kiwifruit characteristics, such as fruit vitamin C, flavonoid and carotenoid metabolism. As the
first sequenced species in the Ericales, the kiwifruit genome sequence provides a valuable
resource not only for biological discovery and crop improvement but also for evolutionary and
comparative genomics analysis, particularly in the asterid lineage.
DOI: 10.1038/ncomms3640 OPEN
1 School of Biotechnology and Food Engineering, Hefei University of Technology, Hefei 230009, China. 2Ministry of Education Key Laboratory for Bio-resource
and Eco-environment, College of Life Science, State Key Laboratory of Hydraulics and Mountain River Engineering, Sichuan University, Chengdu 610064,
China. 3 Sichuan Academy of Natural Resource Sciences, Chengdu 610015, China. 4 Biomarker Technologies Corporation, Beijing 101300, China. 5 Boyce
Thompson Institute for Plant Research, Tower Road, Cornell University, Ithaca, New York 14853, USA. 6 Institute of Fruit & Tea, Hubei Academy of Agricultural
Sciences, Wuhan 430064, China. 7 Guangxi Institute of Botany, Chinese Academy of Sciences, Guilin 541006, China. 8 Center for Soybean Research of the
State Key Laboratory of Agrobiotechnology, School of Life Sciences, Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China. 9 The
Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, Institute of Biodiversity Science, Fudan University, Shanghai
200433, China. 10 College of Agronomy and Life Science, Southwest University, Chongqing 400716, China. 11 Department of Plant, Soil, and Entomological
Sciences, University of Idaho, Moscow, Idaho 83844, USA. 12 USDA Robert W. Holley Center for Agriculture and Health, Tower Road, Ithaca, New York
14853, USA. Correspondence and requests for materials should be addressed to Y.L. (email: liuyongsheng1122@hfut.edu.cn) or to Z.F. (email:
zf25@cornell.edu) or to H.Z. (email: zhenghk@biomarker.com.cn).
NATURE COMMUNICATIONS | 4:2640 |DOI: 10.1038/ncomms3640 |www.nature.com/naturecommunications 1
© 2013 Macmillan Publishers Limited. All rights reserved.
Actinidiaceae, the basal family within the Ericales, consists
of the genera Actinidia, Saurauia and Clematoclethra1.
The genus Actinidia, commonly known as kiwifruit,
includes several economically important horticultural species,
such as Actinidia chinensis Planchon, A. deliciosa (A. chinensis
var. deliciosa A. Chevalier), A. arguta (Siebold and Zuccarini)
Planchon ex Miquel and A. eriantha Bentham2. Approximately
54 species and 75 taxa have been described in Actinidia3, all of
which are perennial, deciduous and dioecious plants with a
climbing or straggling growth habit. The kiwifruit species are
often reticulate polyploids with a base chromosome number of
x = 29 (ref. 4).
The kiwifruit has long been called ‘the king of fruits’ because of
its remarkably high vitamin C content and balanced nutritional
composition of minerals, dietary fibre and other health-beneficial
metabolites. Extensive studies on the metabolic accumulation of
vitamin C, carotenoids and flavonoids have been reported in
kiwifruits5–13. The centre of origin of kiwifruit is in the
mountains and ranges of southwestern China. The kiwifruit has
a short history of domestication, starting in the early 20th century
when its seeds were introduced into New Zealand14. Through
decades of domestication and substantial efforts for selection
from wild kiwifruits, numerous varieties have been developed and
kiwifruits have become an important fresh fruit worldwide with
an annual production of 1.44 million tons in 2011
(http://faostat.fao.org).
Despite the availability of an extensive expressed sequence tag
(EST) database15 and several genetic maps16,17, whole-genome
sequence resources for the kiwifruit, which are critical for its
breeding and improvement, are very limited. Kiwifruit belongs to
the order Ericales in the asterid lineage. Currently, no genomes
have been sequenced for species in the Ericales and in the asterid
lineage only the genomes of Solanaceae species in the order
Euasterids I, including the tomato18 and potato19, have been
sequenced.
Here we sequence and analyse the genome of a heterozygous
kiwifruit, ‘Hongyang’ (A. chinensis), which is widely grown in
China. The availability of this genome sequence not only provides
insight into the underlying molecular basis of specific agronomi-
cally important traits of kiwifruit and its wild relatives but
also presents a valuable resource for elucidating evolutionary
processes in the asterid lineage.
Results
Genome sequencing and assembly. One female individual of a
Chinese kiwifruit cultivar ‘Hongyang’ was selected for whole-
genome sequencing. ‘Hongyang’ is a heterozygous diploid
(2n = 2x = 58) that is derived from clonally selected wild
germplasm in central China and has not been subjected to further
selection and breeding20. Its oval shaped fruit has a hairy,
greenish-brown skin, a slight green or golden outer pericarp and a
red-flesh inner pericarp with rows of tiny, black, edible seeds. Its
fruit is highly nutritional containing abundant levels of ascorbic
acid (vitamin C), carotenoids, flavonoids and anthocyanins
(Supplementary Table S1).
A total of 105.8 Gb high-quality sequences (Supplementary
Table S2) were generated using the Illumina HiSeq 2000 system.
This represented approximately 140× coverage of the kiwifruit
genome with an estimated size of 758 Mb based on the flow
cytometry analysis21. De novo assembly of these sequences
employing Allpaths-LG22 yielded a draft genome of 616.1Mb,
representing 81.3% of the kiwifruit genome (Table 1). The
genome assembly consists of 21,713 contigs and 5,110 scaffolds
(>2 kb), with N50 sizes of 58.8 and 646.8 kb for contigs and
scaffolds, respectively (Table 1). To determine scaffold placement
on kiwifruit pseudochromosomes, a high-density genetic map
was constructed using an F1 population derived from the cross
between ‘Hongyang-MS-01’ (male) and A. eriantha ‘Jiangshan-
jiao’ (female). Genotyping of each individual in the F1 population
was determined using SLAF-seq23. The final map spanned
5,504.5 cM across 29 linkage groups and was composed of
4,301 single nucleotide polymorphism (SNP) markers, with a
mean marker density of 1.28 cM per marker. Using 3,379 markers
that were uniquely aligned to the assembled scaffolds, a total of
853 scaffolds were anchored to the 29 kiwifruit pseudo-
chromosomes, comprising 73.4% (452.4Mb) of the kiwifruit
genome assembly (Fig. 1). Of the 853 anchored scaffolds, 491
could be oriented (333.6Mb, 73.7% of the anchored sequences).
The GC content of the assembled genome was 35.2%, similar to
that of the genomes of tomato (34%)18 and potato (34.8%)19,
which to date are the evolutionarily closest species of kiwifruit
that have genomes sequenced (Supplementary Fig. S1). Further-
more, we detected heterozygous sites by mapping the reads back
to the assembled genome, revealing a high level of heterozygosity
(0.536%) in ‘Hongyang’, which was further supported by the
K-mer distribution of the genomic reads (Supplementary Fig. S2).
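
The summary statistics quoted in this paragraph follow from simple arithmetic, which can be sketched as below. This is a minimal illustration only: the function names `n50` and `fold_coverage` and the toy length list are assumptions introduced here, not part of the assembly pipeline, and the N50 definition used is the common one (the length at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total).

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


def fold_coverage(total_bases, genome_size):
    """Sequencing depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


# Hypothetical toy contig lengths (not kiwifruit data).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)

# Figures taken from the text: 105.8 Gb of reads, 758 Mb estimated genome
# size, 616.1 Mb assembled.
depth = fold_coverage(105.8e9, 758e6)
assembled_fraction = 616.1e6 / 758e6
```

With the figures reported above, `depth` comes out close to 140 and `assembled_fraction` close to 0.813, in line with the stated 140× coverage and 81.3% of the estimated genome size.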
To evaluate the quality of the assembled genome, an
independent Illumina library with an insert size of 500 bp was
constructed and sequenced. The resulting reads were mapped to
the assembled genome to identify homozygous SNPs and
structure variations (SVs), which represent potential base errors
and misassemblies in the genome, respectively. The analyses
indicated that the assembly has a single base error rate of 0.03%,
which is comparable to the rate of the tomato genome (0.02%)18.
In addition, only 24 SVs were identified (Supplementary
Table S3), indicating a very low frequency of misassemblies in
the genome. The quality of the assembly was further assessed by
aligning the EST sequences from the genus Actinidia15 to the
assembled genome. The analysis indicated that the assembly
contained 97.3% of the 81,956 ESTs derived from A. chinensis,
90.9% of the 83,924 ESTs from A. deliciosa and 94.3% of the
19,574 ESTs from A. eriantha (Supplementary Table S4).
Together, these analyses supported the high quality of our
genome assembly.
Repetitive sequence annotation. We identified a total of
~222 Mb (36% of the assembly) of repetitive sequences in the
kiwifruit genome. The content of repetitive sequences in the
kiwifruit genome appears to be much less than that in tomato
(63.2%)18 and potato (62.2%)19, whereas it is more than that in
Arabidopsis (14%)24 and Thellungiella parvula (7.5%)25.
Comparative analysis with known repeats in Repbase26 and
plant repeat database27 indicated that 68.8% of the repetitive

Table 1 | Kiwifruit genome assembly statistics.

                          Contig size (bp)   Contig number   Scaffold size (bp)   Scaffold number
N90                       11,574             11,427          122,658              1,053
N80                       23,123             7,788           256,012              725
N70                       34,283             5,660           376,247              530
N60                       45,944             4,137           496,035              387
N50                       58,864             2,977           646,786              280
Largest                   423,496            -               3,410,229            -
Average                   22,612             -               80,035               -
Total size                604,217,145        -               616,114,069          -
Total number (>200 bp)    -                  26,721          -                    7,698
Total number (>2 kb)      -                  21,713          -                    5,106

sequences in the kiwifruit genome could be classified and
annotated. A large portion of the unclassified repetitive
sequences might be kiwifruit-specific. Retrotransposons made
up the majority of the repeats, among which the long terminal
repeat (LTR) family was the most abundant (~13.4% of the
assembly). Within the LTR family, Copia and Gypsy represented
the two most abundant subfamilies. In addition, DNA
transposons accounted for ~4.75% of the genome assembly
(Supplementary Table S5).
Gene prediction and annotation. Using the EST sequences of
A. chinensis15 and RNA-seq data we have generated from
A. chinensis leaf and fruits (Supplementary Fig. S3), integrated
with ab initio gene predictions and homologous sequence
searching, we predicted a total of 39,040 protein-encoding genes
with an average coding sequence length of 1,073 bp and 4.6 exons
per gene. Among these genes, 74.5 and 82.3% had significant
similarities to sequences in the non-redundant nucleotide and
protein databases in NCBI, respectively. Additionally, 37.4, 66.9,

Figure 1 | Anchoring the Hongyang genome assembly to the diploid kiwifruit reference genetic map. ‘Hongyang’ (A. chinensis) genome scaffolds (blue)
were anchored to the linkage groups (yellow; LG1 to LG29) of the A. chinensis × A. eriantha genetic map with 3,379 SNP markers.

81.9, 61.3 and 81.8% could be annotated using COG, GO,
TrEMBL, Swissprot and KEGG databases, respectively. Furthermore,
conserved domains in >65.5% of the predicted protein sequences
could be identified by comparing them against the InterPro and Pfam
databases. In addition, a total of 2,438 putative
transcription factors that are distributed in 58 families and 447
transcriptional regulators distributed in 22 families were identi-
fied in the kiwifruit genome (Supplementary Data 1 and 2). In
addition to protein-coding genes, 293 rRNAs, 511 tRNAs, 236
miRNAs, 91 snRNAs and 307 snoRNAs were also identified.
Comparative analyses between kiwifruit and other plants.
Comparative analyses of the complete gene sets of kiwifruit,
Arabidopsis, rice, grape and tomato were performed. A total of
25,381 genes in the kiwifruit genome were assigned into 13,100
orthologous gene clusters. Among these clusters, 7,985 are com-
mon to all five species, whereas 885 are confined to eudicots
(kiwifruit, Arabidopsis, grape and tomato). Within the eudicots,
337 gene clusters are restricted to plants with fleshy fruits
(kiwifruit, grape and tomato), whereas 1,455 clusters contain genes
only from kiwifruit (Fig. 2). Further functional characterization
based on GO terms revealed that the 337 fleshy fruit-specific
families were highly enriched with genes associated with fruit
quality, including those related to flavonoid, phenylpropanoid,
anthocyanin and oligosaccharide metabolism (Supplementary
Table S6). The kiwifruit-specific families were significantly enri-
ched with genes related to pollen tube reception and specification
of floral organ identity (Supplementary Table S7), both of which
are consistent with the high diversity of sex expression found in
kiwifruit17,28.
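
The cluster counts quoted above (clusters common to all five species, confined to eudicots, restricted to fleshy-fruited plants, or specific to kiwifruit) amount to tallying each orthologous cluster by the exact set of species that contribute genes to it. A minimal sketch of that tally follows; the function name `venn_counts`, the cluster identifiers and the toy input are hypothetical, and the clustering step itself is not reproduced here.

```python
from collections import Counter

EUDICOTS = frozenset({"kiwifruit", "Arabidopsis", "grape", "tomato"})
ALL_SPECIES = EUDICOTS | {"rice"}
FLESHY_FRUIT = frozenset({"kiwifruit", "grape", "tomato"})


def venn_counts(clusters):
    """Count clusters by the exact set of species represented in them.

    `clusters` maps a cluster identifier to an iterable of species names
    that have at least one gene in that cluster.
    """
    counts = Counter()
    for species in clusters.values():
        counts[frozenset(species)] += 1
    return counts


# Hypothetical toy clusters (identifiers and membership are made up).
toy_clusters = {
    "c1": ALL_SPECIES,
    "c2": EUDICOTS,
    "c3": FLESHY_FRUIT,
    "c4": {"kiwifruit"},
    "c5": {"kiwifruit"},
}
counts = venn_counts(toy_clusters)
kiwifruit_specific = counts[frozenset({"kiwifruit"})]
```

Each region of a five-way Venn diagram corresponds to one key of `counts`; for example, `counts[EUDICOTS]` is the number of clusters found in all four eudicots but not in rice.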
Among plants with sequenced genomes, tomato has the
closest evolutionary relationship to kiwifruit. Consequently, the
largest number of gene clusters (10,849) were shared between
kiwifruit and tomato, representing 82.8 and 82.5% of their
individual total gene clusters, respectively. We then calculated the
evolutionary rate for each of the orthologous gene pairs of
kiwifruit–grape, Arabidopsis–grape and tomato–grape. The
average ratio (ω) of non-synonymous (Ka) versus synonymous
(Ks) nucleotide substitution rate in kiwifruit (0.064) was found to
be greater than that in Arabidopsis (0.055) and tomato (0.052),
indicating that diversifying selection may have been stronger in
kiwifruit.
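
As an illustration of the quantity being compared, the average ω over a set of orthologous gene pairs can be sketched as follows. This assumes that Ka and Ks have already been estimated for each pair by an external tool and that the average is the mean of per-pair ratios; the text does not state which averaging was used, so both the function `mean_omega` and that choice are assumptions made here, and the input values are invented.

```python
def mean_omega(ka_ks_pairs):
    """Mean of per-pair Ka/Ks ratios.

    Pairs with Ks == 0 are skipped because the ratio is undefined there.
    """
    ratios = [ka / ks for ka, ks in ka_ks_pairs if ks > 0]
    if not ratios:
        raise ValueError("no pair with Ks > 0")
    return sum(ratios) / len(ratios)


# Hypothetical (Ka, Ks) estimates for three orthologous pairs.
toy_pairs = [(0.01, 0.2), (0.03, 0.5), (0.0, 0.0)]
omega = mean_omega(toy_pairs)
```

In this toy input the third pair is dropped (Ks is zero) and the remaining ratios, 0.05 and 0.06, average to about 0.055.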
Whole-genome duplication in kiwifruit. Whole-genome dupli-
cation (WGD) followed by gene loss has been found in most
eudicots and is regarded as the major evolutionary force that gives
rise to gene neofunctionalization in both plants and animals.
Within the kiwifruit genome, 588 paralogous relationships were
identified, covering 46% of the genome. We then compared the
kiwifruit genome sequence to that of tomato, potato and grape,
respectively, and identified a large number of syntenic regions
(Fig. 3a). The distribution of 4DTv (transversions at fourfold
degenerate sites) and Ks values of homologous pairs in these
syntenic regions, as well as the mean Ks values of individual
syntenic blocks indicated that an ancient WGD (the γ event),
which is shared by core eudicots, and two recent WGD events
had occurred in the evolutionary history of kiwifruit (Fig. 3b and
Supplementary Fig. S4a–c). In addition, using the method
described in Simillion et al.29, we were able to group kiwifruit
syntenic blocks into three age classes based on their mean Ks
values, further supporting the ancient triplication and the two
recent WGD events in kiwifruit (Supplementary Fig. S4d). The
two recent WGD events, Ad-α and Ad-β, were estimated to have
occurred ~26.7 and 72.9–101.4 million years ago, respectively,
based on Ks of paralogous genes. These results are consistent with
previous findings based on the EST analysis30. Both Ad-α and
Ad-β events occurred after the kiwifruit–tomato or kiwifruit–
potato divergence (Fig. 3b).
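
For readers unfamiliar with 4DTv, the statistic can be sketched as the fraction of fourfold degenerate third-codon positions at which two aligned coding sequences differ by a transversion. The sketch below is an uncorrected version under stated assumptions: the standard genetic code, codon-aligned sequences of equal length, and no multiple-substitution correction (published analyses often apply one). The function name `four_dtv` and the toy sequences are hypothetical and do not reproduce the analysis behind Fig. 3b.

```python
# Codon prefixes whose third position is fourfold degenerate under the
# standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def four_dtv(cds_a, cds_b):
    """Uncorrected 4DTv for two codon-aligned coding sequences."""
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be codon-aligned and equal in length")
    sites = 0
    transversions = 0
    for i in range(0, len(cds_a), 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        # Only codons sharing a fourfold degenerate prefix are comparable.
        if codon_a[:2] != codon_b[:2] or codon_a[:2] not in FOURFOLD_PREFIXES:
            continue
        third_a, third_b = codon_a[2], codon_b[2]
        if third_a not in BASES or third_b not in BASES:
            continue  # skip gaps and ambiguous bases
        sites += 1
        # A transversion swaps a purine for a pyrimidine or vice versa.
        if (third_a in PURINES) != (third_b in PURINES):
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable fourfold degenerate sites")
    return transversions / sites


# Hypothetical toy alignment: three comparable sites, two transversions.
toy_a = "CTAGGCATGCCT"
toy_b = "CTCGGTATGCCA"
toy_4dtv = four_dtv(toy_a, toy_b)
```

Because fourfold degenerate sites do not change the encoded amino acid, substitutions there are assumed to accumulate in a roughly neutral manner, which is why peaks in the 4DTv (or Ks) distribution of paralogous pairs are read as duplication events of different ages.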
The relationship of orthologous genes in syntenic blocks
between kiwifruit and grape was further analysed. We found that
55.8% of kiwifruit gene models are in blocks that are orthologous
to one grape region, collectively covering 73.6% of the grape gene
space. Among these grape genomic regions, 19.1% have one
orthologous region in kiwifruit, 20.5% have two, 23% have three,
19.3% have four, 11% have five, 4.4% have six, 2.2% have seven
and a few (<1%) have eight, nine or ten. This pattern is similar
to that of Arabidopsis31, whose genome has also undergone two
WGDs (At-α and At-β) following the ancient γ triplication. These
data further supported the occurrence of the two recent WGD
events in kiwifruit, followed by extensive gene loss.
Gene expansion and neofunctionalization in kiwifruit. Kiwifruit
is well known for its high nutritional value because of its
extremely high ascorbic acid (vitamin C) content.
We investigated and compared genes involved in the ascorbic
acid biosynthesis and recycling pathway in kiwifruit, Arabidopsis,
grape, sweet orange and tomato. Although we found no expan-
sion in genes from the L-galactose pathway that forms the major
route to vitamin C biosynthesis in kiwifruit32, we did find that
other gene families involved in ascorbic acid biosynthesis,
including Alase (aldonolactonase), APX (L-ascorbate
peroxidase) and MIOX (myo-inositol oxygenase), and genes
responsible for ascorbic acid regeneration from its oxidized
forms, including MDHAR (monodehydroascorbate reductase),
exhibited an expansion in kiwifruit (Supplementary Table S8
and Supplementary Fig. S5). Phylogenetic analyses of genes in
these expanded families, combined with results from the synteny
analyses, indicated that the two recent WGDs in kiwifruit resulted
in additional gene family members that evolved to contribute
to the high vitamin C accumulation in the fruit of kiwifruit
(Fig. 4 and Supplementary Figs S6–S8). Most of the expanded
genes in the ascorbic acid biosynthesis and recycling pathway
were expressed in both leaves and fruits of kiwifruit, with a large
portion being more highly expressed in fruits (especially immature
fruits) than in leaves (Supplementary Table S9). This is consistent

Figure 2 | Venn diagram of orthologous gene families. Five species
(kiwifruit, Arabidopsis, grape, tomato and rice) were used to generate the
Venn diagram based on the gene family cluster analysis.

with the high level of ascorbic acid in both fruits and leaves of
‘Hongyang’ and a higher level in immature fruits (Supplementary
Table S1).
Kiwifruit also contains high levels of other important
nutritional compounds including carotenoids, flavonoids and
chlorophylls (Supplementary Table S1). Compared with Arabi-
dopsis, grape, sweet orange and tomato, expansions of gene