Functional and comparative genomic analyses of an
operon involved in fructooligosaccharide utilization
by Lactobacillus acidophilus
Rodolphe Barrangou*, Eric Altermann*, Robert Hutkins†, Raul Cano‡, and Todd R. Klaenhammer*§
*Genomic Sciences Program and Southeast Dairy Foods Research Center, North Carolina State University, Raleigh, NC 27695; †Department of Food Science
and Technology, University of Nebraska, Lincoln, NE 68583-0919; and ‡California Polytechnic State University, Environmental Biotechnology Institute,
San Luis Obispo, CA 93407
Contributed by Todd R. Klaenhammer, May 8, 2003
Lactobacillus acidophilus is a probiotic organism that displays the
ability to use prebiotic compounds such as fructooligosaccharides
(FOS), which stimulate the growth of beneficial commensals in the
gastrointestinal tract. However, little is known about the mecha-
nisms and genes involved in FOS utilization by Lactobacillus spe-
cies. Analysis of the L. acidophilus NCFM genome revealed an msm
locus composed of a transcriptional regulator of the LacI family, a
four-component ATP-binding cassette (ABC) transport system, a
fructosidase, and a sucrose phosphorylase. Transcriptional analysis
of this operon demonstrated that gene expression was induced by
sucrose and FOS but not by glucose or fructose, suggesting some
specificity for nonreadily fermentable sugars. Additionally, expres-
sion was repressed by glucose but not by fructose, suggesting
catabolite repression via two cre-like sequences identified in the
promoter–operator region. Insertional inactivation of the genes
encoding the ABC transporter substrate-binding protein and the
fructosidase reduced the ability of the mutants to grow on FOS.
Comparative analysis of gene architecture within this cluster re-
vealed a high degree of synteny with operons in Streptococcus
mutans and Streptococcus pneumoniae. However, the association
between a fructosidase and an ABC transporter is unusual and may
be specific to L. acidophilus. This is a description of a previously
undescribed gene locus involved in transport and catabolism of
FOS compounds, which can promote competition of beneficial
microorganisms in the human gastrointestinal tract.
The ability of select intestinal microbes to use substrates nondigested by the host may play an important role in their
ability to successfully colonize the mammalian gastrointestinal
(GI) tract. A diverse carbohydrate catabolic potential is associ-
ated with cariogenic activity of Streptococcus mutans in the oral
cavity (1), adaptation of Lactobacillus plantarum to a variety of
environmental niches (2), and residence of Bifidobacterium
longum in the colon (3), illustrating the competitive benefits of
complex sugar utilization. Prebiotics are nondigestible food
ingredients that selectively stimulate the growth and/or activity
of beneficial microbial strains residing in the host intestine (4).
Among sugars that qualify as prebiotics, fructooligosaccharides
(FOS) are a diverse family of fructose polymers used commer-
cially in food products and nutritional supplements that vary in
length and can be either derivatives of simple fructose polymers
or fructose moieties attached to a sucrose molecule. The linkage
and degree of polymerization can vary widely (usually between
2 and 60 moieties), and several names such as inulin, levan,
oligofructose, and neosugars are used accordingly. The average
daily intake of such compounds, originating mainly from wheat,
onion, artichoke, banana, and asparagus (4, 5), is fairly significant,
with ≈2.6 g of inulin and 2.5 g of oligofructose consumed
in the average American diet (5). FOS are not digested in the
upper GI tract and can be degraded by a variety of lactic acid
bacteria (6–9), residing in the human lower GI tract (4, 10). FOS
and other oligosaccharides have been shown in vivo to benefi-
cially modulate the composition of the intestinal microbiota
and specifically to increase bifidobacteria and lactobacilli (4, 10,
11). A variety of Lactobacillus acidophilus strains in particular
have been shown to use several polysaccharides and oligo-
saccharides such as arabinogalactan, arabinoxylan, and FOS
(6, 9). Despite the recent interest in FOS utilization, little
information is available about the metabolic pathways and
enzymes responsible for transport and catabolism of such com-
plex sugars in lactobacilli.
In silico analysis of a particular locus within the L. acidophilus
North Carolina Food Microbiology (NCFM) genome revealed
the presence of a gene cluster encoding proteins potentially
involved in prebiotic transport and hydrolysis. This specific
cluster was analyzed computationally and functionally to reveal
the genetic basis for FOS transport and catabolism by L.
acidophilus NCFM.
Materials and Methods
Bacterial Strain and Media Used in This Study. The strain used in this
study is L. acidophilus NCFM (12). Cultures were propagated at
37°C, aerobically in deMan, Rogosa, Sharpe broth (Difco). A
semisynthetic medium consisted of: 1% bactopeptone (wt/vol)
(Difco), 0.5% yeast extract (wt/vol) (Difco), 0.2% dipotassium
phosphate (wt/vol) (Fisher), 0.5% sodium acetate (wt/vol)
(Fisher), 0.2% ammonium citrate (wt/vol) (Sigma), 0.02% magnesium
sulfate (wt/vol) (Fisher), 0.005% manganese sulfate
(wt/vol) (Fisher), 0.1% Tween 80 (vol/vol) (Sigma), 0.003%
bromocresol purple (vol/vol) (Fisher), and 1% sugar (wt/vol).
The carbohydrates added were either glucose (dextrose) (Sig-
ma), fructose (Sigma), sucrose (Sigma), or FOS. Two types of
complex sugars were used as FOS: a GFn mix (manufactured by
R. Hutkins, University of Nebraska), consisting of glucose
monomers linked α-1,2 to two, three, or four fructosyl moieties
linked β-2,1, to form kestose (GF2), nystose (GF3), and
fructofuranosyl-nystose (GF4), respectively; and an Fn mix, Raftilose,
derived from inulin hydrolysis (Orafti). Without carbohydrate
supplementation, the semisynthetic medium was unable to sustain
bacterial growth above OD600 nm ≈ 0.2.
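The sugar concentrations in this study are given as percent weight per volume, whereas the legend of Fig. 2 also quotes millimolar values for glucose and fructose. The following is a minimal sketch of that conversion. The molar masses are standard textbook values rather than numbers taken from this paper, and the function name is hypothetical.

```python
# Standard molar masses in g/mol (assumed here; not stated in the paper).
MOLAR_MASS = {"glucose": 180.16, "fructose": 180.16, "sucrose": 342.30}


def percent_wv_to_mM(percent_wv: float, molar_mass: float) -> float:
    """Convert % (wt/vol), i.e. grams per 100 ml, to millimolar."""
    grams_per_liter = percent_wv * 10.0          # g/100 ml -> g/liter
    return grams_per_liter / molar_mass * 1000.0  # mol/liter -> mmol/liter


if __name__ == "__main__":
    for pct in (0.1, 0.5, 1.0):
        mM = percent_wv_to_mM(pct, MOLAR_MASS["glucose"])
        print(f"{pct}% (wt/vol) glucose is about {mM:.2f} mM")
```

For glucose or fructose, the results fall close to the rounded 5.5, 28, and 55 mM quoted in the legend of Fig. 2. The conversion does not apply directly to the FOS mixes, which contain species of several chain lengths.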
Computational Analysis of the Putative Multiple Sugar Metabolism
(msm) Operon. A 10-kbp DNA locus containing a putative msm
operon was identified from the L. acidophilus NCFM genome
sequence. ORF predictions were carried out by four computa-
tional programs: GLIMMER (13, 14), CLONE MANAGER (Scientific
and Educational Software, Durham, NC), the National Center
for Biotechnology Information ORF finder (www.ncbi.nlm.nih.gov/gorf/gorf.html),
and GENOMAX (InforMax, Frederick,
Abbreviations: ABC, ATP-binding cassette; cre, catabolite response element; FOS, fructoo-
ligosaccharides; MSM, multiple sugar metabolism; PTS, phosphotransferase system; LGT,
lateral gene transfer; GI, gastrointestinal; NCFM, North Carolina Food Microbiology.
Data deposition: The sequences reported in this paper have been deposited in the GenBank
database (accession nos. AY172019, AY172020, and AY177419).
§To whom correspondence should be addressed. E-mail: klaenhammer@ncsu.edu.
www.pnas.org/cgi/doi/10.1073/pnas.1332765100 PNAS | July 22, 2003 | vol. 100 | no. 15 | 8957–8962
MICROBIOLOGY
MD). GLIMMER was previously trained with a set of L. acidophilus
genes available in public databases. The predicted ORFs were
translated into putative proteins that were submitted to BLASTP
analysis (15).
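The idea behind ORF calling can be illustrated with a naive six-frame scan. This is only a sketch under simplifying assumptions: it accepts ATG as the sole start codon, requires an in-frame stop, and uses an arbitrary length cutoff. It does not reproduce the models used by GLIMMER, CLONE MANAGER, the ORF finder, or GENOMAX, and the function names are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30) -> List[Tuple[str, int, int]]:
    """Scan all six reading frames for ATG...stop ORFs.

    Returns (strand, start, end) tuples. Coordinates are 0-based, end
    exclusive, and relative to the strand that was scanned. The length
    cutoff counts codons including the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs
```

Translated products of such candidate ORFs would then be compared against protein databases, as was done here with BLASTP.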
RNA Isolation and Analysis. Total RNA was isolated by using
TRIzol (GIBCO/BRL), following the supplier’s instructions.
Cells in the exponential phase were harvested by centrifugation
(2 min, 15,800 × g) and cooled on ice. Pellets were resuspended
in TRIzol by vortexing and underwent five cycles of 1-min bead
beating and 1 min on ice. Nucleic acids were subsequently
purified by using three chloroform extractions and precipitated
by using isopropanol and centrifugation for 10 min at 11,600 × g.
The RNA pellet was washed with 70% ethanol and resuspended
into diethyl pyrocarbonate-treated water. RNA samples
were treated with DNase I according to the supplier’s instructions
(Boehringer Mannheim). First-strand cDNA was synthesized
by using the Invitrogen RT-PCR kit according to the
supplier’s instructions. cDNA products were subsequently am-
plified by using PCR with primers internal to genes of interest.
For RNA slot blots, RNA samples were transferred to nitrocel-
lulose membranes (Bio-Rad) using a slot-blot apparatus (Bio-
Dot SF, Bio-Rad), and the RNAs were UV crosslinked to the
membranes. Blots were probed with DNA fragments generated
by PCR that had been purified from agarose gels (GeneClean III
kit, Midwest Scientific, St. Louis). Probes were labeled with
α-32P with the Amersham Pharmacia Multiprime Kit and consisted
of 700- and 750-bp fragments internal to the msmE and
bfrA genes, respectively. Hybridization and washes were carried
out according to the supplier’s instructions (Bio-Dot Microfil-
tration Apparatus, Bio-Rad), and radioactive signals were de-
tected by using a Kodak Biomax film. Primers are listed in Table
2, which is published as supporting information on the PNAS
web site, www.pnas.org.
Comparative Genomic Analysis. A gene cluster bearing a fructosi-
dase gene was selected after computational data-mining of the
L. acidophilus NCFM genome. Additionally, microbial clusters
containing fructosidase EC 3.2.1.26 orthologs or bearing an
ATP-binding cassette (ABC) transport system associated with
an α-galactosidase EC 3.2.1.22 were selected from public databases
(National Center for Biotechnology Information, The
Institute for Genomic Research). The sucrose operon is a widely
distributed cluster consisting of either three or four elements,
namely: a regulator, a sucrose phosphotransferase (PTS) trans-
porter, a sucrose hydrolase, and occasionally a fructokinase. Two
gene cluster alignments were generated: (i) a PTS alignment
representing similarities over the sucrose operon, bearing a PTS
transport system associated with a sucrose hydrolase; and (ii) an
ABC alignment representing similarities over the multiple sugar
metabolism cluster, bearing an ABC transport system usually
associated with a galactosidase. Sequence information is avail-
able in Table 3, which is published as supporting information on
the PNAS web site.
Phylogenetic Trees. Nucleotide and protein sequences were
aligned computationally by using the CLUSTALW algorithm (16).
The multiple alignment outputs were used for generating un-
rooted neighbor-joining phylogenetic trees by using MEGA2 (17).
In addition to a phylogenetic tree derived from 16S rRNA genes,
trees were generated for ABC transporters, PTS transporters,
transcription regulators, fructosidases, and fructokinases.
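Neighbor-joining trees are built from a matrix of pairwise distances between aligned sequences. The sketch below computes one simple distance, the proportion of differing sites (p-distance), from an existing alignment. The distance model actually used in MEGA2 for this study is not stated in the text, so the choice of p-distance and all names in the code are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues over columns where neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # pairwise deletion of gapped columns
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Return p-distances for every unordered pair of sequence names."""
    names = sorted(alignment)
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m, n in combinations(names, 2)
    }
```

A matrix of this kind, computed from the CLUSTALW output, is the sort of input a neighbor-joining procedure clusters into an unrooted tree.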
Gene Inactivation. Gene inactivation was conducted by site-
specific plasmid integration into the L. acidophilus chromosome
via homologous recombination (18). Internal fragments of the
msmE and bfrA genes were cloned into pORI28 by using
Escherichia coli as a host (19), and the constructs were subse-
quently purified and transformed into L. acidophilus NCFM.
The ability of the mutant strains to grow on a variety of
carbohydrate substrates was investigated by using growth curves.
Strains were grown on semisynthetic medium supplemented with
0.5% wt/vol carbohydrate.
Results
Computational Analysis of the msm Operon. Analysis of the msm
locus using four ORF-calling programs revealed the presence of
seven putative ORFs. Because most of the encoded proteins
were homologous to those of the msm operon present in S.
mutans (20), a similar gene nomenclature was used. The analysis
of the predicted ORFs suggested the presence of a transcrip-
tional regulator of the LacI repressor family, MsmR; a four-
component transport system of the ABC family, MsmEFGK;
and two enzymes involved in carbohydrate metabolism, namely
a fructosidase EC 3.2.1.26, BfrA; and a sucrose phosphorylase
EC 2.4.1.7, GtfA. A putative Shine–Dalgarno sequence 5′-AGGAGG-3′
was found within 10 bp upstream of the msmE start
codon. A dyad symmetry analysis revealed the presence of two
stem–loop structures that could act as putative Rho-independent
transcriptional terminators: one between msmK and gtfA (between
base pairs 6,986 and 7,014), free energy −13.6 kcal·mol⁻¹,
and one 20 bp downstream of the last gene of the putative operon
(between base pairs 8,500 and 8,538), free energy −16.5
kcal·mol⁻¹. The operon structure is shown in Fig. 1.
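A dyad symmetry search looks for a stretch of DNA followed, after a short loop, by its own reverse complement, which is the pattern that can fold into a stem–loop. The sketch below finds only perfect inverted repeats of a fixed arm length and does not estimate free energy. The program used for the analysis reported here is not named in the text, so the parameters and function names are hypothetical.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq: str, stem: int = 6, min_loop: int = 3,
                  max_loop: int = 8) -> List[Tuple[int, int, int]]:
    """Find perfect inverted repeats that could form a stem-loop.

    Returns (start, end, loop_length) tuples with 0-based start and
    exclusive end covering left arm, loop, and right arm.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        right_arm = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # where the right arm would begin
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == right_arm:
                hits.append((i, j + stem, loop))
    return hits
```

Candidate terminators found this way would still need a thermodynamic evaluation, such as the free energies reported above, and a check for a downstream T-rich tract.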
The regulator contained two distinct domains: a DNA-binding
domain at the N terminus with a predicted helix-turn-helix motif
(pfam00354), and a sugar-binding domain at the C terminus
(pfam00532). The transporter elements consisted of a periplas-
mic solute-binding protein (pfam01547), two membrane-
spanning permeases (pfam00528), and a cytoplasmic nucleotide-
binding protein (pfam 00005), characteristic of the different
subunits of a typical ABC transport system (21). A putative
anchoring motif LSLTG was present at the N terminus of the
substrate-binding protein. Each permease contained five trans-
membrane regions predicted computationally (22). Analyses of
ABC transporters in recently sequenced microbial genomes have
defined four characteristic sequence motifs (23, 24). The pre-
dicted MsmK protein included all four ABC conserved motifs,
namely: Walker A: GPSGCGKST (consensus GxxGxGKST or
[AG]xxxxGK[ST]); Walker B: IFLMDEPLSNLD (consensus
hhhhDEPT or DExxxxxD); ABC signature sequence: LSGG;
and Linton and Higgins motif: IAKLHQ (consensus hhhhH±,
with h, hydrophobic and ±, charged residues). The putative
fructosidase showed high similarity to glycosyl hydrolases (pfam
00251). The putative sucrose phosphorylase shared 63% residue
identity with that of S. mutans.
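The consensus motifs quoted above can be checked mechanically by translating them into regular expressions, with each "x" read as any residue. The sketch below does this for the Walker A consensus [AG]xxxxGK[ST], the Walker B consensus DExxxxxD, and the LSGG signature, and applies them to the MsmK fragments given in the text. The regex translations and the function name are illustrative assumptions, and the full MsmK sequence is not reproduced here.

```python
import re
from typing import Dict, List, Tuple

# Consensus strings from the text, with "x" rewritten as "." (any residue).
MOTIFS = {
    "Walker A": r"[AG].{4}GK[ST]",
    "Walker B": r"DE.{5}D",
    "ABC signature": r"LSGG",
}


def find_motifs(protein: str) -> Dict[str, List[Tuple[int, str]]]:
    """Return {motif name: [(0-based start, matched text), ...]}."""
    protein = protein.upper()
    return {
        name: [(m.start(), m.group()) for m in re.finditer(pattern, protein)]
        for name, pattern in MOTIFS.items()
    }


if __name__ == "__main__":
    # Fragments quoted in the text for the predicted MsmK protein.
    for fragment in ("GPSGCGKST", "IFLMDEPLSNLD", "LSGG"):
        print(fragment, find_motifs(fragment))
```

Short patterns such as these match by chance in many proteins, so hits are meaningful only alongside the domain-level similarity described above.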
Sugar Induction and Coexpression of Contiguous Genes. Transcrip-
tional analysis of the msm operon by using RT-PCR and RNA
slot blots showed that sucrose and both types of oligofructose
(GFn and Fn) were able to induce expression of msmE and bfrA
(Fig. 2A). In contrast, glucose and fructose did not induce
Fig. 1. Operon layout. The start and stop codons are shaded, the putative
ribosome binding site is boxed, and the cre-like elements are underlined.
Terminators are indicated by hairpin structures.
transcription of those genes, suggesting specificity for nonreadily
fermentable sugars and the presence of a regulation system
based on carbohydrate availability. In the presence of both FOS
and readily fermentable sugars, glucose repressed expression of
msmE, even if present at a lower concentration, whereas fructose
did not (Fig. 2B). Analysis of the transcripts induced by oligo-
fructose indicated that all genes within the operon are coex-
pressed (Fig. 6, which is published as supporting information on
the PNAS web site) in a manner consistent with the S. mutans
msm operon (25).
Mutant Phenotype Analysis. The ability of the bfrA (fructosidase)
and msmE (ABC transporter) mutant strains to grow on a variety
of carbohydrates was monitored by both optical density at 600
nm and colony-forming units. The mutants retained the ability
to grow on glucose, fructose, sucrose, galactose, lactose, and
FOS-GFn, in a manner similar to that of the control strain (Fig.
7, which is published as supporting information on the PNAS
web site), a lacZ mutant of the L. acidophilus parental strain also
generated by plasmid integration (18). This strain was chosen
because it also bears a copy of the plasmid used for gene
inactivation integrated in the genome. In contrast, both the bfrA
and msmE mutants halted growth on FOS-Fn prematurely (Fig.
3), likely on exhaustion of simple carbohydrate from the semisynthetic
medium. After one passage, the msmE mutant displayed
slower growth on FOS-Fn, whereas the bfrA mutant could
not grow (Fig. 3). Additionally, terminal cell counts from
overnight cultures grown on FOS-Fn were significantly lower for
the mutants, especially after one passage (Fig. 7).
Comparative Genomic Analyses and Locus Alignments. Comparative
genomic analysis of gene architecture between L. acidophilus, S.
mutans, Streptococcus pneumoniae, Bacillus subtilis, and Bacillus
halodurans revealed a high degree of synteny within the msm
cluster, except for the core sugar hydrolase (Fig. 4A). In contrast,
gene content was consistent, whereas gene order was not well
conserved for the sucrose operon (Fig. 4B). The lactic acid
bacteria exhibit a divergent sucrose operon, where the regulator
and hydrolase are transcribed opposite the transporter and the
fructokinase. In contrast, gene architecture was variable among
the proteobacteria.
Phylogenetic Trees. Phylogenetic trees were generated to investi-
gate whether there was a correlation between protein similarity,
gene architecture, and the phylogenetic relationships of the selected
microorganisms. The phylogenetic relationships were
obtained from 16S ribosomal DNA alignment. All proteobac-
teria appeared distant from the lactic acid bacteria, and the
Clostridium species formed a well defined cluster between
Thermotoga maritima and the bacillales (Fig. 5A).
For the fructosidases, all enzymes obtained from the lactic acid bacteria (LAB)
sucrose operons clustered extremely well together at the left end
of the tree, whereas there was apparent shuffling of the other
three groups (Fig. 5B). The paralogs of those fructosidases in S.
mutans, S. pneumoniae, and L. acidophilus clustered at the
opposite end of the tree. Interestingly, the L. acidophilus fruc-
tosidase was distant from the LAB sucrose hydrolases cluster
and showed strong homology to enzymes experimentally asso-
ciated with oligosaccharide hydrolysis, in organisms such as T.
maritima, Microbacterium laevaniformans, and B. subtilis.
Each component of the ABC transport system clustered
together (Fig. 5C), namely MsmE, MsmF, MsmG, and MsmK for the
substrate-binding protein, the two membrane-spanning proteins, and the
nucleotide-binding unit, respectively. For MsmE, MsmF, and MsmG, three
consistent subclusters were obtained: (i) the two Bacillus species;
(ii) L. acidophilus, S. mutans, and S. pneumoniae from the
Fig. 2. Sugar induction and repression. (A) Transcriptional induction of the
msmE and bfrA genes, monitored by RT-PCR (Upper) and RNA slot blots
(Lower). Cells were grown on glucose (Glc), fructose (Fru), sucrose (Suc), FOS
GFn, and FOS Fn. Chromosomal DNA was used as a positive control for the
probe. (B) Transcriptional repression analysis of msmE and bfrA by variable
levels of Glc and Fru: 0.1% (5.5 mM), 0.5% (28 mM), and 1.0% (55 mM), in the
presence of 1% Fn. Cells were grown in the presence of Fn until OD600
approximated 0.5–0.6, glucose was added, and cells were propagated for an
additional 30 min.
Fig. 3. Growth curves. The two mutants, bfrA (Upper) and msmE (Lower),
were grown on semisynthetic medium supplemented with 0.5% wt/vol carbohydrate:
fructose (●), GFn (○), Fn (□), after one passage on Fn (■). The lacZ
mutant grown on Fn was used as control (▽).
operons bearing a galactosidase; and (iii) L. acidophilus and S.
pneumoniae from the operons bearing a fructosidase.
For the PTS transporters, the clustering did not proceed
according to phylogeny, especially for lactic acid bacteria, which
formed two separate clusters (Fig. 5D). The two distant trans-
porters at the bottom of the tree are non-PTS sucrose trans-
porters of the major facilitator family of transporters, as sug-
gested by their initial annotation.
All regulators were repressors, with the exception of those
regulators of L. acidophilus, S. pneumoniae, and S. mutans
clustering at the bottom of the tree (Fig. 5E), which activate
transcription of operons bearing an ABC transport system
associated with a galactosidase (20). In contrast, the msm
regulators for both S. pneumoniae and L. acidophilus seemed to
be repressors similar to those of the sucrose operon (Fig. 5E). The
helix-turn-helix DNA-binding motif of the regulator was very
well conserved among selected regulator