Research ArticleMicrobiome

Natural history of the infant gut microbiome and impact of antibiotic treatment on bacterial strain diversity and stability

See allHide authors and affiliations

Science Translational Medicine  15 Jun 2016:
Vol. 8, Issue 343, pp. 343ra81
DOI: 10.1126/scitranslmed.aad0917

Elucidating the effects of drugs on bugs

Despite widespread use of antibiotics in children, the effects of antibiotic exposure on the developing infant gut microbiome have remained underexplored. Here, Yassour et al. present a longitudinal study capturing how the gut microbiome responds to and recovers from antibiotic perturbations. Antibiotic-treated children had less stable and less diverse bacterial communities. Antibiotic resistance genes within the guts of these children peaked after antibiotic treatment but generally returned rapidly to baseline. Delivery mode (vaginal versus cesarean) also had strong long-term effects on microbial diversity. These data give insights into the consequences of early life factors such as birth mode and antibiotic treatment on the infant gut microbiome.


The gut microbial community is dynamic during the first 3 years of life, before stabilizing to an adult-like state. However, little is known about the impact of environmental factors on the developing human gut microbiome. We report a longitudinal study of the gut microbiome based on DNA sequence analysis of monthly stool samples and clinical information from 39 children, about half of whom received multiple courses of antibiotics during the first 3 years of life. Whereas the gut microbiome of most children born by vaginal delivery was dominated by Bacteroides species, the four children born by cesarean section and about 20% of vaginally born children lacked Bacteroides in the first 6 to 18 months of life. Longitudinal sampling, coupled with whole-genome shotgun sequencing, allowed detection of strain-level variation as well as the abundance of antibiotic resistance genes. The microbiota of antibiotic-treated children was less diverse in terms of both bacterial species and strains, with some species often dominated by single strains. In addition, we observed short-term composition changes between consecutive samples from children treated with antibiotics. Antibiotic resistance genes carried on microbial chromosomes showed a peak in abundance after antibiotic treatment followed by a sharp decline, whereas some genes carried on mobile elements persisted longer after antibiotic therapy ended. Our results highlight the value of high-density longitudinal sampling studies with high-resolution strain profiling for studying the establishment and response to perturbation of the infant gut microbiome.


A growing number of studies have highlighted the critical role played by commensal bacteria in human health. These studies have largely focused on characterizing healthy adults (1) and finding commonalities among patients suffering from diseases such as inflammatory bowel disease (2, 3), type 2 diabetes (4), obesity (59), metabolic disorders (1015), colorectal cancer (16, 17), liver cirrhosis (18, 19), rheumatoid arthritis (20), and others. Several longitudinal studies have focused on pediatric conditions, including malnutrition (21, 22), type 1 diabetes (23), and asthma (24). More recently, researchers have also begun to explore the impact of external factors such as antibiotic exposure, with studies in small samples of adults suggesting that antibiotic treatment decreases microbial diversity (2527); similar results have been obtained in mice (28).

Fewer studies have sought to characterize the natural history of the gut microbiome in children. Most of these studies have attempted to infer the natural history based on cross-sectional data sets comparing multiple children at a single time point. These cross-sectional observations have reported that birth mode (vaginal delivery versus cesarean section) affects gut microbiome composition in the first 6 months of life (2931) and that the gut microbiome of children matures by the age of 3 years (32). More recently, a longitudinal study analyzing four time points during the first year of life reported that birth mode and nutrition influence gut microbiome composition at the level of bacterial species (33). However, no longitudinal studies have analyzed the developing gut microbiome with dense sampling or detailed analysis at the level of strains within species. In addition, despite widespread use of antibiotics in children, the effect of antibiotic exposure on the developing infant gut microbiome remains unexplored. The high prevalence of antibiotic use and the concurrent increase in antibiotic-resistant bacteria, coupled with our growing appreciation of the microbiota’s role in human health, highlight a critical need to understand the short- and long-term effects of repeated antibiotic treatments on the gut microbiome.

We therefore undertook a natural history study of the infant gut microbiome with unique strengths arising from the combination of dense sampling, multiple clinical variables (including birth mode and antibiotic usage), and strain-level analysis. We collected and analyzed samples from 39 children with an average of 28 samples per child for the first 3 years of life. During this time period, 20 of the children received 9 to 15 antibiotic treatments, whereas the remaining 19 children never received antibiotics. We performed both 16S ribosomal RNA (rRNA) gene and whole-genome shotgun (WGS) sequencing to analyze microbial diversity at all taxonomic levels, including genus, species, and strain. In addition to gaining insight on how the gut microbiome develops when unperturbed by antibiotics, metagenomic sequencing of samples before and after antibiotic exposure allowed us to explore changes in the abundance of antibiotic resistance genes. We found decreased microbial diversity and increased short-term composition changes in the gut microbiomes of antibiotic-treated children. Furthermore, we observed an increased abundance of antibiotic resistance genes after treatment, along with concurrent increases in specific bacteria likely harboring these genes.


A longitudinal study tracks infants in their first 3 years of life

To study the development of the infant gut microbiome and the effect of multiple antibiotic treatments, we collected monthly stool samples from 39 Finnish children aged 2 to 36 months (average of 28 samples per child, total of 1069 samples). We collected multiple clinical metadata variables per child (see Materials and Methods) and focused on a subset of variables (mode of delivery, breast-feeding, and infant diet) that have been reported in previous studies to affect the infant gut microbiome (29, 30, 33). Of the 39 children tested, 19 received no antibiotics (Abx children) and the remaining 20 children received 9 to 15 antibiotic courses in their first 3 years of life (Abx+ children; Fig. 1A). All courses consisted of systemic antibiotics given orally, most commonly to treat otitis media (ear infections, 91% of courses) and respiratory infections (6%). To analyze the composition of the microbial communities in this cohort, we isolated DNA from stool samples and amplified and sequenced the V4 region of the 16S rRNA gene. Sequences were sorted into operational taxonomic units (OTUs) (see Materials and Methods), and the results were integrated to construct genus-level composition maps for all 1069 samples. We identified 142 genera across the samples, 36 of which were found at a relative abundance greater than 1% in at least 25 samples. We selected 240 samples for WGS sequencing (triangles in Fig. 1A) on the basis of two criteria: (i) samples were available from all children at ages 2, 12, 24, and 36 months, and (ii) at least four samples were available before and after selected antibiotic treatments (with minimal additional treatments in this time period). WGS sequencing allowed us to identify bacterial species within the genera (1 to 25 species per genus). In addition, we could often use the WGS data to infer the strain composition within those species (1 to 10 strains per species) by mapping the reads to the species reference sequences and analyzing the frequencies of multiple polymorphism sites across time for each child. Finally, we used the WGS data to determine the presence and to quantify the abundance of specific genes, including antibiotic resistance genes.

Fig. 1. Study design and common features of infant gut microbiota.

(A) Study design showing 1069 samples (gray circles and triangles) collected from 39 children (y axis) over 36 months (x axis), together with information regarding the time and type of antibiotic treatments (colored circles). 16S rRNA gene sequencing was performed for all samples (gray circles and triangles), and 240 samples were selected for metagenomic sequencing (triangles). (B) Common features of the developing infant gut microbiota. Graph shows relative abundance (y axis) of the most common bacterial families over time (x axis), averaged across all 39 children. Shaded regions indicate 95% confidence intervals. See fig. S1 for genus-level abundance plots. (C to E) Stream plots (60) of individual microbial trajectories of three children over time (x axis), where each genus is color-coded by its phylum. Shown are typical trajectories for a vaginally born child (C) (where species of the Bacteroides genus are present from an early age), a child born by cesarean section (D) (in which members of the Bacteroides genus are undetectable in the first 10 months), and an atypical trajectory for a vaginally born child (E) (which appears similar to the cesarean section signature).

Infant gut microbiome development shares common features across individuals

We first determined the change in microbial composition over time for each child (Fig. 1B). The microbial trajectory (namely, the succession of bacterial populations in the gut communities) showed multiple similarities across all infants, several of which have been observed in previous studies (21, 23). For example, at the family level, nearly all infants (87%) had significant levels of Enterobacteriaceae (average relative abundance, 25%), Bifidobacteriaceae (average, 15%), and Clostridiaceae (average, 8%) at age 2 months, with each family steadily decreasing to an average relative abundance of 1, 3, and 2.5%, respectively, by the age of 18 months. In contrast, members of the Lachnospiraceae and Ruminococcaceae families were barely present at 2 months (average relative abundance of 4 and 0%, respectively) and increased in the first 12 months before stabilizing at 12 to 20% of the microbial community (Fig. 1B). In family-level measurements, a single genus often accounted for most of their abundance (fig. S1). For example, members of the Bifidobacterium and Clostridium genera were present in most children in the first few months (average relative abundance of 16 and 7%, respectively) and decreased over time to nearly 0 by 18 months of age (fig. S1). The opposite pattern was observed for two notable beneficial commensal bacteria (34, 35). Akkermansia muciniphila (a mucin-degrading bacterium) and Faecalibacterium prausnitzii were absent at age 2 months but appeared in most children by months 24 and 12, stabilizing at 2 and 3% abundance, respectively (fig. S1).

We also found several clear differences from previous studies, particularly regarding the abundance of species from the Bacteroides and Bifidobacterium genera. Cross-sectional comparisons of breast-fed and formula-fed children have reported a positive association between the total abundance of various Bifidobacterium species and the length of breast-feeding (29). All of the children in our study were breast-fed for some period of time. On average, the breast-feeding period correlated with higher abundance of Bifidobacterium species (fig. S2); however, this longitudinal cohort of Finnish children did include some infants who had low abundance of these species even during the breast-feeding period (fig. S2). In addition, previous studies by Koenig et al. (36) [reviewed in (37)] have reported that the Bacteroides genus is absent before the introduction of solid food. However, many individuals in our cohort showed a significant Bacteroides species presence at the earliest time points, before the introduction of solid food (fig. S3). Specifically, the median relative abundance of the Bacteroides genus before introduction of solid food was 47%.

Finally, we examined the degree of consistency in the developing gut microbiome by querying whether the gut microbiome of a child at a given age is more similar to itself at an earlier time point or to unrelated children at the same age. For this purpose, we compared the microbial composition of all sample pairs from the same child by calculating the Jaccard index (see Materials and Methods). Simply stated, the Jaccard index measures the fraction of shared OTUs between two samples, such that higher values indicate greater similarity. We found that the infant gut microbiota developed continuously and the bacterial composition of a child at any given age was more similar to adjacent samples from the same child versus age-matched samples from unrelated children (fig. S4). However, adjacent samples of the same child in the first 6 months of age were not as similar as adjacent samples in older ages, indicating that the community developed most rapidly in the first 6 months of life (fig. S4).

A distinct microbial signature of low Bacteroides in the first 6 months is found in all cesarean section–born children and a subset of vaginally born children

Despite the commonalties among all infants, we observed one particularly marked difference in a significant portion of the cohort. Previous studies have reported that children born by cesarean section initially lack members of the Bacteroides genus in their gut microbiome, in contrast to vaginally born children (30, 38, 39). We did indeed observe that members of the Bacteroides genus were undetectable in the guts of children born by cesarean section (4 of 39) and that this genus was detected for the first time only between 6 and 18 months of age (dark blue, Fig. 1, C to E, and fig. S5A). However, we unexpectedly found that a substantial proportion of vaginally born children (7 of 35) also exhibited this “low-Bacteroides” signature (Fig. 1E and fig. S5A), an observation that has not been previously reported. The low-Bacteroides signature was not associated with any clinical variables surrounding the delivery of the child, including antibiotic treatment of the mothers, gestational age, duration of delivery, use of enemas before delivery, or time spent at the hospital. We denote the group of 11 children (4 cesarean section–born and 7 vaginally born with a cesarean section signature) as the low-Bacteroides group (see Materials and Methods for exact criteria).

Previous studies have also reported a decreased abundance of the Bifidobacterium genus in the gut microbiomes of children born by cesarean section (29, 39); however, we observed the opposite pattern. On average, children from the low-Bacteroides group had a higher abundance of Bifidobacterium species in the first 6 months of life compared to all other individuals (fig. S5B). It is possible that the Bifidobacterium species occupy the niche left open by the lack of Bacteroides. Functionally, Bacteroides are major contributors to the breakdown of human milk oligosaccharides (HMOs) (40). Using our metagenomic data, we quantified the relative contribution of each species to this function, revealing that Bifidobacterium were the main contributors to HMO breakdown in the low-Bacteroides group, whereas the Bacteroides species were the dominant contributors for all other children (fig. S5C).

Antibiotic-treated children have a less diverse gut microbiota

We next sought to investigate the effects of multiple antibiotic treatments in the first 3 years of life. Specifically, we set out to measure the diversity and richness of the microbiota, first at the level of species and then at the level of strains. We measured richness (α-diversity) by calculating the Chao1 metric (see Materials and Methods) for each of the 1069 samples. On average, the Abx children had a richer microbial community compared to the Abx+ children; however, the difference was relatively modest and was evident only after the first year of life (fig. S6A). Additionally, the low-Bacteroides signature was associated with a further decrease in α-diversity, which was evident by 6 months of age (fig. S6). Previous studies have suggested that children born by cesarean section have less diverse gut microbiomes (30, 38); however, we found decreased α-diversity in low-Bacteroides children irrespective of mode of delivery.

One virtue of having multiple longitudinal samples from each individual is the ability to infer the strain composition of the abundant species. Using the 240 metagenomes that we obtained at both regular intervals and surrounding antibiotic treatments, we evaluated diversity at the strain level using ConStrains (41). Briefly, this approach recruits metagenomic reads on a set of unique genes per species (42) and infers the strain composition and their relative abundance from detected single-nucleotide polymorphism sites (SNPs) (commonly 1 to 10 strains per species). At the strain level, the difference in diversity of the Abx+ children’s gut microbiomes became more evident. Within each species, we measured the strain diversity by calculating the diversity index using the probability of observing each strain within the species (see Materials and Methods) and defined a species as dominated by a single strain if this score was lower than 0.1. The microbiota of the Abx+ children had significantly more species dominated by a single strain compared to the Abx children (P = 1.38 × 10−9; Fig. 2, A to C).

Fig. 2. Diversity and strain similarities of the infant gut microbiota.

(A) Diversity index of the strain distribution within each species shown for all metagenomic samples. Samples are colored according to three groups: children who received antibiotics (red), children with low Bacteroides (blue), and children who received no antibiotics (green). (B and C) Strain distributions of two selected samples (y axis is relative abundance of each strain) and their calculated diversity indices. The selected samples were E032966, month 24 (B), and E006091, month 23 (C). (D) Partial phylogenetic trees based on the mutation distance between all strains of Bacteroides fragilis (left) and Bacteroides vulgatus (right). Strains are colored by the child in whom they were detected, and colored nodes represent the most recent common ancestor (MRCA) of the strains found in that child (scale bars of 1000 mutations are shown per tree; full trees appear in fig. S8). (E) Distributions of the mutation distance for all pairwise comparisons of B. fragilis (left) and B. vulgatus (right) strains, within (blue) or across (gray) individuals. (F) Distributions of mutation distances within individuals, colored as Antibiotics (green) or Antibiotics+ (red), with P values for the separation of these two distributions (Kolmogorov-Smirnov test) for B. fragilis (left) and B. vulgatus (right) strains.

Bacterial strains evolve within the gut (43) and are also introduced from the external environment (for example, diet and household animals) (36, 37, 44). To attempt to determine the origin of the dominant strains in the Abx+ children, we investigated the differences among strains in each individual. For each species, we measured the difference between all of its strains both within and across all children by calculating the number of common SNPs between pairs of strains. Using this mutation-distance matrix, we constructed a phylogenetic tree of all the strains of each species, where the mutation distances are represented by the branch lengths (Fig. 2D), and compared the distance of strains within and across all children (Fig. 2E). To quantify the difference among the various strains in a species within a given individual, we calculated the median branch length of the strains to their most recent common ancestor (MRCA; colored nodes, Fig. 2D). To quantify the difference between strains of unrelated individuals, we calculated the median branch length from the MRCA to the MRCA of all other individuals (Materials and Methods and fig. S7). Using these phylogenetic strain metrics, we observed two patterns in strain similarity measures. For some species, the strains within an individual were much more closely related to one another than they were to strains in other individuals. This pattern is consistent with the introduction of a single species, followed by evolution within the individual. Note that the single-colonization pattern may reflect multiple exposures but only a single colonization event. In these cases, children were likely exposed to various strains of this species, but only a single strain robustly colonized their gut. Species with this pattern (referred to as “single-colonization species”) included Escherichia coli, F. prausnitzii, B. fragilis, and Haemophilus parainfluenzae [Fig. 2, D and E (left), and figs. S7 and S8]. In contrast, for other species such as B. vulgatus, some strains within an individual were more closely related to strains in other individuals (Fig. 2, D and E, right). This pattern is referred to as “multiple-colonization species” and is consistent with the introduction and colonization of multiple distinct strains into an individual, potentially over time.

We hypothesized that antibiotic treatments would affect the diversity of strains in the gut. We therefore expected that repeated antibiotic treatments would affect the strain similarity of single-colonization species to a greater degree than multiple-colonization species. Focusing on the examples above, we examined the diversity of B. fragilis as a representative single-colonization species and B. vulgatus as a representative multiple-colonization species. For these analyses, we compared and stratified the results within an individual according to antibiotic treatment history. Consistent with our hypothesis, we found that B. fragilis strains were less similar within Abx+ children compared to the strains within Abx children (Fig. 2F, left); correspondingly, in the case of B. vulgatus, in which strains within individuals and strains in unrelated children were nearly inseparable, we could not differentiate the Abx children from the Abx+ children (Fig. 2F, right).

Finally, the sampling and sequencing capabilities of our study offered a unique opportunity to examine the differential abundance of various taxa between Abx and Abx+ children, at both the species and strain levels. Recent population-wide studies have identified members of the Clostridium clusters 4 and 14a as inducers of T regulatory immune cells (45), and we focused on this group of species as an illustrative example. We found that at the age of 3, the total abundance of species from these clusters differed significantly between Abx and Abx+ children (P = 0.037; fig. S8). This difference was largely due to Eubacterium rectale, one of the most abundant members of this group (P = 0.031, fig. S8). Furthermore, when we examined the E. rectale strain-level differences within each individual, we found that strains within Abx+ children were as different as strains from unrelated individuals (fig. S8). One potential explanation for these observations is that E. rectale is a single-colonization species for Abx children; however, E. rectale was likely introduced multiple times but did not persist in at least some Abx+ children.

Antibiotic-treated children have less stable communities

The abovementioned analyses considered the number of different taxa (species or strains) identified at each time point. We next examined how the microbial community changes over time (that is, the stability of the community). Taking advantage of the longitudinal nature of our study, we compared the microbial composition at all pairs of time points for each child by calculating the Jaccard index, in which higher values indicate similar microbial composition (Fig. 3, A to C, and fig. S9). We omitted the low-Bacteroides children from our analysis because the eventual appearance of Bacteroides greatly affected the Jaccard index (see *, Fig. 3B).

Fig. 3. Stability of the infant gut microbiota.

(A to C) Individual plots depicting the stability of the gut microbiome over time for three children (see fig. S9 for plots for all children). All sample pairs from the same child were compared using the Jaccard index and plotted as a function of their age. Child identifiers are colored as Abx (green) (A), low Bacteroides (blue) (B), or Abx+ (red) (C). (D) Median of stability was calculated for all consecutive samples and plotted for Abx+ and Abx children. Box boundaries are the 25th and 75th percentiles, and the median is highlighted. P = 3.3 × 10−5, Kolmogorov-Smirnov test. Inset shows the estimated variance of measurements for each group (mean ± SE). The median values for subjects presented in (A) and (C) are shown as green and red stars, respectively.

We observed that the Jaccard index of consecutive samples of Abx individuals was consistently high, indicating the overall stability of the community, with many OTUs shared between these time points (points on the diagonal in Fig. 3A). In contrast, we observed decreased short-term stability around the time of the antibiotic treatment (red lines in Fig. 3C). We defined the overall stability index of each child as the median of the Jaccard indices for consecutive time points. Using this stability index, we found that, in contrast to the Abx children, the Abx+ children had significantly less stable microbial communities (P = 3.3 × 10−5; Fig. 3D, fig. S9, and Materials and Methods). Furthermore, the variation in the stability indices across all Abx+ children was much larger compared to the Abx children, indicating variability in the stability response per child (Fig. 3D, inset).

Antibiotic resistance genes are detected after treatment

Next, we explored whether the use of antibiotics promotes the expansion of antibiotic resistance genes in the gut of these children, perhaps due to increased growth of bacteria harboring antibiotic resistance genes or increased mobilization of antibiotic resistance genes by other mobile elements (46). We determined the presence and abundance of antibiotic resistance genes by mapping the WGS sequence reads from all samples to a database of known antibiotic resistance genes. We focused on genes that are not normally present in the core genome of a species and whose presence, either chromosomally or episomally (either on a plasmid or on a transposon), confers resistance to a specific type of antibiotics (47).

For chromosomally encoded genes, we found a number of cases in which the abundance of the antibiotic resistance gene rose extremely rapidly during antibiotic treatment and then decreased swiftly after antibiotic withdrawal (Fig. 4A). In each case, we also identified a bacterial species that showed a strongly correlated change in abundance, which was likely to harbor the resistance gene (Fig. 4B). A notable example followed penicillin treatment in subject E021822 at 5 months (Fig. 4A, left), where we detected an increase in a Klebsiella pneumoniae β-lactamase resistance gene. In this example, the sharp increase of the β-lactamase gene correlated perfectly with an increase in the relative abundance of K. pneumoniae (from 0 to 22%; Fig. 4B, left); both the gene and species decreased to nearly 0 abundance in the sample from the following month. Another example is the increase of a tolC antibiotic efflux gene (from 0 to 100 rpkm) in subject E021940 at age 6 months (Fig. 4A, middle) that is tightly correlated with an increase of E. coli, which is annotated as the host of this gene (from 4 to 24%; Fig. 4B, middle). A third example (subject E020924 at 8 months) involves the tet32 resistance gene, which is annotated as belonging to an unclassified Clostridiaceae bacterium (K10) (47). We observed that the relative abundance of Ruminococcus gnavus (a Clostridiaceae species) increased to a remarkable 97% at that time point, consistent with the tet32 gene being associated with this species (Fig. 4B, right).

Fig. 4. Antibiotic resistance gene profiles.

(A) Abundance of antibiotic resistance gene products [mostly chromosomally encoded, reads per kilobase of transcript per million mapped reads (rpkm) values] in three children over time, together with the timing of individual antibiotic courses (colored dots; see fig. S10 for plots for all children). (B) Relative abundance of species that most correlated with the resistance gene profiles of (A). (C) Examples as in (A) of antibiotic resistance genes that are found on mobile elements and that are present in the gut for longer time periods. In (A) and (C), the number and order of antibiotic courses are shown, with each antibiotic class indicated by color.

For some antibiotic resistance genes that are encoded episomally, we observed a somewhat different pattern. Specifically, their abundance increased upon antibiotic treatment but did not decrease sharply upon withdrawal of the antibiotic. Instead, the presence of the antibiotic resistance genes continued for much longer periods of time (Fig. 4C). This might be explained by the fact that episomally encoded elements can be broadly distributed across multiple species, whereas chromosomally encoded elements are nonmobile and could impose a significant fitness cost on the single host species.

Some children (11 of 39) harbored antibiotic resistance genes as early as 2 months of age, before any antibiotic treatments (yellow plots, fig. S10). The reason for the increased abundance of these genes is unclear. Possible explanations could include (i) response to antibiotics passed on to the child from the environment (for example, mothers’ breast milk), (ii) a response to a natural antibiotic molecule produced by other gut bacteria, or (iii) preexisting presence of the antibiotic resistance gene in the genomes of some early inhabitants of the gut (48). The last explanation would be consistent with the spread of antibiotic resistance through natural populations due to widespread use of antibiotics. Additional studies will be needed to determine the explanation for the early presence of antibiotic resistance genes.


We followed the development of the gut microbiome in 39 infants by densely sampling stool over a 3-year period and investigating correlations with factors such as mode of birth (vaginal versus cesarean), infant nutrition (breast milk versus formula), and exposure to antibiotics (33). The dense sampling per child, together with deep metagenomic sequencing that facilitated the recognition of distinct strains within species, allowed us to characterize the developing infant gut microbiome and the effect of repeated antibiotic exposure at unprecedented resolution.

Infants in the first 6 months of life showed one of two microbial signatures based on the abundance of Bacteroides species. There have been contradictory reports in the literature regarding the typical microbial profile in infants. Multiple studies have shown that children born by cesarean section have a low-Bacteroides signature, whereas children born vaginally have higher levels of Bacteroides (30, 38, 39). Furthermore, one study (36) and a review (37) have described the low-Bacteroides signature as the typical profile for all children. Our study demonstrates that the low-Bacteroides signature is not confined solely to children born by cesarean section. Whereas the low-Bacteroides signature was seen in all infants in our cohort born by cesarean section, it was also observed in 7 of 35 (20%) children delivered vaginally. The differences cannot be explained by technical variation because the signal is extremely strong and long lasting: the abundance of Bacteroides species is exceptionally high (average 45%), and we observe it across multiple time points per child. We searched extensively for clinical variables that might explain the low-Bacteroides signature in vaginally born children (including exposure of mothers, length of delivery, and mother’s use of enema before delivery), but with only seven children, we were unable to identify any variables that were statistically significantly correlated with the difference. Larger sample sizes may reveal the explanatory variable. Previous studies have shown that cesarean section delivery is associated with decreased microbial diversity, even at 2 years of age (38). Our results critically expand this finding by demonstrating that the decrease in microbial diversity is seen across children with low-Bacteroides signature regardless of their mode of birth. The decreased microbial diversity persisted, even at 36 months, when Bacteroides had become highly abundant in low-Bacteroides children. The low-Bacteroides signature merits further study to identify its cause and to understand the implications of the associated decrease in microbial diversity.

Using our extensive metagenomic data, we identified organisms at the species level and often at the strain level, offering us the opportunity to study community diversity at a much higher resolution. The use of antibiotics was associated with a decreased microbial diversity at 3 years of age, especially at the strain level. It has been suggested that decreased community diversity can limit the education of the immune system, including its ability to recognize commensal bacteria, and can result in less robust microbial communities [reviewed in (49)].

Our results also shed light on the resilience of the gut microbiome, which relates to the ability of an ecosystem to return to equilibrium after perturbations (50). In ecology, repeated perturbations to an ecosystem are recognized as especially harmful when they recur before the ecosystem has had time to recover from an initial insult (51). In adults, the gut microbiome has been reported to recover from a single antibiotic exposure within about 2 weeks, but multiple treatments can cause this time frame to expand substantially (26, 50). We found that antibiotic-treated children had less stable gut microbiomes compared to untreated children. Although the gut microbiome largely appeared to return to baseline within 1 month on the basis of our taxonomic classification, we cannot exclude longer-term effects that are not captured by our measures. Subtherapeutic antibiotic doses lead to increased body mass in both farm animals (52) and laboratory mice (53). However, despite the repeated antibiotic treatment in children examined here, we did not observe any increase in body weight in the Abx+ children (the median body mass index at 36 months for the Abx+ and Abx children was 16.2 and 16.0 kg/m2, respectively; P = 0.69). More detailed analysis of both gut microbiome and host will be required to understand long-term effects of antibiotic exposure.

Antibiotic treatment corresponded with short-term microbial community instability; however, we did not observe reproducible bacterial composition changes after antibiotic treatments, even when considering each antibiotic class separately. Because responses to antibiotics may occur over just a few weeks, it is possible that denser sampling would reveal more common short-term responses to antibiotic treatment. Alternatively, it is possible that the variation in responses reflects unique factors in each individual, based on the previous microbial state, host genetics, and/or overall disease state. Another study published in this issue (54) compared a similar number of children with and without antibiotic treatments, using a population born in the United States that had a higher rate of cesarean deliveries than our cohort. The authors identified a less mature gut microbiome and decreased stability in the intestine of antibiotic-treated children; however, mode of delivery had a stronger effect than repeated antibiotic treatments (54).

One important consequence of antibiotic usage is the spread of antibiotic resistance genes, which is a serious public health issue. After antibiotic exposures, we observed two alternative patterns for antibiotic resistance gene abundance depending on their genomic context: (i) genes carried on the microbial chromosome tended to show a strong peak in abundance after treatment, followed by a sharp decline, and (ii) genes on mobile elements tended to persist much longer after antibiotic withdrawal. Where these genes originated remains unclear; we detected the presence of antibiotic resistance genes in 2-month-old infants before any antibiotic exposure. Previous studies have also reported that some newborn infants harbor antibiotic resistance genes (55, 56), potentially acquired from their mothers (55). Moreover, antibiotic resistance genes have been found in the guts of Amerindian adults (48) as well as a South American mummy (57), even in the absence of any known exposure to commercial antibiotics. Regardless of their origin, our data suggest that the genomic context of the antibiotic resistance gene may influence its spread and persistence upon exposure to antibiotics. Given the growing prevalence of antibiotic treatments and the frequency at which children exchange microbiota, it is particularly important to understand the effects of childhood antibiotic exposure on the presence and spread of antibiotic resistance genes in the community.

The pediatric gut microbiome field is still new, with relatively few large studies that include dense longitudinal sampling and multiple clinical covariates. Our work contributes unexpected features to the field, including the highly individualized microbial profiles of infants and the variable responses of those microbial communities to antibiotics. The study design did limit some analyses, including evaluation of the individual effects of each antibiotic course, because the high frequency of treatments did not allow for an independent analysis of each course. In addition, collecting samples before 2 months of age would have provided a fuller picture of early microbiome development. Follow-up studies examining events closer to delivery will be critical for understanding the origin of these individualized microbial profiles as well as for determining potential sources of antibiotic resistance genes. Fortunately, improved methods for sequencing and analysis make it increasingly feasible to elucidate the natural history of the infant gut microbial community, including the long-term effects of important perturbations such as antibiotics and cesarean delivery.


Study design

The primary objective of this study was to use dense longitudinal sampling to monitor changes in the infant gut microbial community over time. Specific variables were examined including route of delivery and use of antibiotics. Newborn infants were recruited to the birth cohort as part of the international DIABIMMUNE study ( Recruitment took place in Finland, between September 2008 and May 2010. Inclusion criteria for this study included receiving either none or at least 9 antibiotic courses in the first 3 years of life. The participating children were monitored prospectively for infections, use of antibiotics, and other life events. Data on breast-feeding and introduction of complementary foods were registered in a study booklet and interview at each study visit (3, 6, 12, 18, 24, and 36 months). Subject metadata is available in table S1. The DIABIMMUNE study was conducted according to the guidelines laid down in the Declaration of Helsinki, and all procedures involving human subjects were approved by the local ethical committee. The parents gave their written informed voluntary consent before sample collection. Per-subject analyses were performed blinded to the antibiotic treatment status of each subject. All outliers were included in data analysis.

Stool sample collection and DNA extraction

Stool samples were collected by the participants’ parents and stored in the household freezer (−20°C) until the next scheduled visit to the local study center; samples were then shipped on dry ice to the DIABIMMUNE Core Laboratory. Samples were stored at −80°C until shipping to the Broad Institute for DNA extraction. DNA extractions from stool were carried out using the QIAamp DNA Stool Mini Kit (Qiagen).

Sequencing and analysis of the 16S rRNA gene and WGS sequencing

16S rRNA gene sequencing was performed essentially as previously described (3). Taxonomy was assigned using version 1.8.0 of Qiime (58) and the Greengenes reference database of OTUs (59). A mean sequence depth of 48,131 per sample was obtained, and samples with less than 3000 sequences were excluded from analysis.

We selected 240 samples for WGS sequencing (also referred to as metagenomic sequencing) on the basis of two criteria: (i) samples from all children at ages 2, 12, 24, and 36 months, and (ii) at least four samples before and after selected antibiotic treatments (with minimal additional treatments in this time period).

Metagenome library construction

Metagenomic DNA samples were quantified by Quant-iT PicoGreen dsDNA Assay (Life Technologies) and normalized to a concentration of 50 pg/μl. Illumina sequencing libraries were prepared from 100 to 250 pg of DNA using the Nextera XT DNA Library Preparation Kit (Illumina) according to the manufacturer’s recommended protocol, with reaction volumes scaled accordingly. Batches of 24, 48, or 96 libraries were pooled by transferring equal volumes of each library using a Labcyte Echo 550 liquid handler. Insert sizes and concentrations for each pooled library were determined using an Agilent Bioanalyzer DNA 1000 kit (Agilent Technologies).

Classification of the low-Bacteroides children based on 16S rRNA gene sequencing data

The median relative abundance of the Bacteroides genus was calculated per child for samples collected in the first 6 months of age. Children with median abundance of less than 0.1% were classified as low-Bacteroides children.

Density plots

Multiple density plots appear throughout the paper. Similar to a histogram, the density plots shown here display the distribution of the observed data; unlike a histogram, the density function reflects the estimated underlying continuous probability from which the observed data have been sampled (that is, the area under the curve sums to 1).

Inferring strain profiles using ConStrains

WGS samples were additionally analyzed using ConStrains ( (41), which conducts within-species strain haplotyping by clustering SNP patterns detected from mapping reads to species pangenomes across samples. Using the longitudinal nature of our cohort, we analyzed the strain composition for all samples of the same child simultaneously, allowing us to follow within-subject strain profile evolution. ConStrains requires 10-fold coverage of a given pangenome to reliably identify strains within its corresponding species.

We used the strain-level abundance profiles to compute species-specific haplotype diversity scores for each sample, defined as H = 1 − ∑pi2, where pi denotes the abundance of strain i. This measure was motivated by the concept of heterozygosity in population genetics and is bounded by [0,1]. A higher value is consistent with a nearly uniform distribution of all strains within the species, and values closer to 0 indicate a single dominant strain.

Measuring antibiotic resistance genes

To detect and quantify the abundance of the antibiotic resistance genes in our WGS data, we used ShortBRED ( Briefly, given a set of antibiotic resistance protein sequences, ShortBRED clusters them into similar families based on their sequence, extracts a set of distinctive strings (“markers”) per family, and then searches for these markers in metagenomic data. We did not take into consideration genes that are normally present in the core genome of the species and in which point mutations can give rise to antibiotic resistance, as we need very high read coverage to identify these mutations. Instead, we focused on genes whose presence is sufficient to confer resistance. Specifically, we used the sequences of 3060 proteins from The Comprehensive Antibiotic Resistance Database (47) (

Statistical analysis

Statistical tests were performed as described in figure legends or in the detailed analytical methods described above.


Materials and Methods

Fig. S1. Average relative abundance of dominant genera in all 39 children.

Fig. S2. Bifidobacterium abundance together with early feeding data.

Fig. S3. Individual profiles of Bacteroides abundance together with solid food consumption.

Fig. S4. Consistency of the infant gut microbiome.

Fig. S5. Microbial trajectories for all children in the study.

Fig. S6. Richness of the infant gut microbiome.

Fig. S7. Strain similarity patterns of abundant species.

Fig. S8. Strain diversity.

Fig. S9. Stability of the infant gut microbiome.

Fig. S10. Abundance profiles for antibiotic resistance genes.

Table S1. Clinical variables used in this study including birth mode, early feeding history, and antibiotic treatments.

List of investigators in the DIABIMMUNE Study Group.


  1. Acknowledgments: We thank T. Poon and S. Steelman (Broad Institute) for help in sequence production and sample management; K. Koski and M. Koski (University of Helsinki) for the coordination and database work in the DIABIMMUNE study; C. Luo (Broad Institute) for help in the strain analysis; L. David (Duke University) for scripts; R. Bhattacharyya, J. Livny, and S. Schwarts (Broad Institute) for helpful discussions; and N. Nedelsky (Massachusetts General Hospital) for editorial help and figure generation. Funding: M.Y. was supported by the Klarman Family Foundation. E.S.L. was supported by National Human Genome Research Institute grant 2U54HG003067-10. The DIABIMMUNE study (M.K.) was supported by the European Union Seventh Framework Programme FP7/2007-2013 under grant agreement no. 202063 and the Academy of Finland Centre of Excellence in Molecular Systems Immunology and Physiology Research grant decision number 250114, 2012–2017. R.J.X. was supported by funding from JDRF; NIH grants U54 DK102557, R01 DK092405, and P30 DK043351; the Leona M. and Harry B. Helmsley Charitable Trust; and the Crohn’s and Colitis Foundation of America. Author contributions: M.Y. performed all data analysis. M.Y., H.V., E.S.L., and R.J.X. assembled and wrote the paper. H.S., A.-M.H., T.H., S.J.R., and M.K. designed the cohort study and collected clinical samples. M.Y. and D.G. designed DNA sequencing experiments and sample management pipelines. M.Y., T.V., E.A.F., C.H., H.V., D.G., and E.S.L. led method and research development. E.S.L., M.K., and R.J.X. served as principal investigators. Competing interests: M.K. is a co-founder and board member of Vactech Oy. The other authors declare that they have no competing interests. Data and materials availability: The National Center for Biotechnology Information BioProject ID for these data is PRJNA290381.
View Abstract

Stay Connected to Science Translational Medicine

Navigate This Article