Absence of sperm RNA elements correlates with idiopathic male infertility


Science Translational Medicine  08 Jul 2015:
Vol. 7, Issue 295, pp. 295re6
DOI: 10.1126/scitranslmed.aab1287


Semen parameters are typically used to diagnose male infertility and specify clinical interventions. In idiopathic infertile couples, an unknown male factor could be the cause of infertility even when the semen parameters are normal. Next-generation sequencing of spermatozoal RNAs can provide an objective measure of the paternal contribution and may help guide the care of these couples. We assessed spermatozoal RNAs from 96 couples presenting with idiopathic infertility and identified the final reproductive outcome and sperm RNA elements (SREs) reflective of fecundity status. The absence of required SREs reduced the probability of achieving live birth by timed intercourse or intrauterine insemination from 73 to 27%. However, the absence of these same SREs does not appear to be critical when using assisted reproductive technologies such as in vitro fertilization with or without intracytoplasmic sperm injection. About 30% of the idiopathic infertile couples presented an incomplete set of required SREs, suggesting a male component as the cause of their infertility. Conversely, analysis of couples that failed to achieve a live birth despite presenting with a complete set of SREs suggested that a female factor may have been involved, and this was confirmed by their diagnosis. The data in this study suggest that SRE analysis has the potential to predict the individual success rate of different fertility treatments and reduce the time to achieve live birth.


About 13% of the general reproductive age population have fertility problems (1). The American Society for Reproductive Medicine estimates that male and female factors contribute about equally to this condition, with about one-quarter likely a combination of factors from both partners. After 12 months of unprotected intercourse without pregnancy, affected couples typically begin to seek care and explore the possibility of fertility treatments (2).

More than 1% of the children born in the United States today are conceived using assisted reproductive technologies (ARTs) (3). Typically, to establish the appropriate clinical treatment and minimize the risk of failure, an extensive evaluation of the female, and to a lesser extent the male, is undertaken. If no severe male or female factors are detected, fertility treatments such as timed intercourse (TIC) or intrauterine insemination (IUI) are recommended in combination with ovarian stimulation. After three or four unsuccessful IUI cycles or if a severe male or female factor is detected, in vitro fertilization (IVF) with or without intracytoplasmic sperm injection (ICSI) is typically suggested.

Initial male factor assessment includes a review of reproductive history (time of subfertility, existence of previous pregnancies, and sexual function), family history (consanguinity and infertility history), relevant diseases (diabetes and mumps among others), and exposure to factors that negatively affect fertility (drugs, life-style, and occupation) along with a comprehensive physical examination. The male contribution is further evaluated by semen analysis, with intra-individual variation gauged through the results of two semen analyses separated by a period of up to 1 month (2). Assessment primarily relies on a defined series of semen parameters that include volume, sperm concentration, sperm motility, and sperm morphology. Other specific measures that may complement the workup include DNA fragmentation, the presence of antisperm antibodies, endocrine status, and genetic and cytogenetic markers such as AZFa or AZFb Y chromosome microdeletions associated with azoospermia. Although the evaluation of general semen parameters like sperm count, motility, and morphology may be useful in the diagnosis of obvious cases of male infertility where specific etiologic factors may be apparent, no single parameter or set of semen parameters is highly predictive of male fertility status within the general population (4). Current clinical practice focuses on whether there are sufficient spermatozoa with satisfactory motility and morphology to reach and likely fertilize the oocyte. The utility of these parameters in selecting the least invasive fertility treatment for idiopathic infertile couples appears limited (5).

Spermatozoa are not just a vehicle that delivers the male genomic contribution to the oocyte. Upon fertilization, the spermatozoon provides a complete, highly structured, and epigenetically marked genome that, together with a defined complement of RNAs and proteins, plays a distinct role in early embryonic development (6, 7). Although several studies have explored the effect of genetic variants such as single-nucleotide polymorphisms (SNPs) (8), copy number variants (9), differential genome packaging (10), differential methylation (11), proteomic changes (12), and differential sperm RNAs (13, 14) in male infertility, comparatively few have examined their effect within the context of the reproductive clinic (15–19).

Characterization of the RNAs retained in sperm by next-generation sequencing (NGS) has recently been reported (20–22). In contrast with earlier array-based approaches, RNA-seq has revealed a rich and complex population of unique coding and noncoding transcripts such as sperm-specific isoforms, intronic retained and otherwise unannotated elements, and long and small noncoding RNAs (20–22). The large number of unique sperm transcripts is suggestive of regulatory roles (20, 22) influencing fertilization, early embryogenesis, and the phenotype of the offspring (20, 23). The application of spermatozoal microarray-based approaches to predict the outcome of different fertility treatments has met with varying degrees of success (17, 18). The intricacies of spermatozoal RNAs as revealed by NGS analysis (22) suggest that this technology is much better suited to the task. The objective of this initial study was to evaluate the diagnostic potential of NGS as a prognostic assay of spermatozoal RNAs that can predict the birth outcome after different fertility treatments.


Identifying sperm RNA elements required for natural conception

The ability of spermatozoal RNAs to predict a live birth (LB) outcome for various fertility treatments was assessed within the context of the idiopathic infertile couple to ascertain whether the underlying cause could be attributed to a male factor. As summarized in Table 1A, we observed no significant differences between the choice of treatment modality as a function of the different patient variables including age or any of the semen parameters, consistent with idiopathic infertility [one-tailed analysis of variance (ANOVA) or Kruskal-Wallis test, P > 0.05]. Female age was significantly higher in couples that did not achieve pregnancy (Table 1B; two-tailed t test, P = 0.024), and this could be attributed to unsuccessful IVF/ICSI (Table 1C; two-tailed t test, P = 0.004). We identified a set of sperm RNA elements (SREs) required for LB by natural conception within the positive control group I (LB by TIC during the first spermatogenic cycle and first attempt). Of the 278,605 SREs surveyed, only elements that ranked above the 99th percentile and were essentially at equivalent levels across control group I samples [no outliers outside interquartile range, IQR ≥ 1.5 × (Q3 − Q1)] were defined as SREs required for natural conception. A total of 648 elements met these stringent criteria to be classified as required SREs (above the 99th percentile rank, present at a constant level in the control group). Nine of these 648 SREs corresponded to intergenic regions, 12 corresponded to sperm-specific intronic elements, and 42 were within 24 different noncoding RNAs, all of which are likely regulatory. Most (585) were within exonic regions of 262 different genes, 40% of which were ontologically classified as associated with spermatogenesis, sperm physiology, fertility, and early embryogenesis before implantation (Fig. 1).
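
The two selection criteria described above (rank above the 99th percentile in every control sample, and no outlier across control samples) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the percentile-rank definition, the use of Tukey fences of 1.5 × (Q3 − Q1) for the outlier check, and the toy abundance values are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import quantiles

def percentile_ranks(sample):
    """Percentile rank (0-100) of each element within one sample.
    Assumption: elements with zero abundance receive rank 0."""
    ordered = sorted(sample.values())
    n = len(ordered)
    return {name: 0.0 if value == 0 else 100.0 * bisect_right(ordered, value) / n
            for name, value in sample.items()}

def has_outlier(values):
    """True if any value lies outside the Tukey fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return any(v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr for v in values)

def required_sres(control_samples, cutoff=99.0):
    """Elements ranked above `cutoff` in every control sample with no outlier rank."""
    ranks = [percentile_ranks(sample) for sample in control_samples]
    selected = []
    for name in control_samples[0]:
        per_sample = [r[name] for r in ranks]
        if all(p > cutoff for p in per_sample) and not has_outlier(per_sample):
            selected.append(name)
    return selected

# Hypothetical toy data: 7 control samples, 200 elements with abundance i + 1.
samples = [{f"e{i}": float(i + 1) for i in range(200)} for _ in range(7)]
samples[3]["e198"] = 0.5  # one element drops sharply in a single control sample

required = required_sres(samples)
print(required)
```

In this toy set only `e199` is retained: `e198` falls below the cutoff in one control sample, and every other element ranks at or below the 99th percentile in at least one sample.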

Table 1. Characteristics of the study population.

The table details the distribution and characteristics of the study population in relation to fertility treatment used and procedural outcome [LB (live birth) versus NLB (no live birth)]. Group I subjects achieved an LB pregnancy in their first attempt using TIC during the first spermatogenic cycle after semen assessment (90-day cycle) and were considered as a natural conception. Samples from test set II include different subgroups based on treatment: (i) IUI or TIC delayed past the first 90-day cycle, (ii) ART preceded by unsuccessful IUI or TIC, and (iii) ART. The independent set of samples (set III) was composed of two subgroups: (i) samples from an independent fertility clinic, and (ii) patients who never achieved LB and in whom a female factor was subsequently diagnosed. Sample characteristics include male and female ages and semen parameters comprising total million sperm cells per sample, sperm motility (%), total motile sperm per sample (TMC), sperm morphology [% of normal forms (NFs)], and sperm DNA fragmentation [DNA fragmentation index (DFI)]. (A) The type of fertility treatment used did not correlate with any individual or sperm sample parameter evaluated. (B) When all patients were considered as a group, female age showed a negative correlation with LB (two-tailed t test, P = 0.024). (C) Female age was significantly higher in patients who were unsuccessful when treated by ART (two-tailed t test, P = 0.004) but not in patients who were unsuccessful when treated by TIC/IUI.

Fig. 1. Genomic localization (exon, intron, intergenic, and noncoding RNAs) and function of the 648 required SREs.

Most of the required SREs are located in exons of annotated genes (585 of 648; 90.3%), and the remainder are in intronic regions (12 of 648; 1.9%), intergenic regions (9 of 648; 1.4%), or match to noncoding transcriptional elements including small nuclear RNAs, microRNAs, and long noncoding RNAs (42 of 648; 6.5%) with potential regulatory function. About 40% of the genes that contain one or more SREs have a known role in spermatogenesis, sperm physiology (sperm energy production or acrosome reaction), fertilization, and/or early embryogenesis. Additionally, 20% have a known role in cellular processes such as transcription regulation, protein transport, ubiquitin-like conjugation pathway, and lipid metabolism. The potential function of the remaining transcripts has yet to be defined.

Ability of SREs to predict fertility treatment outcome

To discern whether SREs were indicative of fertility treatment outcome, a set of 56 samples (group II) from couples that underwent TIC (after the first cycle) or immediately proceeded to IUI or ART was evaluated with respect to the abundance of required SREs defined from group I. As shown in Fig. 2A (left and middle panels), all 648 required SREs were present in the control group of 7 patients and in 37 of the 56 samples within group II (indicated by a color gradient from yellow to green representing the 60th to >90th percentile of abundance). As summarized in Fig. 2B, the samples presenting with all SREs have a 72% (16 of 22) rate of success in achieving an LB within the first two treatment cycles (6 months). In comparison, as shown in Fig. 2A (right panel), 19 group II samples have at least one SRE absent as indicated by the zero percentile–ranked red rectangle (fig. S1). Although the proportion of male or female factors underlying idiopathic infertility remains to be established, the proportion of patients who were lacking some of the SREs is similar to the expected rate of idiopathic infertility (24, 25). No correlation between the number of SREs absent and semen parameters and age of partners was observed (fig. S2).
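
The split of test samples into those with the complete set of required SREs and those with at least one SRE absent can be sketched as follows. This is a hedged illustration only: the sample and element identifiers are hypothetical, and treating an element that is missing from a sample's record as absent (zero percentile rank) is an assumption.

```python
def absent_sres(percentile_rank, required):
    """Required SREs with a zero percentile rank in one sample.
    Assumption: an element missing from the record counts as absent."""
    return [sre for sre in required if percentile_rank.get(sre, 0.0) == 0.0]

def split_by_completeness(samples, required):
    """Return (ids with all required SREs, {id: absent SREs} for the rest)."""
    complete, incomplete = [], {}
    for sample_id, ranks in samples.items():
        missing = absent_sres(ranks, required)
        if missing:
            incomplete[sample_id] = missing
        else:
            complete.append(sample_id)
    return complete, incomplete

# Hypothetical percentile ranks for three samples and three required SREs.
required = ["SRE_1", "SRE_2", "SRE_3"]
samples = {
    "sample_A": {"SRE_1": 99.6, "SRE_2": 92.0, "SRE_3": 75.5},
    "sample_B": {"SRE_1": 99.1, "SRE_2": 0.0, "SRE_3": 88.0},
    "sample_C": {"SRE_1": 97.3, "SRE_2": 64.2},
}

complete, incomplete = split_by_completeness(samples, required)
print(complete)
print(incomplete)
```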

Fig. 2. Distribution of the 648 required SREs.

The percentile rank distribution of the SREs in control (group I) and test set [group II: (i) exclusively use TIC or IUI, (ii) unsuccessful IUI followed by ART, and (iii) directly use ART] is presented. The percentile rank of each element is indicated in a gradient color scale. Red indicates the complete absence of an element (zero percentile rank). In contrast, elements that are present range from yellow for the 60th abundance percentile to green that corresponds to >90th percentile of abundance. (A) (Left) Seven TIC individuals who were used to identify the SREs because they were considered to have successful natural conception and presented with all 648 SREs. (Middle) Thirty-seven samples from group II (25 group II-i, 5 group II-ii, and 7 group II-iii) with the complete set of SREs. (Right) Nineteen samples from group II (4 group II-i, 8 group II-ii, and 7 group II-iii) with at least one missing SRE. The first 33 sperm elements were absent in at least one sample from the test set (right). The remaining SREs (34 to 648) were present in all samples from groups I and II, showing that the vast majority of SREs are uniformly abundant in all samples surveyed. (B) Couples with all SREs using TIC/IUI or ART have a success rate of 72% (16 of 22) during the first two treatment cycles (6 months) after sperm RNA evaluation.

Samples with all SREs present have similar high rates of LB for both TIC/IUI and ART, 73% (22 of 30) and 75% (9 of 12), respectively (Fig. 3A). However, the absence of some of the SREs reduced the LB rate by TIC/IUI from 73 to 27% (3 of 11), whereas the LB rate remained similar for ART at 78% (11 of 14; Fig. 3B). As observed in Fig. 3C, patients with all SREs were more likely to achieve LB by TIC/IUI as compared to those with one or more SRE(s) absent (two-tailed Fisher’s exact test, P = 0.012). These significant differences were supported by a power of 0.7 and α error of 0.029. In comparison, we did not observe any difference in the number of absent SREs when we compared LB and NLB groups after ART treatment (two-tailed Mann-Whitney test, P = 0.783). This is consistent with the view that ART may be able to rescue some otherwise impaired sperm functions such as transit to the oocyte and/or fertilization, depending on the functions of the genes corresponding to the missing SREs. Notably, six of the group II couples failed to achieve an LB even by ART. The average female age in this group was significantly higher (P = 0.004, two-tailed t test; Table 1C), suggesting a potential age-related female factor, although only three of these six subjects were over 35 years of age. Among the couples that failed to achieve LB by ART with partners ≤35 years of age, two did not have a complete set of SREs. The missing SREs were within different genes including NDRG1 (stress response), TESK1 (kinase), DBN1 (stabilizes gap junctions), and CAMTA2 (calcium-dependent transcription factor), which are associated with embryogenesis or implantation.
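
The TIC/IUI comparison above can be checked from the reported counts (22 of 30 LB with all SREs present versus 3 of 11 LB with at least one SRE absent). The sketch below is a plain hypergeometric implementation of a two-tailed Fisher's exact test, not the SigmaPlot procedure used in the study; the function name and the small tolerance used when comparing table probabilities are assumptions.

```python
from math import comb

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].
    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k successes in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_observed = prob(a)
    p_value = 0.0
    for k in range(max(0, col1 - row2), min(row1, col1) + 1):
        p_k = prob(k)
        if p_k <= p_observed * (1 + 1e-9):  # tolerance for floating-point ties
            p_value += p_k
    return p_value

# Rows: all SREs present / at least one SRE absent; columns: LB / NLB.
p = fisher_exact_two_tailed(22, 8, 3, 8)
print(f"LB rate, all SREs: {22 / 30:.0%}; SRE absent: {3 / 11:.0%}; P = {p:.3f}")
```

With these counts the computed P value is about 0.012, in agreement with the value reported for Fig. 3C.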

Fig. 3. Analysis of treatment outcome as a function of required SREs.

(A) Most of the 37 group II couples with all SREs present underwent TIC/IUI, achieving an LB rate of 73% (22 of 30). The remaining samples reflecting patient preference along with the previously unsuccessful TIC/IUI cases were treated by ART, achieving a 75% (9 of 12) LB rate. (B) RNA analysis from samples with at least one SRE absent. Note that only 3 of the 11 TIC/IUI samples with an absent SRE achieved LB. The success ratio of LB using ART is similar to the ratio observed in samples with all SREs. (C) The percentage of LB using a noninvasive treatment for couples presenting with the complete set of SREs is higher compared to those with at least one SRE absent (two-tailed Fisher’s exact test, P = 0.012).

Testing SREs in samples from independent clinics and in couples with a known female factor

The results from an independent group of test samples (group III) are summarized in Table 2. Group III-i was composed of five couples from an independent fertility clinic. The presence of all SREs was confirmed, and in all cases, LB was achieved. In two cases, LB was achieved spontaneously or by IUI, whereas the remaining three cases directly used ART. The presence of all SREs in the samples suggests that SREs may provide a measure of male fecundity across clinics. Group III-ii was composed of four samples in which an unknown female factor was suggested. This was supported by the observation that in three of the couples (Table 2, samples 6 to 8), a gestational carrier (GC) yielded successful results after a series of failed fertility treatments. A causative female factor was suggested in sample 9 (Table 2, ii) by the presentation of stage 2 endometriosis. All SREs were present in three (samples 7 to 9) of the four samples in this set (samples 6 to 9). Two SREs were lacking in sample 6, suggesting that they were rescued by ART. The spectrum of test set analyses suggests the utility of sperm RNA as a marker of male fecundity.

Table 2. Required SREs in the test group of samples (group III).

The 648 RNA elements describing fertile sperm were tested in nine samples. (i) All samples that were obtained from an independent fertility clinic and achieved LB presented with all required SREs. LB was achieved spontaneously or by IUI for samples 4 and 5, whereas the remaining three cases (1 to 3) directly used ART. (ii) Samples with known female factor. A single instance (sample 6) shows the absence of two SREs as well as a known female factor. It is possible that with a GC (gestational carrier), the pregnancy was rescued by ART despite the absence of these two SREs.



Except for cases where examination of sperm reveals gross deficiencies in count, motility, or morphology, it is clear that current standard tests have a limited capacity to discern male factor infertility and thereby be predictive of fertility treatment success for couples presenting with idiopathic infertility (26). This observation emphasizes the need to develop alternative strategies such as NGS. In comparison to analysis of somatic cells, RNA-seq data from sperm are characterized by the variability of absolute transcript abundance between samples caused by the physiological fragmentation of spermatozoal RNAs during the extrusion of the cytoplasm as each spermatozoon is formed. However, this did not affect the detection of 648 comparatively small exon-sized SREs that showed no rank variability in 32 of the 35 TIC/IUI cases that achieved LB. It appears that the SREs that were absent in the three LB TIC/IUI cases were not diagnostic and perhaps could be removed from consideration. Their validation in a larger population will clarify which are essential for diagnosis and may contribute to the birth of a healthy child.

Standard clinical practice often uses TIC or IUI as the initial treatment for couples with idiopathic infertility. It is not until after two to four IUI cycles have failed that ART is recommended. The results shown in this study suggest that with all 648 SREs, couples presenting for reproductive counseling have a high success rate (73%) using TIC/IUI compared to those lacking some SREs (27%; two-tailed Fisher’s exact test, P = 0.012). If validated and implemented as part of the clinical assessment, the absence of an SRE may suggest the earlier use of ART that could reduce the time to achieve LB compared to current practice.

About 40% of the SREs that we identified are within exonic regions of genes that are known to be involved in spermatogenesis, sperm motility, fertilization, and the first steps of embryogenesis before implantation. This corresponds to 85 different genes, some of which are altered in known cases of male factor infertility. For example, eight SREs are within transcripts underrepresented in asthenozoospermic patients, such as BRD2 and OAZ3 (13). Their lack would compromise motility and morphology and would not be bypassed by TIC or IUI. This could potentially be remediated with ART, because sperm would no longer have to transit the woman’s upper reproductive tract to reach the oocyte and only those embryos that have shown successful fertilization and initial development would be transferred to the woman. However, some SREs are absent in patients in whom ART was not successful. Their contribution to the mechanism of successful fertilization and early embryogenesis remains to be elucidated. It is possible that these RNAs are critical for implantation and/or embryogenesis, and thus, in these instances, even ART cannot lead to a viable pregnancy, which is perhaps exemplified by the absence of an SRE located within the gene for transcription factor CAMTA2.

RNA-seq data also afford the opportunity to SNP-genotype each population of sperm RNAs that may reflect a series of health modifiers (27). For example, within this initial study, 102 SREs were derived from 35 genes that have been associated with a spectrum of genetic disorders from enolase deficiency to Parkinson disease. This is of note given the global allelic imbalance in the gene expression favoring the paternal expression of these genes associated with complex diseases (28) that may be compounded when the paternal effect of diet and environment on the future health of the child is considered (29). Continued development of this sperm RNA-seq methodology is expected to reveal genomic variants from these data (27) that will better explain the underlying origins of male infertility and may help predict the future health of the child.

The use of spermatozoal RNA NGS identified a set of molecular biomarkers that shows potential to predict the success rate of fertility treatments. In comparison to a smaller microarray-based study where 26 differential RNAs were identified (18), NGS analysis identified 648 SREs, suggesting that RNA-seq technology may more completely resolve variances in RNA profiles for the complex sperm cell. The statistically significant differences observed in the outcomes of noninvasive treatments depending on the presence or absence of the complete set of SREs (two-tailed Fisher’s test, P = 0.012) support the view that sperm RNA analysis has the potential to affect clinical care for idiopathic infertile couples when used to assess the likelihood of successful TIC/IUI before using ART. This may permit an informed choice of a treatment paradigm that would help the female partner avoid undergoing invasive procedures such as egg collection. The results of this 0.7-powered study should encourage a larger, blinded, and controlled prospective analysis of patients using noninvasive treatments to ensure the utility of this prediction method before its introduction into the fertility clinic. With the rapidly decreasing cost of NGS, deep sequencing of sperm RNA has the distinct potential to produce clinical benefit and enhance our understanding of the father’s contribution to the birth and life of a healthy child.


Experimental design

A retrospective study was designed to investigate whether spermatozoal RNAs could predict the outcome of various fertility treatment options used in the care of idiopathic infertile couples. RNA sequencing was used to obtain the spermatozoal RNA profile of patients included in the study. Several SREs required for natural conception were defined and tested using the sperm of idiopathic infertile males of couples that underwent noninvasive or invasive treatments. Sample size was dictated by the availability of patients matching the strict criteria in the fertility clinics, powering the study to 0.7 with an α error of 0.029.

Study subjects

Semen samples were collected after institutional review board (IRB) approval and informed consent from a total of 96 patients from the CReATe Fertility Centre, Toronto, Canada (site 1) and the Vincent Memorial Obstetrics and Gynecology Service, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA (site 2) and then processed and frozen at −80°C. Only couples presenting with infertility unexplained by standard procedures as confirmed by a reproductive endocrinologist and andrologist were recruited to the study. All participants underwent some reproductive treatment (TIC, IUI, IVF, or ICSI) because of their inability to achieve a successful LB by spontaneous conception. Noninvasive treatments (TIC and IUI) were the first fertility treatments used in about 80% of the patients, with a success rate of 65% in the first two cycles. After some unsuccessful IUI cycles (2.6 cycles on average), 26% of patients underwent ART treatments, achieving 85% success in the first cycle. The remaining 20% of patients made the initial personal treatment decision to directly undergo ART, with a success rate of 50% in the first cycle. Couples were excluded from this study if the male partner considerably deviated from the standard semen parameters. This included males presenting with less than 10 million motile sperm, indicating a decreased likelihood of successful IUI outcome (30). In addition, couples with a female partner exhibiting low ovarian reserve validated by anti-Müllerian hormone level <15.7 pM or known history of hormonal disorders, showing evidence of stage 3 or 4 endometriosis, a history of chemotherapy or pelvic irradiation, as well as patients unable or unwilling to consent were excluded from the study. Semen parameters were assessed as described in Supplementary Materials and Methods, and the day of semen evaluation was designated day 0. 
To minimize any potential effect on spermatozoal RNA by an external factor, we defined the control population (unassisted conception) as those couples achieving pregnancy in the first attempt of TIC within the first 90 days after sperm RNA analysis, corresponding to one complete spermatogenic cycle. The deidentified frozen samples were processed and analyzed at Wayne State University under the IRB protocols H-06-67-96 and HIC 095701MP2F.

RNA sequencing

Sperm RNA from 96 samples was isolated, quality-assessed, deep-sequenced, and analyzed as described in Supplementary Materials and Methods. The 72 samples that passed all sequence quality measures were divided into three groups for post hoc statistical analysis. Group I samples were used to determine the required SREs for “natural conception” LB. This positive control population was derived from seven couples that achieved an LB by controlling the optimal fertility window (TIC) during the first cycle monitored for intercourse timing. This was considered equivalent to natural conception. The abundance of all members of this composite group of SREs was assessed within the remaining 65 samples that underwent various fertility treatments. Group II test samples were composed of 55 couples, where 41 were initially treated by IUI or TIC. This included 22 couples that were successfully treated by IUI, 3 couples that were successfully treated by TIC after the first spermatogenic cycle, and 4 couples that discontinued treatment (i), in contrast to 12 couples who underwent ART after unsuccessful IUI cycles (ii). The remaining 14 couples personally decided to undergo ART after semen assessment (iii). Group III test samples were composed of (i) samples from five couples from an independent fertility clinic, all of whom achieved an LB, and (ii) four likely female factor couples, three of whom only achieved an LB with a GC suggestive of a female factor and the fourth presenting with stage 2 endometriosis, a known female factor. The results corresponding to the 72 RNA-seq data sets are available at the Gene Expression Omnibus (GEO) [National Center for Biotechnology Information (NCBI)] repository (GSE65683).

Statistical analysis

Statistical analyses were performed using SigmaPlot version 11 (Systat Software), and statistical tests were considered significant at P < 0.05. According to the normality of the parameter tested (male and female ages, sperm motility, and DFI are normally distributed; total millions of sperm cells, sperm morphology, and number of SREs absent are not normally distributed), a parametric one-way ANOVA test or nonparametric Kruskal-Wallis one-way ANOVA by ranks (α = 0.05) was used to detect differences in the average of different seminal parameters or SREs that were absent between the groups on the basis of the treatment used. According to the normality of the parameter tested, a parametric Student’s t test or nonparametric Mann-Whitney U test (two-tailed, α = 0.05) was used to detect differences in the average of different seminal parameters or SREs absent between the samples that achieved LB or failed (NLB) within each group. Two-tailed Fisher’s exact test (α = 0.05) was used to compare the success rate of different fertility treatments in the presence or absence of the complete set of SREs. G*Power version 3.1 was used to calculate the power of the two-tailed Fisher’s exact test (α = 0.05) (31).


Materials and Methods

Fig. S1. Distribution and junctions of RNA-seq reads of a selected required SRE, GPX4.

Fig. S2. No correlation between the number of absent SREs and sperm parameters or partner age.

References (32–43)


  1. Funding: The work was supported by a Collaborative Translational Research Project grant from EMD Serono to S.A.K. and M.P.D., funds to S.A.K. through Charlotte B. Failing Professorship, and NIH grants ES017285 and ES009718 to R.H. Author contributions: M.P.D., S.I.M., C.L.L., S.S., and R.H. provided and characterized the clinical samples used in the study. M.J. and R.G. isolated the sperm RNA and prepared the deep-sequencing libraries. E.S., M.J., and S.A.K. contributed to the biocomputational analysis of the RNA-seq data and the interpretation of the results. The manuscript was collaboratively written by E.S., M.J., S.A.K., M.P.D., S.I.M., C.L.L., and R.H., and R.G., S.A.K., and M.P.D. directed the data analysis and writing and editing of the manuscript. All authors critically reviewed and approved the final version of the manuscript. Competing interests: S.A.K. received a speaker honorarium, and S.A.K. and M.P.D. received some research funding from EMD Serono. All other authors declare that they have no competing interests. Data and materials availability: RNA-seq data sets are available at the GEO (NCBI) repository (GSE65683).