Technical Comments: INFERTILITY

Response to Comment on “Absence of sperm RNA elements correlates with idiopathic male infertility”


Science Translational Medicine  24 Aug 2016:
Vol. 8, Issue 353, pp. 353tr1
DOI: 10.1126/scitranslmed.aaf4550


RNAs from other cell types have minimal impact on male fecundity–associated sperm RNA elements.

The objective of the study by Jodar et al. (1) was to examine the general sperm transcript profile from different patients with idiopathic infertility presenting with normal semen parameters. Reference values of sperm parameters established by the World Health Organization in 2010 are at least 39 million sperm per ejaculate with 32% motility and 4% normal morphology (2). Accordingly, normal semen samples could present with 96% morphologically abnormal and 68% immotile sperm, indicating the heterogeneity of the sperm population in each individual.

The technical comment from Cappallo-Obermann and Spiess (3) suggests that the samples used in Jodar et al. (1) also contained different somatic cell types. This assertion may reflect their past experience within their patient population, where 30% of the samples have a round cell concentration of >5 × 10^6/ml. In contrast, the clinic from which most samples were obtained reported that only 4% of patients (49 individuals from 1230 unselected nonazoospermic patients) presented a round cell concentration of >1 × 10^6/ml (4). It is important to emphasize that all samples included in the study of Jodar et al. (1) had a round cell count of <1 × 10^6/ml before PureSperm gradient purification, with the exception of a single sample. This is substantially lower than the reference values recommended by the World Health Organization (2). In addition, none of the samples included in the study had a notable number of epithelial cells or other identifiable cells, as evaluated by optical microscopy after PureSperm (table S1).

Approaches to purify spermatozoa from semen such as swim-up (5) and density gradient centrifugation above 50% markedly decrease the recovery of spermatozoa. This is consistent with the physiological selection of the sperm. Moreover, as suggested by the decreased recovery of mitochondrial RNAs, somatic cell lysis buffer treatment likely compromises the midpiece of spermatozoa (6). Because the samples used in the study contained very low numbers of somatic cells, we chose the 50% PureSperm methodology for sperm purification. This minimized sperm selection while maintaining the integrity of spermatozoa. In those cases where round cells were detected by optical microscopy after the initial gradient, the 50% PureSperm gradient was repeated. Their efficient removal was verified by CD45/PTPRC reverse transcription polymerase chain reaction (RT-PCR) (table S1 and fig. S1).

Nevertheless, the technical comment (3) suggested that a high proportion of somatic transcripts reside in spermatozoal RNA sequencing (RNA-seq) data sets and now microarray data when sperm were purified using a 50% density gradient, which yields a sperm and/or sperm-enriched fraction (1, 7). In sperm, several RNA singularities should be considered before analyzing total RNA-seq data. One must consider the background level present in any total sperm RNA-seq data. The absolute abundance of sperm transcripts, including ribosomal and mitochondrial RNA, varies widely between samples. This likely reflects the end result of differential fragmentation and targeted RNA removal that occurs during spermatogenesis. In Fig. 1, note the expanse of the deep blue color that essentially corresponds to 0 values for most of the 278,604 possible RNA coding, noncoding, and sperm-specific RNA elements in the 72 samples used in the study. As shown in Fig. 1A, only a small group of the most abundant RNA elements (>90th percentile rank) exhibits uniform consistency among samples in >60% of the samples studied. In contrast, as emphasized in Fig. 1B, the majority of lower-ranked RNA elements (<90th percentile rank) consistently appear at the equivalent rank in less than 20% of samples studied. This reflects the degree of dispersion among individual sample rankings for elements contributing to the average element rank at that given level. Accordingly, rigorous criteria were developed to select only those consistently high-ranked sperm elements that are likely of critical and functional importance (1).

Fig. 1. Distribution of all possible RNA elements’ percentile rankings.

The 278,604 possible RNA-coding, noncoding, and sperm-specific RNA elements were measured in 72 sperm samples, and their abundance was ranked by percentile. Elements were sorted by average rank as measured in all samples, and the relative percentage of individual samples with specific rank at each average percentile tier was calculated. (A) On the basis of the sort by average rank, the surface plot shows the distribution of individual sample ranks as compared to the average. (B) Higher-resolution heat map with nonlinear scaling and a lower-threshold color scale bar* with higher-ranked sample fractions >0.20 combined. This resolves the relatively wide distribution among individual samples for lower-ranked elements (indicated in green). Both panels show minimal intersample variation for the top-tier rank elements (indicated in yellow) compared to the broad dispersion of lower-ranked elements.
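The ranking behind Fig. 1 can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the tie rule (rank is the percentage of elements with strictly lower abundance), and the 10-point tier width are assumptions chosen only to make the idea concrete.

```python
from bisect import bisect_left
from statistics import mean

def percentile_ranks(abundances):
    """Percentile rank (0-100) of each element within one sample.

    Assumption: rank = percentage of elements with strictly lower abundance,
    so all elements with zero reads share the zeroth percentile.
    """
    ordered = sorted(abundances)
    n = len(ordered)
    return [100.0 * bisect_left(ordered, a) / n for a in abundances]

def rank_consistency(samples, tier_width=10.0):
    """For each element, return (average rank, fraction of samples whose
    rank falls in the same tier as that average rank).

    samples: list of per-sample abundance lists, all in the same element order.
    tier_width: hypothetical bin width used to decide "same rank tier".
    """
    ranks = [percentile_ranks(s) for s in samples]  # one rank list per sample
    result = []
    for per_element in zip(*ranks):  # ranks of one element across all samples
        avg = mean(per_element)
        tier = int(avg // tier_width)
        same = sum(1 for r in per_element if int(r // tier_width) == tier)
        result.append((avg, same / len(per_element)))
    return result
```

In such a sketch, an element that is abundant in every sample keeps a high consistency fraction, whereas an element that is detected only sporadically has an average rank that few individual samples actually share, which is the dispersion the heat map resolves.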

As noted above, the first goal was the identification of sperm RNA elements (SREs) required for natural conception. These RNA elements were expected to be both consistently observed and at high levels in sperm of the seven fertile controls, those couples that achieved a live birth from natural conception during their first attempt at timed intercourse (TIC). Only a relatively small specific subset of elements with concordant high abundance among the fertile controls met these criteria, and these should be considered as candidate sperm benchmarks [see Fig. 1 in the study of Jodar et al. (1)]. It is also important to note that the possibility that transcripts of nondescript origin are passengers in sperm was considered, because we have now begun to appreciate the communication pathways between sperm and their environment (8–10). It is for this reason that we chose the stringent criteria (average >99th percentile rank among the seven controls) to further consider 1223 sperm elements. As summarized in Table 1, of the 28 possible differentiated germ cell marker sperm elements derived from PRM1, PRM2, TNP1, LELP1, SMCP, and OAZ3 that were assessed in the technical comment (3), a total of 13 have an average percentile rank above the 99th percentile rank threshold. However, as described in our original methods, a stringent approach was taken to obtain a precisely defined set of elements that were consistently observed as highly abundant in sperm. These SREs were within the group that was greater than the 99th percentile rank among the seven fertile controls and did not present an outlier [percentile rank < Q1 − 1.5 × IQR (interquartile range)]. As shown in Table 1, only 6 of the 13 sperm elements that comprise these specific differentiated germ cell markers meet the stringent criteria to be included in the list of 648 retained sperm elements required for natural conception.
Although it is likely that on the basis of our strict criteria we do not include some elements that are also important in sperm function, we specifically wanted to lessen the chance of type I error. The use of these accurately selected 648 SREs enabled the identification of a group of patients with a low success rate of conceiving using the less invasive techniques such as TIC and intrauterine insemination.
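The two selection criteria described above (average percentile rank above the 99th percentile among the fertile controls, and no low outlier below Q1 − 1.5 × IQR) can be expressed as a short sketch. The function name and the quartile definition (Python's "inclusive" method) are assumptions; the original analysis may have computed quartiles differently.

```python
from statistics import mean, quantiles

def is_sre(control_ranks, threshold=99.0):
    """Apply the two criteria described in the text to one sperm element.

    control_ranks: percentile ranks of the element in the fertile controls.
    Assumption: quartiles use the 'inclusive' method; this choice is
    illustrative and not taken from the original methods.
    """
    # Criterion 1: average percentile rank above the threshold.
    if mean(control_ranks) <= threshold:
        return False
    # Criterion 2: no low outlier, i.e. no rank below Q1 - 1.5 * IQR.
    q1, _, q3 = quantiles(control_ranks, n=4, method="inclusive")
    lower_fence = q1 - 1.5 * (q3 - q1)
    return min(control_ranks) >= lower_fence
```

With invented ranks, an element at 99.9 in six controls and 95.0 in the seventh passes the first criterion (its average is about 99.2) but fails the second, because the single low value falls below the fence. This is how an element can exceed the average threshold and still be excluded.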

Table 1. SRE criteria for natural conception applied to differentiated germ cell markers.

Differentiated germ cell markers provided in the technical comment of Cappallo-Obermann and Spiess (3) were examined. The first SRE criterion for natural conception required that a sperm element exceed the 99th percentile rank on average across seven fertile control samples. From the 28 sperm elements examined, only 13 met the first criterion (marked with an asterisk, percentile rank average). Some elements, such as ODF1 and OAZ3, that did not satisfy the first criterion are moderately abundant (95th to 98th percentile rank). Other elements, such as PFN4 and ACRV1, are absent in the majority of control samples (zeroth percentile rank). These spermatid-specific RNAs are not present in mature spermatozoa and accordingly were not considered. If the first criterion was satisfied, the second outlier criterion was applied. This discarded those elements that did not satisfy the IQR rule. Of the 13 elements exceeding the 99th percentile rank on average across the control samples, only 6 satisfied the IQR rule (marked with a dagger sign) and were retained as SREs from four genes: TNP1, LELP1, SMCP, and OAZ3.


As expected, none of the 72 samples included in the study present any of the somatic cell markers (IL8, CD45/PTPRC, and CDH1) or the undifferentiated germ cell marker (KIT) elements at or above the 99th percentile rank (table S2). Most of these elements remain at background level (<90th percentile rank), and full-length transcripts could not be assembled from them, even though somatic cells, in contrast to sperm, do not present a biologically fragmented RNA population. Both observations confirm the absence of considerable somatic cell contamination, as observed by optical microscopy. These results are summarized in table S1 and supported by PTPRC quantitative RT-PCR (qRT-PCR) and sequencing of PTPRC as shown in fig. S1, which shows disjointed and marginal coverage, if any, throughout a few exons. Together, this shows that the relative amount, if any, of this or other transcripts is well below the threshold of reliable detection.

A few samples do present some elements above the 99th percentile rank, which are associated with or used as prostate and seminal vesicle markers. These transcripts are also present in human sperm RNAs prepared by swim-up (5). As above, it is important to note that these transcripts (Table 2) were not consistently observed (< Q1 − 1.5 × IQR) in all fertile controls, nor did they reach the threshold of an average percentile ranking >99. Accordingly, these elements did not meet the strict criteria to be included in the list of 648 retained sperm elements. To estimate their likely influence or dilutive effect, we performed a comparison of Human BodyMap 2.0 testes versus prostate transcript expression (Supplementary Materials and Methods). However, it is important to note that expression measured in the various cell types is quite different across different tissue expression databases, such as Human BodyMap 2.0 versus GTEx (11). This observation makes it all the more difficult to suggest that the observed presence of any specific markers is an absolute indication of presence of a specific cell type. With these caveats in mind, examining the distribution of prostate versus testes marker percentile ranks (Supplementary Materials and Methods) indicates that testes markers show significantly higher overall abundance rank in sperm as compared to the prostate marker transcripts (P = 0.0066; table S3). These comparisons suggest that even if prostate cells contribute in some part to the RNA queried, their effect on detection of sperm elements would be marginal. It is of note that the four seminal vesicle and prostate markers SEMG1, SEMG2, TGM4, and MSMB described in the technical comment (3) are among the more abundant transcripts in extracellular vesicles contained in semen (Supplementary Materials and Methods). Their relative rank based on exosomal RNA-seq data of the 23,181 transcripts assessed is as follows: MSMB, 6; SEMG1, 22; TGM4, 157; and SEMG2, 200.
Seminal fluid contains an extremely large number of extracellular vesicles (10^11 to 10^13 particles per ejaculate) containing a large repertoire of RNAs and proteins (12). Most of these originate from the prostate, although other accessory sex glands also release extracellular vesicles, contributing to the population found among the epididymal and seminal vesicles (13). Our recent integrative analysis of human sperm and seminal fluid transcriptomic and proteomic data suggests that the seminal fluid and spermatozoa may communicate through extracellular vesicles (8). The enrichment of RNAs from seminal fluid extracellular vesicles in the peripheral membrane on mouse spermatozoa suggests the presence of extracellular vesicles on the spermatozoal surface, contributing to the growing evidence of sperm and seminal fluid communication through extracellular vesicles (9, 13, 14). It has recently been shown that communication through extracellular vesicles could be crucial for epididymal RNAs to be delivered to sperm to promote transgenerational epigenetic inheritance of metabolic phenotypes (10). Together, these results suggest that even the moderate to low levels of RNA elements corresponding to seminal vesicle and prostate markers likely arise from the interaction of extracellular vesicles released by the accessory sex gland and sperm but not from a contamination with cells from the prostate and/or seminal vesicles (Fig. 2).

Table 2. SRE criteria for natural conception applied to seminal vesicle and prostate markers.

Although some samples presented seminal vesicle and prostate markers as candidate sperm elements above the 99th percentile rank as indicated in bold text, none of the elements met the first criterion, above the 99th percentile rank on average across seven control samples (average percentile rank), so they were marked as variable and excluded from analysis.

Fig. 2. Germ cell–differentiated and prostate RNA markers.

(A) GTEx expression of PRAC1, a prostate cancer antigen, is higher in the prostate than in the testis, suggesting that PRAC1 may be used as a prostate marker (upper panel). RNA-seq (lower panel) shows the presence of PRAC1 exclusively in the exosomal fraction. (B) GTEx expression of LELP1 is higher in the testis than in the prostate, suggesting that LELP1 could be used as a testis marker (upper panel). LELP1 that is present in sperm is exclusively from the testes. The absence of LELP1 in GTEx and the small amount of this transcript in the exosome fraction (lower panel) are consistent with an RNA transport pathway.

To fully evaluate the possible admixture effects in sperm RNA-seq samples, we took advantage of the comprehensive GTEx tissue RNA-seq database to identify transcripts that are specific to 11 tissues and cell types (testes, prostate, whole blood, skin, adipose, Epstein-Barr virus–transformed lymphocytes, fibroblast, pituitary, ovary, salivary gland, and adrenal). Table S4 provides a summary of marker transcripts identified, expressed as a function of the mean GTEx RPKM (reads per kilobase per million mapped reads) value. Marker transcripts were identified by virtue of being highly abundant in their host tissue and significantly lower in all other tissues assayed. This ranged from STAR, which was at least 10-fold lower (P = 3.6 × 10^−21), to a median relative difference of 178-fold for other transcripts and to a maximum difference of 11,969-fold for PRL. This also excluded transcripts like IL8 that exhibit broad and overlapping expression patterns. Possible evidence for an admixture was then considered by assessing the RPKM observed in sperm for this set of marker transcripts. Table S5 shows the relative amount of the most prominent tissue marker for each of the seven fertile sperm controls. It is apparent that the tissue-sourced RNAs show little or no presence in the sperm controls. Comparison of the PRM1/KLK3 ratio, representing testes/prostate markers, extends over a broad range with a minimum of 10-fold enrichment of PRM1 above its comparator.
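The fold-difference part of the marker selection can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the 10-fold default (taken from the smallest difference quoted above), and the minimum host abundance `min_host_rpkm` are hypothetical, and the significance testing mentioned in the text is omitted.

```python
def marker_fold(mean_rpkm, host):
    """Fold difference between the host tissue and the next-highest tissue.

    mean_rpkm: dict mapping tissue name -> mean RPKM of one transcript.
    host: the tissue for which the transcript is a candidate marker.
    """
    others = [v for tissue, v in mean_rpkm.items() if tissue != host]
    highest_other = max(others)
    if highest_other == 0:
        return float("inf")  # transcript not detected in any other tissue
    return mean_rpkm[host] / highest_other

def is_marker(mean_rpkm, host, min_fold=10.0, min_host_rpkm=1.0):
    """True if the transcript is abundant in the host tissue and at least
    min_fold lower in every other tissue (both cutoffs are assumptions)."""
    if mean_rpkm[host] < min_host_rpkm:
        return False
    return marker_fold(mean_rpkm, host) >= min_fold
```

The same ratio logic applies to a comparison such as PRM1/KLK3 within a sperm sample: dividing the RPKM of the testes marker by that of the prostate marker gives the enrichment of one over the other. A transcript with broad, overlapping expression fails the test because its next-highest tissue is too close to the host tissue.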

To assess whether the presence of these transcripts could affect the identification of the required SREs, the relative percentile rank of each SRE in each of the 65 group II samples was compared with that of KLK3. Among the 648 SREs identified (1), 14 displayed a Spearman rank coefficient of ρ > 0.8 when compared to KLK3. These included four elements from EEF1A1, two elements from RPS41, two elements from RPS24, and single elements from RPL29, RPL37A, EEF1G, RPL3, EEF2, and TPT1. Only two of these elements, both exons of RPS24, affect the end result of the live-birth outcome analysis. These elements were both noted to be absent in a single test sample and, if excluded from the SRE group, alter the prediction of this single sample to expectation of live birth, which is the true outcome. This suggests that the influence of a possible admixture on the SREs was minimal. The methodology to assess the effects of possible admixture that is described above may prove to be a valuable tool for others.
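The comparison of each SRE with KLK3 can be sketched with a plain Spearman rank correlation. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, ties receive their average rank, and a constant input (for which the coefficient is undefined) is treated as 0 by assumption.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman coefficient computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # undefined for constant input; 0 is an assumption here
    return cov / sqrt(vx * vy)

def correlated_elements(sre_ranks, klk3_ranks, cutoff=0.8):
    """Names of elements whose per-sample ranks track KLK3 above the cutoff.

    sre_ranks: dict mapping element name -> percentile ranks across samples.
    klk3_ranks: percentile ranks of KLK3 in the same samples, same order.
    """
    return [name for name, ranks in sre_ranks.items()
            if spearman_rho(ranks, klk3_ranks) > cutoff]
```

Elements flagged this way are candidates for a sensitivity check of the kind described above: remove them from the SRE set and see whether any sample's predicted outcome changes.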

It is clear that sperm RNA-seq is far more complex than that of a somatic cell and can be likened to that of formalin-fixed paraffin-embedded samples because of the intrinsic characteristics of spermatozoal RNA, the heterogeneity of semen, and the differences in obtaining, preserving, and processing a semen sample. As we have shown, the focus of this study on the consistently abundant and stable spermatozoal RNA elements effectively mitigates these potentially confounding factors, thereby promoting their potential use in the clinical setting.


Materials and Methods

Fig. S1. Determining the abundance of PTPRC by qRT-PCR and by RNA-seq.

Table S1. Summary of optical microscopy and qRT-PCR of PTPRC (provided as an Excel file).

Table S2. Percentile ranks of IL8, CD45/PTPRC, CDH1, and KIT (provided as an Excel file).

Table S3. Comparison of testes and prostate transcript markers (provided as an Excel file).

Table S4. Tissue-specific markers (provided as an Excel file).

Table S5. Comparative expression of tissue markers in seven control sperm samples (provided as an Excel file).

