Research ArticleImmunology

Genetic and Environmental Determinants of Human NK Cell Diversity Revealed by Mass Cytometry

See allHide authors and affiliations

Science Translational Medicine  23 Oct 2013:
Vol. 5, Issue 208, pp. 208ra145
DOI: 10.1126/scitranslmed.3006702


Natural killer (NK) cells play critical roles in immune defense and reproduction, yet remain the most poorly understood major lymphocyte population. Because their activation is controlled by a variety of combinatorially expressed activating and inhibitory receptors, NK cell diversity and function are closely linked. To provide an unprecedented understanding of NK cell repertoire diversity, we used mass cytometry to simultaneously analyze 37 parameters, including 28 NK cell receptors, on peripheral blood NK cells from 5 sets of monozygotic twins and 12 unrelated donors of defined human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptor (KIR) genotype. This analysis revealed a remarkable degree of NK cell diversity, with an estimated 6000 to 30,000 phenotypic populations within an individual and >100,000 phenotypes in the donor panel. Genetics largely determined inhibitory receptor expression, whereas activation receptor expression was heavily environmentally influenced. Therefore, NK cells may maintain self-tolerance through strictly regulated expression of inhibitory receptors while using adaptable expression patterns of activating and costimulatory receptors to respond to pathogens and tumors. These findings further suggest the possibility that discrete NK cell subpopulations could be harnessed for immunotherapeutic strategies in the settings of infection, reproduction, and transplantation.


Natural killer (NK) cells were discovered in the 1970s in both mice (1) and humans (13). They were first characterized by their ability to rapidly kill tumor cells and later virus-infected cells (2, 4). Since this initial characterization, the nuanced complexity of NK cell phenotypes and function has been increasingly appreciated (5). The diversity of the NK cell repertoire is determined by the expression of an array of germline-encoded activating and inhibitory receptors (57). These receptors interact with major histocompatibility complex class I and class I–like molecules, costimulatory ligands, stress-related molecules, and cytokines (810). The combinatorial expression of this multitude of receptors on an NK cell results in an abundance of mathematically feasible subpopulations, but this diversity has yet to be captured in sufficient detail.

Early studies using NK cell clones and mRNA expression data (1113), more recent work incorporating monoclonal antibodies against a variety of NK cell receptors (14, 15), and bone marrow transplant data showing the similarity of recipient to donor inhibitory repertoires (16) all demonstrate that host genetics are a critical determinant of inhibitory receptor expression patterns. Additional nongenetically encoded determinants of diversity may influence activating receptor expression, including NK cell education, epigenetics, epistatic interactions between killer cell immunoglobulin (Ig)–like receptor (KIR) genes, and viral infections, particularly cytomegalovirus (CMV) (1724). These observations emphasize the need for a better understanding of the relationship between genotype, phenotype, and function in the formation of the NK cell repertoire.

Advances in fluorescence cytometry have laid the groundwork for understanding the NK cell repertoire, but the spectral limitations of fluorescence have prevented exploration of its full extent. The recent development of mass cytometry, also known as cytometry by time of flight (CyTOF), provides an opportunity to overcome these limitations. Mass cytometry uses rare metal isotope-conjugated antibodies to simultaneously detect up to 40 cellular markers. Because the metal isotopes are not naturally found in biological systems and have distinct time-of-flight profiles, background is nearly undetectable and the need for compensation across channels is eliminated. Mass cytometry has been used to profile human hematopoiesis (25) and to demonstrate the tremendous diversity within the CD8+ T cell compartment (26).

To provide a framework to better understand the NK cell repertoire, we used mass cytometry in combination with high-resolution genotyping to evaluate the phenotypic heterogeneity of peripheral blood NK cells in 22 healthy individuals, including 5 sets of monozygotic twins. Our study shows a remarkable breadth and diversity in the human NK cell repertoire and the role of genetics and the environment in its formation and maintenance.


Mass cytometry for dissecting the phenotypic diversity of NK cells

To better define the phenotypic diversity of human peripheral blood NK cells, we designed a mass cytometry panel of 36 monoclonal antibodies recognizing lineage markers and NK receptors plus lanthanide-loaded DOTA-maleimide for dead cell discrimination (table S1). Because this was the first mass cytometry panel specific for NK cell markers, we compared results between mass cytometry and fluorescence cytometry. All antibodies successfully and comparably distinguished cell populations on both platforms, with representative plots shown in fig. S1, A and B. Staining with isotype control antibodies revealed minimal nonspecific background by mass cytometry (fig. S1C). We also compared NK cells stained with or without pretreatment with a human Fc receptor blocking solution and observed no significant differences in antibody binding (fig. S1D). Adding to the versatility and applicability of this approach, very similar NK cell staining profiles were observed for fresh and cryopreserved peripheral blood mononuclear cell (PBMC) samples (fig. S1, E and F). NK cells identified with a serial gating strategy demonstrated a range of surface receptor expression levels in different individuals (fig. S2).

Overall diversity of the NK cell receptor phenotype

With this mass cytometry platform, we interrogated the diversity of the NK cell repertoire in a cohort consisting of 12 healthy, unrelated individuals and 5 sets of monozygotic twins of known human leukocyte antigen (HLA) and KIR genotype (Fig. 1 and fig. S3). After identification of NK cells with lineage markers, Boolean analysis was used to classify cells as positive or negative for the expression of each of 28 NK cell receptors (fig. S4). This allowed for the assessment of 228, or 268,435,456, combinations of these receptors. The results revealed a much greater heterogeneity in NK cell phenotypes than has been described previously (Fig. 1). No single phenotype accounted for more than 7% of the total NK cells, and only 14 of 28 markers tested were expressed by NK cells exhibiting the top 50 most frequent phenotypes (Fig. 1A). Furthermore, the subsets comprising the top 50 phenotypes accounted for an average of just 15% (range, 7 to 24%; SD, 4.8%) of an individual’s NK cells.

Fig. 1. Assessment of NK cell repertoire diversity based on expression of individual receptor profiles.

(A) Frequencies of the 50 most abundant NK cell phenotypes based on expression of 28 receptors for 12 unrelated individuals (HNK-001 to HNK-012) and 5 pairs of monozygotic twins (Twin-1a/1b to Twin-5a/5b). Each column represents a phenotype, with colored boxes indicating receptor presence. (B) Comparative correlation of each twin with his/her identical twin of the frequencies of the 50 phenotypes shown in (A). Spearman’s correlation coefficient is displayed. (C) Comparative correlation of each healthy unrelated individual with the other 11 individuals of the frequencies of the 50 phenotypes shown in (A).

Although the conserved activating receptors NKG2D and NKp46 are usually described as being expressed on most human NK cells (27, 28), they were expressed on just 28 and 25 of the top 50 phenotypes, respectively. This low expression is partially a result of our Boolean gating strategy, in which only NK cells that were relatively bright for NKG2D and NKp46 were considered positive (fig. S4). Thus, we may have underestimated the frequency of these markers, particularly for cells that express low levels. Nonetheless, these results were concordant with the frequencies of NKG2D and NKp46 in total NK cell populations in this study, where NKG2D and NKp46 were highly expressed on an average of 48% (range, 29 to 67%) and 47% (range, 28 to 71%) of NK cells, respectively, and with additional studies in which <100% of NK cells expressed these markers (27, 2932). The conserved inhibitory receptor NKG2A was more prevalent, found on 40 of the top 50 phenotypes and always coexpressed with CD94, its partner polypeptide. In contrast with NKG2A, the highly variable KIRs contributed to relatively few phenotypes. KIR2DL1, KIR2DL5, and KIR3DL1 were not present in the 50 most frequent phenotypes. KIR2DL2/L3/S2, the most prevalent KIR group, was expressed on an average of 30% of an individual’s total NK cells, but on only 4 of the 50 most frequent subpopulations, which together represented an average of only 3.2% (range, 0.8 to 6.8%) of KIR2DL2/L3/S2-expressing cells. Thus, the remaining >93% on NK cells were distributed across a range of phenotypes, each represented by very few cells. These data reveal that a diverse spectrum of low-abundance NK cell phenotypes express inhibitory receptors.

To assess the genetic influences on the phenotypes of NK cells, we compared the frequencies of the top 50 NK cell phenotypes within paired monozygotic twins (Fig. 1B) and pairs of unrelated donors (Fig. 1C). Twin pairs exhibited modest concordance in the frequencies of NK cell subpopulations (Spearman’s ρ = 0.65), although such concordance was diminished in pairs of unrelated donors (Spearman’s ρ = 0.36). These data suggest that genetic differences between individuals may influence the repertoire of NK cell surface phenotypes.

NK cell receptor diversity in the expression of inhibitory receptors for HLA class I

We next restricted the Boolean analysis to the six inhibitory NK cell receptors KIR2DL1, KIR2DL2/L3/S2, KIR2DL5, KIR3DL1, NKG2A, and LILRB1, resulting in 64 possible combinatorial cell surface phenotypes. Of these possibilities, the most frequently detected phenotypes in each individual included none of the six inhibitory receptors or only NKG2A (Fig. 2A). These two subsets accounted for an average of 54% (range, 41 to 72%) of an individual’s NK cells. Most remaining NK cells expressed one to three inhibitory receptors. These results are consistent with those obtained using fluorescence cytometry (15, 18, 33, 34).

Fig. 2. Expression patterns of inhibitory NK cell receptors in unrelated individuals and monozygotic twins.

(A) Inhibitory receptor profile of NK cells in 12 unrelated individuals and 5 pairs of monozygotic twins as in Fig. 1A. The analysis was restricted to the evaluation of expression profiles of the six inhibitory receptors KIR2DL1, KIR2DL2/L3/S2, KIR2DL5, KIR3DL1, LILRB1, and NKG2A. (B) Comparative correlation of each twin with his/her identical twin of the frequencies of the 50 phenotypes shown in (A). Spearman’s correlation coefficient is displayed. (C) Comparative correlation of each healthy unrelated individual with the other 11 individuals of the frequencies of the 50 phenotypes shown in (A).

In analyses parallel to those described in Fig. 1 (B and C), we compared the expression frequency of the inhibitory receptor combinations between paired monozygotic twins (Fig. 2B) and between pairs of unrelated individuals (Fig. 2C). Concordance was extremely high within the pairs of monozygotic twins (Spearman’s ρ = 0.87) and less marked in unrelated individuals (Spearman’s ρ = 0.77). These results suggest that host genetics determine the inhibitory receptor repertoire but only weakly affect the expression of other NK cell receptors (Fig. 1, B and C). Together, these data indicate that although the inhibitory repertoire appears to be largely controlled by genetics, environmental and/or stochastic influences play a leading role in determining the overall repertoire of NK cell phenotypes.

Assessing the NK cell repertoire with clustering analysis

Boolean analysis of the mass cytometry data resulted in thousands of distinct NK cell phenotypes defined by combinatorial receptor expression. However, because of the abundance of low-frequency populations, we used a clustering algorithm called spanning-tree progression analysis of density-normalized events (SPADE) (25, 35). In SPADE analyses of PBMCs, nodes of phenotypically similar cell clusters corresponding to NK cells, CD4+ T cells, CD8+ T cells, B cells, and monocytes/macrophages were readily identified and distinguished (fig. S5).

Consistent with the Boolean analysis, the distribution of inhibitory receptor expression on NK cells revealed by SPADE was more similar for monozygotic twins (Fig. 3A and fig. S6) than for unrelated individuals (Fig. 3B and fig. S7). For instance, KIR2DL1 was observed on similar subpopulations in twins but not in unrelated individuals. More diversity was observed in the intensity and frequency of expression of activating receptors, including NKG2D, which was expressed at varying intensities and frequencies among different clusters between individuals, even in twins. Additionally, HNK-003 and HNK-007 were both genotypically and phenotypically divergent for KIR2DS4, consistent with HNK-007 lacking a functional copy of the KIR2DS4 gene (Fig. 3B and fig. S3). However, even the genotypically identical Twin-5a and Twin-5b showed differences in KIR2DS4-expressing nodes, with 28 KIRS2DS4-expressing nodes in common but 10 with differential expression. These data further illustrate the impact of environmental factors on the total NK cell repertoire.

Fig. 3. Clustering analysis of NK cells using SPADE reveals diverse receptor expression patterns.

(A) Representative SPADE trees of NKG2A, NKG2D, KIR2DL1, and KIR2DS4 for monozygotic Twin-5a (top) and Twin-5b (bottom). Node color represents signal intensity of each marker, and size represents frequency. (B) Representative SPADE trees of NKG2A, NKG2D, KIR2DL1, and KIR2DS4 for two healthy unrelated individuals: HNK-003 (top) and HNK-007 (bottom).

A comprehensive phenotype for CD57+NKG2C+ “memory-like” NK cells

SPADE analysis showed that five of the unrelated donors (HNK-002, HNK-004, HNK-005, HNK-007, and HNK-008) and one pair of twins (Twin-5a and Twin-5b) had large NK cell subpopulations coexpressing CD57 and NKG2C (Fig. 4A) that could represent memory-like NK cells generated in response to CMV infection (23, 24, 36, 37). All of these subjects tested had positive serology for CMV except for HNK-008 (fig. S3). The remaining CMV-seropositive subjects (HNK-001, HNK-012, Twin-4a, and Twin-4b) also had NKG2C+CD57+ NK cell subpopulations, although these populations were less prominent. The CD57+NKG2C+ cells expressed CD94 and CD16, but had low expression of CD122 and KIR3DL1, and variable patterns of expression of KIR2DL1, KIR2DL3, NKG2D, NKp30, NKp46, and CD8 (Fig. 4, B and C). In the donors with only the HLA-C1 epitope, NKG2C+CD57+ NK cells preferentially expressed KIR2DL3, the inhibitory HLA-C1 receptor, whereas donors with the HLA-C2 epitope preferentially expressed KIR2DL1 (Fig. 4).

Fig. 4. Identification and phenotypic evaluation of CD57+NKG2C+ memory-like NK cells.

(A) Skeleton structures of a SPADE tree from healthy unrelated individuals (HNK) and monozygotic twins (Twin) with NKG2C-expressing nodes highlighted with red shading. (B) Representative SPADE trees of the NKG2C-expressing nodes shown in (A) from five unrelated individuals. (C) Representative SPADE trees of NKG2C-expressing nodes for one pair of monozygotic twins, as in (B).

NK cell receptor coexpression patterns

We next asked whether any single NK cell receptors were more likely to be expressed with other single receptors on the same NK cell. We therefore calculated the Spearman’s rank correlation of each possible receptor pair on each NK cell from all 22 donors (Fig. 5A). Weak negative associations between the expression of CD57 and NKp46, CD122, CD94, and NKG2A and weak positive associations between the expression of NKG2A, CD94, and CD56 were the most dominant features.

Fig. 5. NK cell receptor coexpression patterns and population structure.

(A) Spearman’s rank correlation matrices of coexpression profiles for each receptor pair for all NK cells from all 22 donors. (B) Hierarchical clustering dendrogram of NK cell receptor profiles from all NK cells from all 22 donors. (C) Principal components analysis of receptor expression patterns from all NK cells from all 22 donors. Dots indicate the contributions of each receptor to principal component 1 (PC1) (x axis) versus principal component 2 (PC2) (y axis).

Given the relative paucity of strong receptor coexpression patterns, we performed hierarchical clustering of NK cell populations on the basis of surface receptor expression patterns to better understand NK cell population structure (Fig. 5B). The markers CD16, CD56, CD57, CD94, and NKG2A were distinct in their location at the root of the tree, whereas most other NK cell receptors contributed equally to the overall population structure (Fig. 5B). Further, principal components analyses indicated that NK cell populations expressing CD16, CD56, CD57, NKG2A, and CD94 were distinct and formed separate clusters (Fig. 5C). Overall, these results are consistent with a stochastic expression of most NK cell receptors, with less mature NKG2A+CD94+ and more mature CD16+CD57+ cells representing the only major distinction.

Quantifying the diversity of the NK cell repertoire

Having observed a stochastic expression of most NK receptors, leading to a broad range of possible combinatorial phenotypes on a single cell, we next sought to quantify this high level of diversity. We reasoned that our system of NK cell phenotypes resembled a collection of species within a habitat, and adapted established ecological methods. We first calculated the inverse Simpson diversity index, which measures the probability that two randomly selected organisms from a population will be of the same species (3840). This index accounts for both the total abundance and the distribution of species within a population. We first considered each NK cell as an individual organism, each NK cell phenotype as a species, and the group of phenotypes expressing each cell surface marker as a population. Inverse Simpson indices for each population illustrate the contribution of each marker’s cellular distribution to a donor’s total phenotypic diversity (Fig. 6A).

Fig. 6. Genetics dictate the expression of most single NK cell receptors, but not the combinatorial assortment of phenotypes.

(A) Inverse Simpson for the population of cells expressing each single receptor. Box plots show the median, 25th and 75th percentiles, and total range of all 22 donors. Yellow lines show a representative twin pair. Red line indicates an unpaired representative twin. Red and yellow colors correspond to matching donors in (D). (B) Summary of intraclass correlation coefficients of the inverse Simpson index of twin pairs versus all 22 individuals. (C) Histogram of inverse Simpson indices for the total NK cell population of each donor. (D) Total phenotypes predicted by the Chao 1 nonparametric species estimator (calculated as described in Materials and Methods) for all donors.

Receptors that were highly expressed on NK cells, such as CD56, CD16, CD94, and 2B4, had high average inverse Simpson indices, indicating their contribution to numerous NK cell phenotypes (Fig. 6A). Strikingly, we also observed a broad range within the cohort in the diversity of the NK cell populations expressing these and other receptors. For example, although the average inverse Simpson index of CD56-expressing populations was 659, the minimum was 135 and the maximum was 1017, yielding a range of 882. We expected highly polymorphic KIRs to exhibit broad ranges of inverse Simpson indices, which was true for KIR3DL1 (range, 844), KIR2DS4 (range, 582), and KIR2DL1 (range, 460). However, many nonpolymorphic or minimally polymorphic receptors such as CD16 (range, 938) and 2B4 (range, 691) showed similar or even greater ranges of inverse Simpson indices. Thus, we observed a high degree of variability among donors in the diversity of the NK cell populations on which specific receptors were expressed, which was not wholly dependent on allelic polymorphism.

We next asked whether host genetics played any role in determining this variability by calculating the correlation between the inverse Simpson indices for each receptor in each pair of twins versus all unrelated individuals. Despite the broad range in receptor diversity across individuals, the inverse Simpson indices for most receptors within twin pairs were remarkably similar, with high intraclass correlation coefficients (>0.6) for all receptors except CD56, 2B4, CD127, CD117, LILRB1, HLA-DR, and CCR7 (Fig. 6B). This suggests that host genetics may be a determinant of the range of NK cell phenotypes on which a given receptor is expressed, with factors in addition to receptor polymorphisms exerting considerable influence on each receptor’s diversity.

Next, to understand the range of NK cell diversity in a human population, we calculated the inverse Simpson indices for each individual donor, rather than each receptor (Fig. 6C). Eighteen of the 22 individuals had total inverse Simpson indices in the relatively restricted range of 350 to 900, suggestive of an optimal level of human NK cell diversity.

We next asked how many total NK cell phenotypes were likely to be present in each individual. We used the Chao 1 nonparametric species estimator, which projects the total number of species present in a sample by using the abundance of rare species as a predictor of undiscovered species (38). This method predicted between 6000 and 30,000 distinct NK cell phenotypes within each individual, with a median of 15,000 (Fig. 6D). In this analysis, twins showed a low intraclass correlation of 0.278 (Fig. 6B), suggesting that the combinatorial assortment of receptors in distinct NK cell phenotypes may be more a function of environmental than genetic influences.

Calculating the expanse of the NK cell repertoire

Finally, we calculated how many NK cell phenotypes were likely to be present at a population level. Because our experimental analyses of 3500 to 35,000 NK cells per donor were unlikely to have captured the total NK cell diversity, we generated a sample-based rarefaction curve to provide an estimate of the total number of NK cell phenotypes as a function of the number of phenotypes sampled (Fig. 7A). This analysis estimated the expansiveness of this cohort’s repertoire at 124,651 NK cell phenotypes. We also predicted the total number of NK cell species on a population level using three other independent and well-established ecological methods (38). Together, despite their variable estimation approaches and input parameters, all of these calculations yielded remarkably similar estimates of 108,000 to 125,000 total NK cell phenotypes (Fig. 7B).

Fig. 7. Calculating the expanse of the NK cell repertoire.

(A) Sample-based rarefaction predicting the total number of NK cell phenotypes within the cohort of 22 donors. The number of phenotypes observed was calculated for incremental inputs of 50,000 NK cells up to and including the total 304,909 NK cells sampled in all 22 donors, reshuffling the order of donor sampling each of 10 times. These data were used to plot a curve, which, when extrapolated to an asymptote, provided an estimate of the total NK cell population. (B) Nonparametric species estimators confirm quantification of the total number of NK cell phenotypes.


Using multiparametric mass cytometry to examine expression of 28 NK cell surface markers, we achieved an incisive dissection of human NK cell repertoires. A vast phenotypic diversity was uncovered, which is influenced by the genetic differences between individual humans as well as by the environmental differences they experience. Genetic differences strongly influence the combinatorial expression patterns of the inhibitory receptors that recognize HLA class I. This repertoire of inhibitory receptors, which is required for maintenance of self-tolerance as well as for NK cell “education” or “licensing,” is overlaid by an environmentally influenced diversity in the expression of activating and costimulatory receptors. This complementary combination results in NK cell repertoires that comprise between 6000 and 30,000 phenotypically distinguishable subpopulations. Beyond such diversity within an individual is further variation between individuals. More than 100,000 NK cell subpopulations were distinguished in the small “population” of 22 people we studied. These data reveal a potential mechanism by which NK cells could maintain self-tolerance through strictly regulated expression of inhibitory receptors while using more malleable and adaptable expression patterns of activating and costimulatory receptors to respond to infections and cancers.

Comparing adult monozygotic twins distinguished the effects of genetic and environmental factors. The NK cells from twins had very similar patterns of expression of inhibitory HLA class I receptors, consistent with previous comparisons of unrelated individuals showing that the inhibitory receptor repertoire is influenced by variability in the KIR and HLA class I genes (1, 11, 12, 15, 18, 33). Although very similar overall, the inhibitory receptor repertoires of twins exhibited some differences, which might be explained by environmental influences, including well-established epigenetic alterations of the KIR genes (1820).

We observed extensive diversity within populations of cells expressing each combination of inhibitory receptors. For instance, the only other marker consistently coexpressed on NKG2A-expressing cells is CD94, its partner polypeptide. Even coexpression of markers believed to be expressed on all or most NK cells, such as NKp46 and NKG2D, was highly variable. Thus, NKG2A is not expressed by one or a few subpopulations of NK cells, but by thousands. Because both NKG2A and its ligand, HLA-E, are highly conserved (6, 41), this diversity may provide an additional degree of flexibility to the repertoire.

Associated with the paucity of pairwise combinations of receptors in the overall NK cell population (Fig. 5) is our finding that the expression pattern for most of the receptors (including inhibitory KIRs, the C-type lectin receptors, and natural cytotoxicity receptors) is determined solely by the genes encoding the receptor (Fig. 6). This suggests a predetermined level for both the expression and distribution of these receptors by NK cells. Several other receptors—HLA-DR, CCR7, CD127, CD117, 2B4, and LILRB1—had diversity patterns more suggestive of environmental influence. These receptors are known to vary in response to activation [HLA-DR and CCR7 (8, 42)], differentiation [CD117 (43)], infection [2B4 and LILRB1 (14, 44)], and pregnancy [LILRB1 (45)].

Variation in a receptor’s expression within an individual could be explained by genetic polymorphisms that affect expression level (1820). In particular, KIR polymorphisms greatly influence the avidity and specificity of these interactions, leading to a highly variable repertoire (17, 46). Bolstering this argument is the broad range of expression by KIR3DL1, the most diverse of all KIR (47), and KIR2DS4 (48, 49). CD16, a less polymorphic receptor, is the most diverse of the NK cell receptors in its expression across cellular subpopulations. One interpretation of this finding is that a diversity of NK cells facilitates the critical function of CD16 in mediating antibody-dependent cellular cytotoxicity (ADCC), a bridge between innate and adaptive immunity. Alternatively, the diversity of CD16 could reflect its role as the only receptor demonstrated to be sufficient for NK cell activation (50).

“Null” cells, which express none of the cell surface markers other than CD56, are the most frequent NK cell subpopulation. They could represent a transitional state capable of up-regulating activating or inhibitory receptors. If so, they may be less likely to up-regulate expression of inhibitory receptors, because the inhibitory receptor expression patterns are extremely stable over time (51). However, even cells with no inhibitory receptors may play a functional role. Although these “uneducated” cells lacking receptors for self-HLA are hyporesponsive to HLA-deficient target cells (52), uneducated cells can be effective in ADCC (53).

Our results suggest that the human NK cell repertoire is anchored by the combination of a mature CD16/CD57-expressing population and a less mature NKG2A/CD94-expressing population. Stochastic assortment of additional receptors creates tremendous additional diversity within this framework. Expansions of NKG2C+CD57+ NK cells that preferentially expressed KIR reactive with self–HLA-C molecules were identified in several healthy, mostly CMV-seropositive, donors. These memory-like cells had a mature phenotype, as previously reported (22, 54). Despite this subset’s presumed expansion in response to a single pathogen, CMV (17), these memory-like NK cells had highly variable expression levels of other markers, further underlining the overall diversity of the NK cell population.

Several limitations of the study should be noted. First, our sample size was modest and serial sampling was not performed, which limited our ability to compare NK cell phenotypes between individuals of similar KIR/HLA genotype. In addition, the number of NK cells sampled per donor was between ~3000 and ~35,000, which is well below the theoretical number of 268,435,456 potential phenotypes, limiting our sampling of rare phenotypes. Finally, by using a Boolean gating strategy in which we defined cell populations as positive for a given receptor only if high levels were expressed, we may have underestimated the frequency of some receptors.

NK cells are quick, versatile lymphocytes that function in innate immunity, adaptive immunity, and reproduction. In performing these vital functions, we now see how each human individual has an almost incredible diversity of NK cell subsets. Furthermore, the repertoire of these subsets differs greatly from one person, or patient, to another. Among these different subsets are developmental intermediates in pathways of maturation and differentiation, as well as different types of effector NK cell. The latter include effector and memory NK cells, which exhibit, either alone or in combination, some measure of specificity for a pathogen, tumor, or allogeneic tissue. The immediate challenge for research is to define the physiological contributions of the different NK cell subsets. The future challenge will be using that knowledge to design specific, effective NK cell–mediated therapies against infectious, malignant, and reproductive diseases.


Study design

The objective of this study was to profile the baseline diversity of the healthy human NK cell repertoire, providing a framework to understand the roles of NK cells in health and in disease pathogenesis. To this end, we performed an observational study of 22 healthy individuals, including 5 pairs of monozygotic twins, a sample size that was determined both by feasibility and to allow us to sample a representative pool of donors of different KIR and HLA genotypes. This was an observational study; randomization and blinding were not used. All phenotypic analyses were performed ex vivo and not run in replicates. Analyses of twins versus unrelated individuals allowed us to examine the influence of host genetics versus environment on the NK cell repertoire.

Human subjects and cells

This study was performed in accordance with the Declaration of Helsinki and approved by the Institutional Review Boards of Stanford University and SRI International. Cryopreserved PBMCs were obtained from three sources. First, 10 generally healthy individuals (5 male, 5 female; aged 24 to 60 years; median age, 34 years) were recruited at Stanford University. Second, PBMCs from two healthy anonymous donors were obtained through purchase from the Stanford Blood Center. Third, PBMC samples from five sets of monozygotic twin pairs (four male, six female; aged 20 to 40 years; median age, 23 years) were collected as part of an ongoing study of influenza vaccination involving participants in the Twin Research Registry at SRI International. Full details of the twin registry have been described (55). All subjects gave fully informed and written consent.

Staining, data acquisition, and analysis

PBMCs were thawed and washed with RPMI 1640 (Corning Cellgro) with 10% heat-inactivated fetal bovine serum, 2 mM l-glutamine, and antibiotics [penicillin (100 U/ml) and streptomycin (100 μg/ml)] (Gibco BRL/Life Technologies) and rested at 37°C with 5% CO2 for 4 hours. Two million PBMCs were stained for mass cytometry analyses as described (25, 26) with the antibodies listed in table S1, and data were acquired on a CyTOF instrument (DVS Sciences). Fluorescence cytometry analyses were acquired on an LSR II (BD Biosciences) with the antibodies in table S2. FCS files were analyzed with FlowJo software v9.4.8 (Tree Star Inc.). Boolean analyses were performed with a custom Python script ( to sort cells on the basis of positive or negative expression of each marker. SPADE analyses were performed as described (35).

Antibody conjugation

Antibodies were purchased from companies specified in table S1 and labeled with MaxPar X8 labeling reagent kits (DVS Sciences) according to the manufacturer’s instructions.

KIR and HLA genotyping and CMV serology

KIR gene presence and HLA-A,-B,-C alleles were determined by polymerase chain reaction–based sequence-specific oligonucleotide probe with a Luminex 100 instrument (Luminex Corp.). The assays were performed with LABType SSO reagents (One Lambda). KIR genotyping at allele-level resolution was determined by pyrosequencing as described (56). DNA from Twin-1 was not available for typing, but monozygosity was previously established (55). CMV serology was obtained with the anti-CMV IgG ELISA (enzyme-linked immunosorbent assay) kit (Gold Standard Diagnostics) according to the manufacturer’s suggestions. Plasma samples for CMV testing from the healthy unrelated individuals were obtained about 1 year after PBMC sampling.

Quantification of diversity

The inverse Simpson index was calculated with the following equation:Embedded Imagewhere S is the total number of species, and pi is the proportional abundance of the ith species.

Sample-based rarefaction

Mean species occurrences of the number of phenotypes observed were plotted, and the curve was projected by fitting the following asymptotic equation:Embedded Imagewhere n is the number of species included, Smax is the total number of species projected, and B is a fitted constant.

Nonparametric species estimators

The Chao 1 estimator is given by the following equation:Embedded Imagewhere Sobs is the number of species observed, F1 is the number of species observed in exactly one cell, and F2 is the number observed in exactly two cells.

The Chao 2 estimator, which incorporates occurrence rather than abundance data, is given by the following equation:Embedded Imagewhere Q1 is the number of species observed in exactly one donor, and Q2 is the number observed in exactly two donors.

The second-order Jackknife estimator, which reduces sampling bias via calculation with a subset of the data removed, is given by the following equation:Embedded Imagewhere m is the total number of samples.

Statistical analysis

Statistical analyses were performed with Excel (Microsoft Corp.), STATA v10 (Stata Corp.), Prism v5 (GraphPad Software Inc.), and the open-source statistical package R (


Fig. S1. Validation of mass cytometry panel for phenotyping NK cells.

Fig. S2. Gating strategy for defining NK cells and ranges of frequencies of NK cells expressing each individual marker.

Fig. S3. KIR and HLA class I genotyping.

Fig. S4. Representative two-dimensional mass cytometry plots of NK cells from one representative donor (HNK-002) showing expression of all 28 NK cell receptors individually when expressed against CD56.

Fig. S5. SPADE trees showing distribution of cell lineage markers across unfractionated PBMC.

Fig. S6. Clustering analysis of NK cells from Twin-5a and Twin-5b using SPADE.

Fig. S7. Clustering analysis of NK cells from HNK-003 and HNK-007 using SPADE.

Table S1. Mass cytometry panel antibody information.

Table S2. Mass cytometry isotype control and LSR antibody panel information.


  1. Acknowledgments: We thank E. Newell, E. Simonds, A. Moudgil, J. Haddon, H. Hilton, and P. Bollyky for helpful discussions and/or critical reading of the manuscript. Twins were recruited from the Twin Research Registry at SRI International; we wish to thank M. McElroy, L. Jack, R. Krasnow, J. Rubin, D. Basin, L. Panini, and M. Ritchey, and funds from SRI’s Center for Health Sciences and NIH grants DA011170, DA023063, AI057229, AI090019, and ES022153. We also thank the twins for their contributions to science. We thank the clinical staff of the Stanford–Lucile Packard Children’s Hospital Vaccine Program including S. Swope, RN, C. Walsh, RN, A. Trela, RN, phlebotomist M. Ugur, and clinical research assistants A. Goel, K. Spann, R. Fleischmann, S. Batra, and I. Chang. Funding: This study was funded by NIH training grant T32 AI07290 (A.H.), NSF training grant DGE-114740 (D.M.S.-A.), NIH grants AI22039 (P.P.) and U19AI090019 and U19 AI057229 (M.M.D.), NIH/National Center for Research Resources Clinical and Translational Science Award UL1 RR025744, a New Investigator Award from the University of Washington Center for AIDS Research (P30 AI027757, C.A.B.), and a Beckman Young Investigator Award (C.A.B.). Author contributions: A.H. and C.A.B. designed and performed the experiments, analyzed the data, and wrote the manuscript; D.M.S.-A. analyzed the data, performed ecological analyses, and wrote the manuscript; M.L., N.N.-G., O.C.D., H.M., P.J.N., and L.A.G. performed the experiments and contributed to the writing of the manuscript; C.A.B., J.K., and M.D. performed statistical analyses; G.E.S. referred twin pairs for recruitment; C.L.D. and S.M. conducted the clinical trial to provide samples; G.E.S., C.L.D., and M.M.D. contributed to the writing of the manuscript; A.H., P.P., and C.A.B. conceived the experiments and wrote the manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data are deposited in ImmPort with access number SDY232.
View Abstract

Stay Connected to Science Translational Medicine

Navigate This Article