Meeting the Governance Challenges of Next-Generation Biorepository Research


Science Translational Medicine  20 Jan 2010:
Vol. 2, Issue 15, pp. 15cm3
DOI: 10.1126/scitranslmed.3000361


Advances in clinical translational research have led to an explosion of interest in infrastructure development and data sharing facilitated by biorepositories of specimens and linked health information. These efforts are qualitatively different from the single-center sample collections that preceded them and pose substantial new ethics and regulatory challenges for investigators and institutions. New research governance approaches, which can address current and anticipated challenges, promote high-quality research, and provide a robust basis for ongoing research participation, are urgently required.


Researchers have long made use of stored biospecimens (Fig. 1), and associated phenotypic and clinical data, to study the interplay of genetic and environmental influences on disease risk. Until recently, most biorepository research was based on resources created and maintained by single investigators or research teams, typically for the retrospective analysis of specific diseases, and used samples collected via established (often clinical) relationships with participants. Such “first generation” biorepository research is now giving way to forms of population-based investigation that require access to very large numbers of research participants who are followed prospectively for a wide range of traits and diseases (1, 2). These “next generation” biorepository-based approaches promise substantial scientific benefit and are being enabled by national funding initiatives, such as the Clinical Translational Science Awards, that aim to create large networks of collaborating institutions and correspondingly large national data sets (3). The specimen and data sharing entailed by these new initiatives is unprecedented and, alongside the obvious logistical and technical issues, poses a range of important ethical and regulatory challenges for investigators and institutions (4, 5). Developing a productive environment for next-generation biorepository research will involve identifying and addressing these research governance challenges.

Fig. 1. Frozen resources.

A researcher retrieves medical samples.


Here, we address four common areas of concern—privacy, institutional review, informed consent, and data stewardship—and discuss how biorepository governance will need to adapt to keep pace with evolving developments in translational science (Fig. 2).


Research participants often cite a fear of having their genetic or personal health information used against them should it fall into the wrong hands (6, 7). In much first-generation biorepository research, researchers chose to protect the confidentiality of contributors’ data by one of two methods: anonymization (identity irreversibly severed to prevent any future re-identification) or de-identification (codification but with the retention of identifying information) of linked data and specimens (8). Although some evidence suggests that research subjects may not readily distinguish between the two approaches (9), there has been an administrative preference for anonymization as the primary means to protect privacy and limit potential misuse (10, 11). In the United States, this preference derives, in part, from an Office of Human Research Protections policy that allows institutions to exempt from human subjects review research that uses data “recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects” [45 CFR 46.102(f)].

However, despite the best efforts of researchers to meet the demands of anonymization, a number of features of biorepository research have combined to make it increasingly hard to achieve in practice (12). These include the generation of dense genotypic information from biospecimens (13), linkage to richly detailed clinical data, and the use of sophisticated bioinformatics tools for data mining and amalgamation (14). The recent demonstration that individual participants’ identities can be determined from aggregate genotypic data (15, 16) has underlined further the infeasibility of most traditional approaches to anonymization.

Fig. 2. Changes to accompany next-generation biorepository research.

The transition from first- to next-generation biorepository research will require innovative new approaches to privacy, institutional review, informed consent, and data stewardship. IRB, institutional review board.


It is no longer clear that we can promise anonymity to participants, nor is it obvious that we should: There are compelling reasons to prefer the retention of coded identifiers to anonymization. First, anonymization interferes with the ability to recontact individuals or cohorts when clinically relevant information is obtained (2, 17). Second, anonymization renders withdrawal from future research, a key tenet of voluntary research participation, impossible (18). Third, when it is uncertain whether we can honor assurances regarding anonymity, it is arguably unethical to rely on such assurances at the time of recruitment (19). From a scientific standpoint too, the analytic utility of individual samples is maximized if participant health information is kept current, and this is typically only possible when identity is maintained. [The Vanderbilt BioVU model (11), in which anonymized samples are linked to a “synthetic derivative” of the electronic medical record, represents a possible exception.]

For next-generation biorepository research, anonymization will therefore no longer suffice as a means of protecting participants’ privacy, nor will it provide a satisfactory basis for forgoing research oversight, particularly when broad data sharing is anticipated. Instead, renewed attention to the control and retention of coded identifiers, combined with innovative approaches to data security and research oversight, will be required.
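The practical difference between retaining coded identifiers and anonymizing can be made concrete with a small sketch. The Python below is illustrative only and is not drawn from any system described here: the `CodedRegistry` class, its method names, and the participant label are all hypothetical. It assumes a steward who holds the linking key apart from the research data, and it shows that recontact and withdrawal both depend on that key still existing.

```python
import secrets


class CodedRegistry:
    """Hypothetical key-holder linking random study codes to identities."""

    def __init__(self):
        self._identity_by_code = {}  # the linking key, held apart from research data
        self._active = set()         # codes whose data may still be used

    def enroll(self, identity):
        """Issue a random code that carries no information about the person."""
        code = secrets.token_hex(8)
        self._identity_by_code[code] = identity
        self._active.add(code)
        return code

    def contact_for(self, code):
        """Return the identity for recontact, or None if the link no longer exists."""
        return self._identity_by_code.get(code)

    def withdraw(self, identity):
        """Honor a withdrawal request; only possible while the link is retained."""
        codes = [c for c, who in self._identity_by_code.items() if who == identity]
        for code in codes:
            self._active.discard(code)
        return len(codes) > 0

    def is_active(self, code):
        """Report whether data under this code may still be used."""
        return code in self._active

    def sever_links(self):
        """Anonymize: destroy the linking key; codes stay active but unlinkable."""
        self._identity_by_code.clear()
```

In this sketch, calling `sever_links()` leaves every code active, but `contact_for` then returns `None` and `withdraw` returns `False`, which mirrors the first two objections to anonymization raised above. The sketch says nothing about re-identification from the genotypic data itself, which no coding scheme of this kind prevents.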


On the basis of current guidance, research using coded rather than anonymized data will necessarily be subject to institutional oversight and ongoing review. Some will view this inevitability as undesirable, even potentially disastrous, for the pursuit of next-generation biorepository research. Not only will many more individual research projects require review, but such a sea change could entail new (potentially burdensome) interactions with participants and the restriction or delay of particular kinds of sensitive or risky research. However, many institutions already require review of research using biorepository data, whether or not samples are in fact anonymized, so a move to the routine use of coded de-identified data may not be as onerous as many fear. Indeed, investigators should regard oversight as an important opportunity to align their research activities with participants’ interests in, and expectations for, biorepository research.

In first-generation biorepository research, many participants were recruited by a known investigator for the purposes of advancing a specific, usually disease-related, form of research (2). Although secondary users of biorepository data might be at a physical or temporal remove from subject recruitment, the specific intentions of participants could be readily communicated by the originating investigator(s) or inferred from the types of data contributed to the resource and/or terms of consent. Next-generation biorepository research, by contrast, relies increasingly on amalgamating diverse data sources from participants recruited to independent and often geographically dispersed biorepositories, for a broad range of potential research uses (5). Even though wide sharing maximizes the range of scientific questions that can be addressed, these new research arrangements increase the distance between investigators and participants, interfering with researcher accountability (20). Explicit research oversight can help scientists keep participants’ objectives and interests firmly in view as they pursue projects long after the period of initial recruitment.

Although investigators should view ongoing oversight as ultimately beneficial to the successful pursuit of biorepository research, we concede that current regulatory and institutional review procedures are not well suited to next-generation approaches. Reform is urgently needed to support harmonization across institutions (achieved currently on a more ad hoc basis by a patchwork of cooperative understandings and Data Use Agreements) and streamline review in cases in which demonstrably similar research activities or questions are being pursued. Reform will help clarify researcher responsibilities while ensuring that the rights and interests of biorepository participants are recognized and actively promoted.


Ongoing research review also provides an important opportunity to revisit the nature of informed consent for biorepository participation. As noted above, many forms of first-generation biorepository research were directed toward the investigation of specific diseases, with consent worded to reflect the anticipated research use (21). With the expansion of biorepository-related data sharing, and a desire to use stored specimens and data to address an increasingly wide range of secondary research questions, there has been a move to broaden the terms of use so that a single consent agreement will allow any secondary investigation consistent with the stored data. Although enormously convenient for investigators, such “blanket” consent approaches have been critiqued as providing an inadequate (i.e., effectively uninformed) basis for research participation (22, 23). Many individuals, including those from communities that have had bad prior experiences with biomedical research, will choose not to participate on such a basis, limiting the translational promise of next-generation approaches.

A range of alternative consent models has been proposed as a result of these issues. At one extreme have been arguments for consent processes that allow for the broadest possible scope of research use and data sharing but require enrollees to demonstrate awareness of genetic principles and the potential risks of biorepository participation (19). Others favor promoting autonomy with tiered consent, so that participants indicate specific preferences for their personal data, either with respect to the types of research use or with regard to the ways that data may be shared with third parties (10, 23, 24). Finally, others have called for a reconceptualization of consent as an ongoing process, in which researchers remain actively engaged in communicating with participants (7, 25). In the last model, it is not clear if ongoing interactions would necessarily require reconsent (26); instead, upfront agreement of participants to an explicit form of research oversight (including, ideally, direct participant representation) might be combined with timely notification when a review identifies potential new research risks.
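To illustrate how tiered consent might be recorded and checked before data are released, the following is a minimal Python sketch. It is an assumption-laden illustration, not a model proposed in the cited work: the `TieredConsent` structure, the two tiers it records (type of research use and scope of sharing), and the category names used in the usage note are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieredConsent:
    """One participant's recorded preferences (category names are illustrative)."""
    research_uses: frozenset   # research types the participant has agreed to
    sharing_scopes: frozenset  # kinds of recipients the data may be shared with


def permits(consent, research_use, sharing_scope):
    """A request is allowed only if both tiers cover it."""
    return (research_use in consent.research_uses
            and sharing_scope in consent.sharing_scopes)


def eligible_codes(consents, research_use, sharing_scope):
    """Filter a {study_code: TieredConsent} mapping down to permitted codes."""
    return sorted(code for code, consent in consents.items()
                  if permits(consent, research_use, sharing_scope))
```

A data use committee could, for example, call `eligible_codes(consents, "diabetes", "commercial")` to find which coded records a proposed study may draw on. A real scheme would also need to handle preferences that change over time, which is where the ongoing-process model of consent comes in.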

Each alternative has advantages and disadvantages. Nevertheless, we believe that approaches to consent that allow for the ongoing involvement of research subjects will be key to both the recruitment and long-term retention of biorepository participants. A high priority for next-generation biorepository research will be developing new methods for recontacting participants, including creative use of electronic and other modes of communication (27).


Stewardship, or the careful and responsible management of something entrusted to one’s care, is central to the pursuit of research that uses stored specimens and data. Stewardship typically implies that everyone in the research workflow has a responsibility to protect human subjects’ interests and well-being to the best of his or her ability (28). In most first-generation biorepository research, the burden of stewardship fell to the originating investigator or institution and was achieved by faithfulness to the terms of informed consent and the adoption of data protections like anonymization. However, with the retention of identifying information, an expectation of ongoing oversight coordinated across independent institutions, and the need to maintain communication with participants in light of the open-ended nature of the research commitment, next-generation biorepository research entails far greater demands for stewardship and researcher accountability. These responsibilities may include taking due care with the analysis and sharing of confidential genetic and linked health information, the adoption of research goals consistent with the intentions of participants, and the avoidance of forms of dissemination (publications and similar) that promote harmful or derogatory conclusions about certain populations or groups (29).

Although recognizing these expectations is essential, formally meeting the stewardship requirements of next-generation research rests critically on the adoption of defined research governance mechanisms that are effective yet flexible enough to respond to dynamic scientific, technical, and policy developments (30). Terms of data access and use must be elaborated, and governance boards or data use committees [which preferably include participant representatives (31)] established to vet proposed research uses and oversee plans for data sharing and research dissemination. Some biorepositories, such as those run by advocacy groups like PXE International (32), which promotes research about the genetic disorder pseudoxanthoma elasticum, and the Autism Genetic Research Exchange (33), review all data use applications through the lens of how likely the proposed research is to advance the science and bring society one step closer to a cure or other treatment options. These groups require data sharing and dissemination plans to facilitate accountability and speed communication through both research and practice communities. Those undertaking larger-scale efforts will no doubt find it more challenging to define the common good and target research priorities accordingly, but the increased effort will be rewarded both by heightened trust in the research enterprise (34) and by an enhanced potential to contribute to near-term translational benefits.


It is time to acknowledge that first-generation technical and regulatory solutions are not up to the task of addressing the ethical and scientific challenges of next-generation biorepository research. Data anonymization is no longer achievable as a matter of practice, and greater attention to research review, informed consent, and ongoing stewardship of repository samples and data is urgently required instead. The pursuit of effective translational research rests as much on our willingness to meet our responsibilities to research participants as on our ability to advance cutting-edge methodology and analytical innovation. These priorities need not—and indeed cannot—be mutually exclusive.


  • Citation: S. M. Fullerton, N. R. Anderson, G. Guzauskas, D. Freeman, K. Fryer-Edwards, Meeting the governance challenges of next-generation biorepository research. Sci. Transl. Med. 2, 15cm3 (2010).

References and Notes

  1. This work was supported by a grant from the University of Washington (UW) Institute of Translational Health Sciences (National Center for Research Resources, UL1 RR 025014, KL2 RR 025015, and TL1 RR 025016). Additional support was provided by the UW Center for Ecogenetics and Environmental Health (National Institute of Environmental Health Sciences, P30 ES07033), the UW Center for Genomics and Healthcare Equality (National Human Genome Research Institute, P50 HG003374), and the Northwest Institute of Genetic Medicine (Washington State Life Sciences Discovery Fund). The views presented in this Commentary are those of the authors and do not necessarily reflect the views of these funding agencies.
