An Integrated Approach to Oversight Assessment for Emerging Technologies
Abstract
Analysis of oversight systems is often conducted from a single disciplinary perspective, using a limited set of criteria for evaluation. In this article, we develop an approach that blends risk analysis, social science, public administration, legal, public policy, and ethical perspectives to develop a broad set of criteria for assessing oversight systems. Multiple methods, including historical analysis, expert elicitation, and behavioral consensus, were employed to develop multidisciplinary criteria for evaluating oversight of emerging technologies. Sixty-six initial criteria were identified from extensive literature reviews and input from our Working Group. Criteria were placed in four categories reflecting the development, attributes, evolution, and outcomes of oversight systems. Expert elicitation, consensus methods, and multidisciplinary review of the literature were used to refine a condensed, operative set of criteria. Twenty-eight criteria resulted, spanning four categories: seven development criteria, 15 attribute criteria, five outcome criteria, and one evolution criterion. These criteria illuminate how oversight systems develop, operate, change, and affect society. We term our approach “integrated oversight assessment” and propose its use as a tool for analyzing relationships among features, outcomes, and tradeoffs of oversight systems. Comparisons among historical case studies of oversight using a consistent set of criteria should result in defensible and evidence-supported lessons to guide the development of oversight systems for emerging technologies, such as nanotechnology.
1. INTRODUCTION
U.S. approaches to oversight of research and technology have developed over time in an effort to ensure safety for humans, animals, and the environment; to control use in a social context; and, on occasion, to promote innovation. In modern times, regulatory and oversight tools have evolved to include diverse approaches such as performance standards, tradable allowances, consultations between government and industry, and premarket safety and efficacy reviews (Wiener, 2004; Davies, 2007). The decision whether to impose an oversight system, the oversight elements, the level of oversight (e.g., federal, state, local), the choice of approach (e.g., mandatory or voluntary), and its execution can profoundly affect technological development, individual and collective interests, and public trust and attitudes toward technological products (Rabino, 1994; Zechendorf, 1994; Siegrist, 2000; Cobb & Macoubrie, 2004; Macoubrie, 2005, 2006). Oversight is conducted by a range of institutions with various capabilities, cultures, and motives (e.g., Abraham, 2002). Avenues for disputing oversight decisions are also important, and some argue that the United States operates in an adversarial regulatory culture in which Congress, the media, and stakeholders regularly contest the decisions of federal agencies (Jasanoff, 1990).
Oversight for a new and emerging technology, nanotechnology, has recently been the subject of debate and analysis. Nanotechnology involves an enabling set of products, methods, and tools to conduct science, perform tasks or actions, and generate products at very small scales. Scholars and organizations are currently debating how nanotechnology and its applications should be overseen, even as products are rapidly entering the marketplace (Davies, 2006, 2007; Kuzma, 2006; Taylor, 2006; PEN, 2007). Nanotechnology has been defined as the “understanding and control of matter at dimensions of roughly 1 to 100 nanometers, where unique phenomena enable novel applications” (NNI, 2007). It has the potential to advance medicine, agriculture, health, and environmental science and provide great benefits to society (Lane & Kalil, 2005). However, there may be safety concerns related to the special properties of nanoparticles, such as their greater abilities to penetrate and translocate (i.e., move across membranes, organs, and cells) in biological systems (reviewed in Maynard, 2006). To the extent that there is governmental oversight as yet, the U.S. oversight system for nanotechnology currently relies on agencies and regulations that have provided oversight for related technologies, products, or functions, but may not be equipped to adequately handle the novel properties of, and unique challenges that may be associated with, certain nanoproducts (Davies, 2006, 2007; Kuzma, 2006; Taylor, 2006). Recent studies indicate that while the public is excited about nanotechnology and its potential benefits, there is concern about who is developing and promoting the technology, who will assess and manage the potential risks, who will be responsible for monitoring products after they enter the marketplace, and who will be liable for potential problems (Cobb & Macoubrie, 2004; Macoubrie, 2005, 2006; Pidgeon, 2006). The public is also concerned about long-term health and environmental effects and has cited negative experiences with past technologies as a reason to be cautious with nanotechnology (Macoubrie, 2005). It has been suggested that proper oversight systems can lead to greater public confidence, as well as increased technological innovation and success (Porter, 1991; Porter & van der Linde, 1995; Jaffe & Palmer, 1997).
Public confidence in and perceptions of risk from technological products are affected by more than just the quantitative level of risk. People's attitudes are influenced by several factors, such as whether the risk is voluntary or involuntary, natural or man-made, controllable or uncontrollable, and familiar or unfamiliar (Rasmussen, 1981; Slovic, 1987). Emerging technologies that are unfamiliar to people and to which they are involuntarily exposed often fall into a category of high “dread” in risk perception studies (Slovic, 1987; Siegrist et al., 2007a). Public perception of nanotechnology in particular—including trust, perceived benefits, and general attitudes—is dependent on the specific application or product (Siegrist et al., 2007b). Current and potential products of nanotechnology are extremely diverse (PEN, 2007), and oversight systems will need to respond to this diversity. Such systems will need not only to review scientific information about human health and environmental risks of diverse products while promoting innovation and research, but also to be seen as legitimate, trustworthy, and able to handle many types of applications.
In order to systematically analyze oversight systems, make comparisons, and glean lessons for oversight of new and emerging technologies such as nanobiotechnology, we developed a broad set of multidisciplinary criteria that could be applied to characterize and evaluate any oversight system for technological products and applications. We divided these criteria into categories to explore the development, attributes, outcomes, and evolution of oversight systems. This article reviews the literature on which the criteria were based, describes the methodology for developing the criteria, and discusses the features of the set. It also describes the role of criteria-based oversight assessment within a broader approach, which we term “integrated oversight assessment” (IOA). Future work and publications will utilize the criteria and IOA to compare historical case studies of oversight, so that defensible and evidence-supported lessons to guide the development of oversight systems for emerging technologies, such as nanotechnology, can be identified.
2. OVERSIGHT ASSESSMENT APPROACHES
2.1. Previous Approaches
Analysis of oversight systems has historically been conducted through one or a few perspectives using a small set of criteria often focused on a particular discipline (e.g., U.S. EPA, 1983; OTA, 1995; Davies, 2007). Yet, oversight affects multiple stakeholders with various viewpoints, values, and concerns and should pass muster from policy, legal, economic, ethical, and scientific perspectives. Some stakeholders are most concerned about the economic impacts or job opportunities gained or lost through technological adoption. Others care primarily about the health and environmental impacts of new products. Most consumers value parameters that affect their daily life, such as improved health, lower costs, better local environments, convenience, and quality. Government regulators often focus on health risks, costs, and benefits (U.S. EPA, 1983; White House, 1993, as amended, 2007). From a global perspective, there are emerging concerns that risks and benefits of technological products be fairly distributed within and among nations (Singer et al., 2005). From an ethical perspective, evaluation of emerging technologies may raise issues of conflict with moral principles or values and questions of whether the oversight process respects them (Walters, 2004). Although not every group or individual viewpoint can be accommodated in an oversight system, in a democracy such as the United States, an oversight system should respond to a range of viewpoints, values, and concerns (Jasanoff, 1990; Wilsdon & Willis, 2004; MacNaghten et al., 2005).
Some groups are making progress in integrating criteria for analysis of oversight frameworks in systematic ways. For example, the Fast Environmental Regulatory Tool (FERET) was designed to evaluate regulatory options via a computerized template to “structure the basic integration of impacts and valuations, provide a core survey of the literature, incorporate uncertainty through simulation methods, and deliver a benefit-cost analysis that reports quantitative impacts, economic values, and qualitative elements” (Farrow et al., 2001, p. 430). FERET addresses distributional issues of who bears the costs and who receives the benefits of oversight, and uses sophisticated modeling techniques to incorporate uncertainty. However, it does not account for other important attributes of oversight systems, for example, those that affect public confidence, legitimacy, developer satisfaction, and technology development.
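To make this general style of analysis concrete, a minimal sketch of simulation-based benefit-cost analysis follows. It is not FERET itself or its template; the distributions and dollar figures are invented purely to illustrate how uncertainty can be carried through to a benefit-cost result.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 10_000  # Monte Carlo draws

# Hypothetical regulatory option: annual compliance costs and monetized
# benefits are uncertain, so each is modeled as a distribution rather
# than a point estimate (all figures are illustrative, not from FERET).
costs = rng.lognormal(mean=np.log(40e6), sigma=0.3, size=N)     # $/year
benefits = rng.lognormal(mean=np.log(55e6), sigma=0.5, size=N)  # $/year

net_benefit = benefits - costs
print(f"Mean net benefit: ${net_benefit.mean() / 1e6:.1f}M/year")
print(f"P(net benefit > 0): {(net_benefit > 0).mean():.2f}")
print(f"90% interval: ${np.percentile(net_benefit, 5) / 1e6:.1f}M to "
      f"${np.percentile(net_benefit, 95) / 1e6:.1f}M")
```

Reporting a probability of positive net benefits and an interval, rather than a single number, is what simulation adds to a deterministic benefit-cost calculation.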
In contrast, a more qualitative oversight evaluation method is used in an article on the deliberations of the Consortium to Examine Clinical Research Ethics (Emanuel et al., 2004). This diverse expert group used qualitative and normative approaches to identify 15 problems in the oversight of research involving human participants. It then sorted them into three categories—structural, procedural, and performance assessment problems—and evaluated whether proposed reforms would address those challenges. Identified problems in the oversight of human subjects research included the ability of the oversight system to be consistent and flexible, manage conflicts of interest, and provide for adequate education of participants (Emanuel et al., 2004). FERET's highly quantitative model at one extreme, and the Consortium's qualitative expert group consensus model at the other, show a range of approaches to evaluating oversight. However, the recent literature examining emerging technologies establishes that technology governance requires collaboration among scientists, government, and the public (Wiek et al., 2007). This suggests that oversight assessment should use a broad range of criteria that addresses the concerns of multiple stakeholders.
2.2. Basis of Our Approach
The goal of this study was to develop a multidisciplinary approach to more comprehensively evaluate oversight systems for emerging technologies. In our work, we define “oversight” broadly, as “watchful or responsible care” that can include regulatory supervision, or nonregulatory and voluntary approaches or systems (Kuzma, 2006). Our IOA approach is based in part upon multicriteria decision analysis (MCDA). MCDA relies on the notion that no single outcome metric can capture the appropriateness or effectiveness of a system, allows for integrating heterogeneous information, and enables incorporation of expert and stakeholder judgments (reviewed in Belton & Stewart, 2002). MCDA refers to a range of approaches in which multiple criteria are developed, ranked, and used to compare alternatives for decision making. General categories of criteria have been described, such as utility-based criteria (focusing on cost, risk-benefit comparison, and outcomes), rights-based criteria (focusing on whether people have consented to risk and their rights are being respected), and best available technology-based criteria (focusing on using the best technologies available to reduce risk to the extent possible) (Morgan & Henrion, 1990a). MCDA can be descriptively useful to better understand systems, stakeholder and expert views, and multiple perspectives on decisions. However, its normative utility is limited because its ability to predict or recommend the best decision or approach is unclear (Morgan & Henrion, 1990a).
MCDA has been used recently to evaluate strategies for risk management (e.g., remediating environmental hazards such as oil spills) (Linkov et al., 2006, 2007a). An MCDA approach was recently used to evaluate oversight approaches to three hypothetical nanomaterials by eliciting criteria and weightings from scientists and managers (Linkov et al., 2007b). Criteria used were health and ecological effects, societal importance, and stakeholder preference, and these were weighted according to their importance. However, since the products were hypothetical, the criteria were broad and few, and the authors of the study ranked the criteria themselves, the results were limited to demonstrating how MCDA could be applied to decisions about the environmental health and safety of nanomaterials. To our knowledge, MCDA has neither been applied to broader oversight policy questions for emerging technologies nor incorporated comprehensive sets of criteria that address intrinsic values, rights, and fairness, as well as utilitarian outcomes.
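For readers unfamiliar with MCDA mechanics, the sketch below shows the simplest additive form: criteria are weighted, alternatives are scored on each criterion, and a weighted sum is compared across alternatives. The criterion names echo those in Linkov et al. (2007b), but every weight and score is a hypothetical placeholder rather than a value from that study.

```python
# Simple additive MCDA: weights sum to 1; alternatives are scored 0-100
# on each criterion. All names, weights, and scores are hypothetical.
weights = {"health_effects": 0.4, "ecological_effects": 0.3,
           "societal_importance": 0.2, "stakeholder_preference": 0.1}

alternatives = {
    "voluntary_program": {"health_effects": 55, "ecological_effects": 60,
                          "societal_importance": 70,
                          "stakeholder_preference": 40},
    "mandatory_premarket_review": {"health_effects": 85,
                                   "ecological_effects": 75,
                                   "societal_importance": 60,
                                   "stakeholder_preference": 70},
}

def weighted_score(scores: dict, w: dict) -> float:
    """Aggregate criterion scores into a single value via a weighted sum."""
    return sum(w[c] * scores[c] for c in w)

for name, scores in alternatives.items():
    print(f"{name}: {weighted_score(scores, weights):.1f}")
```

In practice, MCDA implementations differ mainly in how weights are elicited and how scores are normalized and aggregated; the additive model is only the most common starting point.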
Expanding on the general framework of MCDA, we employ several methods to devise criteria for evaluating and describing oversight systems, including review of the relevant literature, historical analysis, group consensus, and quantitative expert and stakeholder elicitation (Fig. 1). The fields of public policy (including risk policy), social science, law, and ethics were considered to develop several types of criteria, including those relating to economics, social science, safety, values, and impacts on technology research and development. In our IOA approach, we consider viewpoints of different actors and stakeholders; the role of various programs, policies, and decisions; and diverse types of impacts. The breadth of our approach, resulting from the integration of multiple disciplines, literatures, and methodologies, makes it challenging to implement. It is a more comprehensive approach than many others, but it is not exhaustive in nature. We acknowledge that it has limitations that stem in part from its reliance on literature analysis and the views of experts and stakeholders. Broad citizen surveys or public engagement exercises were not directly included. However, public concerns and attitudes are represented in the literature we used to develop and refine the criteria. Despite these limitations, the multiple disciplines and methods employed in IOA make it a unique approach that is well equipped to understand many dimensions of oversight systems, depict the complexity of oversight, and aid in the design and implementation of more viable and robust systems. Our overall methodology is depicted in Fig. 1 and described in more detail in the following sections.

Fig. 1. Integrated oversight assessment (IOA) methodology. The IOA approach combines multicriteria decision analysis, quantitative and qualitative analysis, and historical literature analysis, as described in this article.
3. DEVELOPING AND CATEGORIZING CRITERIA FOR OVERSIGHT ASSESSMENT
We developed our criteria through a multistage process by drawing upon the literature, conducting historical analysis, and using stakeholder and expert elicitation and consensus. In order to represent multiple disciplines in our assessment of oversight systems, we were inclusive in choosing criteria for initial consideration. Criteria characterized the development, attributes, evolution, and outcomes of the oversight system. At this stage, we use the term “criteria” broadly as descriptive or evaluative and do not assume which criteria are able to predict good oversight or outcomes that a majority would believe to be positive (e.g., positive environmental impacts). However, because we were guided in our choice of criteria by what experts, stakeholders, and citizens value, many are likely to prove to be normatively important for assessing whether oversight is effective or appropriate. Future work and publications comparing across six historical case studies (gene therapy, genetically engineered organisms in the food supply, human drugs, medical devices, chemicals in the workplace, and chemicals in the environment) should allow us to identify what criteria are predictive of successful and appropriate oversight (Fig. 1). Judging which criteria fall into which category is not straightforward at this point, as we do not know which independent or descriptive variables will most positively impact the dependent or evaluative ones (e.g., outcome criteria) until the criteria are deployed across the case studies (see Section 5).
Initially, 66 criteria were identified from supporting literature (Table I and Appendix A).1 Searches were conducted using a variety of databases and resources2 to rigorously research the legal, ethics, and public policy literature regarding oversight, including materials on criteria utilized in oversight analysis. We strove to set up a methodology that would be conducive to generating and testing hypotheses about oversight systems. In order to probe relationships between features of oversight systems and their important outcomes in future work (i.e., in application of the criteria to historical case studies, see Fig. 1), we categorized criteria into four groups—those associated with the initial development of the system (e.g., establishment of policies, procedures, or regulations); the attributes of the system (e.g., how the system operates for particular processes or decisions); the outcomes of the system (e.g., social, economic, cultural, health, environmental, and consumer impacts); and the evolution of the system (e.g., changes to the development, attributes, or outcomes over time). We suspect that criteria within and among categories interrelate and affect each other. For example, we have hypothesized that the way in which the oversight mechanism develops and its attributes are related to outcomes such as public confidence, health and environmental impacts, and economic effects on industry and stakeholders. Outcomes then spur further development and change attributes over time as the system evolves. Below, we discuss some of the supporting literature and the resulting criteria, also listed in Table I.
Table I. Initial criteria for oversight assessment, with guiding questions and supporting literature
No. | Criterion | Guiding Question | Supporting Literature |
---|---|---|---|
Development | |||
d1 | Impetus | What were the driving forces? | Davies, 2007 |
d2 | Clarity of technological subject matter | Are the technologies, processes, and products to be overseen well defined? | Davies, 2007 |
d3 | Legal grounding | How explicit are the statutes or rules on which the oversight framework is based? Is it clear that the decisionmakers in the framework have legal authority for the actions they proposed at the time? Is there grounding in existing laws? | Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Porter, 1991; OTA, 1995; Frewer et al., 1996, 2004; NRC, 1996; Jaffe & Palmer, 1997; Siegrist, 2000; Ogus, 2002; Cobb & Macoubrie, 2004; Macoubrie, 2005; Davies, 2007; Siegrist et al., 2007a, 2007b |
d4 | Federal authority | How strong was the authority for federal actors? | Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Siegrist, 2000; Cobb & Macoubrie, 2004; Macoubrie, 2005; Davies, 2007; Siegrist et al., 2007a, 2007b |
d5 | Industry authority | How strong was the authority for industry actors? | Einsiedel & Goldenberg, 2004; Stewart & McLean, 2004; Thompson, 2007 |
d6 | Loci | How many loci of authority (e.g., industry, government, nonprofit, developers, scientists, clinicians) for oversight were included in the development stages? | Davies, 2007 |
d7 | Stakeholder input | Was there a process or opportunities for stakeholder contribution to discussions or decisions about what the system is based on, how it operates, or how it is structured? | Jasanoff, 1990; Einsiedel & Goldenberg, 2004; Stewart & McLean, 2004; Thompson, 2007 |
d8 | Breadth of input | To what extent were groups of citizens and stakeholders from all sectors of society encouraged to provide input to decisionmakers who devised the oversight framework? Were some groups missing? | Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Siegrist, 2000; Cobb & Macoubrie, 2004; Einsiedel & Goldenberg, 2004; Stewart & McLean, 2004; Macoubrie, 2005; Siegrist et al., 2007a, 2007b; Thompson, 2007 |
d9 | Opportunity for value discussions | How many and what kinds of opportunities did stakeholders and citizens have to bring up concerns about values or nontechnical impacts? | Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Beauchamp & Walters, 1999; Siegrist, 2000; Cobb & Macoubrie, 2004; Einsiedel & Goldenberg, 2004; Emanuel et al., 2004; Stewart & McLean, 2004; Macoubrie, 2005; Siegrist et al., 2007b; Thompson, 2007 |
d10 | Transparency | Were options that the agencies or other decision-making bodies were considering known to the public? Were studies about the pros and cons of these options available? | Fischoff et al., 1981; U.S. EPA, 1983; Slovic, 1987; Jasanoff, 1990, 2005; White House, 1993, as amended, 2007; Frewer et al., 1996, 2004; NRC, 1996; Siegrist, 2000; Cobb & Macoubrie, 2004; Macoubrie, 2005; Siegrist et al., 2007a, 2007b |
d11 | Financial resources | How sufficient were the funds provided to the developers of the framework? | Emanuel et al., 2004; OTA, 1995; Davies, 2007 |
d12 | Personnel education and training | How trained or educated were actors during the development stage of oversight? | Emanuel et al., 2004; Davies, 2007 |
d13 | Empirical basis | To what extent was scientific or other objective evidence used in designing the review or oversight process central to the framework? | Davies, 2007 |
Attributes | |||
a1 | Legal grounding | How explicit are the statutes or rules on which specific decisions within the oversight framework are based? Is it clear that the decisionmakers in the framework have legal authority for the actions they propose? | Fischoff et al., 1981; Slovic, 1987; Greene, 1990; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Cole & Grossman, 1999; Siegrist, 2000; Cobb & Macoubrie, 2004; Macoubrie, 2005; Davies, 2007; Siegrist et al., 2007a, 2007b |
a2 | Data requirement | How comprehensive are the safety and other studies required for submittal to authorities? If the system is voluntary, how comprehensive are the data that are generated and available for review prior to decisions about release or approval? | Fischoff et al., 1981; U.S. EPA, 1983; Slovic, 1987; Greene, 1990; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Cole & Grossman, 1999; Siegrist, 2000; Cobb & Macoubrie, 2004; Macoubrie, 2005; Davies, 2007; Siegrist et al., 2007a, 2007b |
a3 | Treatment of uncertainty | Is uncertainty accounted for qualitatively or quantitatively in data and study submissions? | Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Siegrist, 2000; Cobb & Macoubrie, 2004; Macoubrie, 2005; Davies, 2007; Siegrist et al., 2007a, 2007b |
a4 | Stringency of system | Is the system mandatory or voluntary? | Fischoff et al., 1981; Slovic, 1987; Greene, 1990; Jasanoff, 1990, 2005; Porter, 1991; OTA, 1995; Frewer et al., 1996, 2004; NRC, 1996; Jaffe & Palmer, 1997; Cole & Grossman, 1999; Siegrist, 2000; Ogus, 2002; Cobb & Macoubrie, 2004; Macoubrie, 2005; Davies, 2007; Siegrist et al., 2007a, 2007b |
a5 | Empirical basis | To what extent is scientific or other objective evidence used in making decisions about specific products, processes, or trials? | Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Siegrist, 2000; Cobb & Macoubrie, 2004; Macoubrie, 2005; Davies, 2007; Siegrist et al., 2007a, 2007b |
a6 | Compliance and enforcement | To what extent does the system ensure compliance with legal and other requirements and to what extent can it prosecute or penalize noncompliers? | Fischoff et al., 1981; Slovic, 1987; Greene, 1990; Jasanoff, 1990, 2005; OTA, 1995; Cole & Grossman, 1999; Frewer et al., 1996, 2004; NRC, 1996; Siegrist, 2000; Cobb & Macoubrie, 2004; Macoubrie, 2005; Davies, 2007; Siegrist et al., 2007a, 2007b |
a7 | Incentives | Are the stakeholders in the system encouraged to abide by the requirements of the system? | Davies, 2007 |
a8 | Treatment of intellectual property and proprietary information | How are confidential business information, trade secrets, and intellectual property treated in applications for approval? | NRC, 2000; PIFB, 2003a; Einsiedel & Goldenberg, 2004; Stewart & McLean, 2004; Davies, 2007; Thompson, 2007 |
a9 | Institutional structure | How many agencies or entities with legal authority are involved in the process of decision making within the framework? | Porter, 1991; OTA, 1995; Jaffe & Palmer, 1997; Ogus, 2002; Davies, 2007 |
a10 | Feedback loop | Can something discovered in later phases of product review be used to improve early review stages for the same, modified, or other products/trials in the future? | OTA, 1995; Emanuel et al., 2004; Davies, 2007 |
a11 | Formal assessment | Is there a regular mechanism for internal or contracted studies on the quality of the system? | Emanuel et al., 2004 |
a12 | Postmarket monitoring | Is there a science-based and systematic process for detecting risks and benefits after commercial release, field trials, or clinical trials? | Emanuel et al., 2004; Davies, 2007 |
a13 | Industry navigation | How easily can small, medium, and large companies navigate the oversight system? | U.S. EPA, 1983 |
a14 | Actors involved | Is there a wide range of perspectives and expertise involved in decision making for projects, trials, or processes? | Davies, 2007 |
a15 | Flexibility | Can products or trials undergo expedited review when appropriate? Can products or trials be easily stopped when information on potential risks is presented? | OTA, 1995 |
a16 | Capacity | Is the system well prepared and equipped to deal with the approvals of trials, products, or processes? | OTA, 1995; Davies, 2007 |
a17 | Relationship among actors | Are the various actors confrontational toward each other in interactions? Are there efforts to work together to understand differences? | Davies, 2007 |
a18 | Stakeholder input | Is there a process or opportunities for stakeholders to contribute to discussions or decisions about whether certain products, processes, or trials should be approved? | Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Siegrist, 2000; Cobb & Macoubrie, 2004; Einsiedel & Goldenberg, 2004; Stewart & McLean, 2004; Macoubrie, 2005; Siegrist et al., 2007a, 2007b; Thompson, 2007 |
a19 | Breadth of input | To what extent are groups of citizens and stakeholders from all sectors of society encouraged to provide input to decisionmakers about specific actions or approvals? Are some groups “heard” more than others? | Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Siegrist, 2000; Cobb & Macoubrie, 2004; Einsiedel & Goldenberg, 2004; Stewart & McLean, 2004; Macoubrie, 2005; Thompson, 2007; Siegrist et al., 2007a, 2007b |
a20 | Opportunities for value discussion | How many and what kinds of opportunities do stakeholders and citizens have to bring up value concerns? | Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Beauchamp & Walters, 1999; Siegrist, 2000; Cobb & Macoubrie, 2004; Einsiedel & Goldenberg, 2004; Emanuel et al., 2004; Stewart & McLean, 2004; Macoubrie, 2005; Thompson, 2007; Siegrist et al., 2007a, 2007b |
a21 | Consideration of fairness | Is the system attentive to the distribution of costs/risks and benefits? Is the system attentive to equal opportunities for all to benefit from its decisions? | Jasanoff, 1990, 2005; OTA, 1995; Beauchamp & Walters, 1999 |
a22 | Transparency | Are options that agencies or other decision-making bodies are considering known to the public? Are studies about the pros and cons of these options available? Is the process for how decisions are made clearly articulated to interested parties? | Fischoff et al., 1981; U.S. EPA, 1983; Slovic, 1987; Jasanoff, 1990, 2005; White House, 1993, as amended, 2007; Frewer et al., 1996, 2004; NRC, 1996, 2000; Beauchamp & Walters, 1999; Siegrist, 2000; PIFB, 2003a; Cobb & Macoubrie, 2004; Macoubrie, 2005; Davies, 2007; Siegrist et al., 2007a, 2007b |
a23 | Conflict of interest | Do independent experts conduct or review safety studies? Are conflicts of interest disclosed routinely? | Einsiedel & Goldenberg, 2004; Emanuel et al., 2004; Stewart & McLean, 2004; Davies, 2007; Thompson, 2007 |
a24 | Conflict of views | How are conflicting views handled in the review of products, processes, and trials? | |
a25 | Economic costs and benefits considered | What role does cost-benefit analysis play in approvals? | U.S. EPA, 1983; OTA, 1995 |
a26 | Accountability and liability | Is there a fair and just system for addressing product or trial failures with appropriate compensation to affected parties and/or environmental remediation? | Davies, 2007 |
a27 | Education of decisionmakers, stakeholders | To what extent does the system make efforts to educate the interested and affected parties, as well as the decisionmakers? | Emanuel et al., 2004; Davies, 2007 |
a28 | Informed consent | To what extent does the system supply the amount and type of information so that people can make informed decisions about what they will accept? | Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Frewer et al., 1996, 2004; NRC, 1996; Beauchamp & Walters, 1999; Siegrist, 2000; Cobb & Macoubrie, 2004; Macoubrie, 2005; Siegrist et al., 2007a, 2007b |
a29 | International harmonization | How well does the system match up with other systems around the world? | Newell, 2003 |
Evolution | |||
e1 | Extent of change | To what extent has the system changed over time? | OTA, 1995 |
e2 | Distinguishable periods of change | Can separable periods in the oversight history be distinguished? | |
e3 | Extent of change in attributes | To what extent have the system's attributes changed over time? | OTA, 1995 |
e4 | Change in stakeholder satisfaction | To what extent have stakeholder opinions changed during the evolution of the system? | |
e5 | Public confidence | To what extent have public opinions changed during the evolution of the system? | OTA, 1995 |
Outcomes | |||
o1 | Product safety | What is the number of adverse reports compared to the number of approvals? | OTA, 1995 |
o2 | Time and costs for market approval | How long does it take and how much does it cost for approval? | U.S. EPA, 1983; OTA, 1995; Emanuel et al., 2004 |
o3 | Recalls | What is the number of recalls compared to the number of approvals? | OTA, 1995; Davies, 2007 |
o4 | Stakeholder satisfaction | How well do stakeholders and experts regard the system? | Beauchamp & Walters, 1999 |
o5 | Public confidence | What do the public or citizens think about the system? How about disadvantaged, special, or susceptible populations? | Porter, 1991; Rabino, 1994; OTA, 1995; Frewer et al., 1996; Siegrist, 2000; Macoubrie, 2005, 2006 |
o6 | Effects on social groups | Are the net effects of approvals positively affecting the vast majority of social groups? | Jasanoff, 2005 |
o7 | Cultural effects | Are the net effects of approvals positively affecting people and their cultures? | OTA, 1995; Beauchamp & Walters, 1999; Jasanoff, 2005 |
o8 | Research impacts | Has the system enhanced and supported research either on environmental health and safety or in the development of products? | Porter, 1991; OTA, 1995; Jaffe & Palmer, 1997; Ogus, 2002 |
o9 | Innovation | Has the system led to more innovation in the field or stifled it? | U.S. EPA, 1983; Greene, 1990; Porter, 1991; OTA, 1995; Cole & Grossman, 1999; Rabino, 1994; Jaffe & Palmer, 1997; Siegrist, 2000; Ogus, 2002; Macoubrie, 2005, 2006 |
o10 | Health | Does the oversight system impact health in positive ways? | U.S. EPA, 1983; White House, 1993, as amended, 2007; Beauchamp & Walters, 1999 |
o11 | Distributional health impacts | Are the health impacts equitably distributed? Is there an inequitable impact on specific social or disadvantaged groups? | U.S. EPA, 1983; White House, 1993; OTA, 1995; Beauchamp & Walters, 1999 |
o12 | Environmental impacts | Does the oversight system impact the environment in positive ways? | U.S. EPA, 1983; Greene, 1990; White House, 1993; OTA, 1995; Cole & Grossman, 1999 |
o13 | Nonindustry economic impacts | How does the system impact nonindustry stakeholder groups economically? | U.S. EPA, 1983; OTA, 1995; Beauchamp & Walters, 1999; Jasanoff, 2005 |
o14 | Effects on big corporations | How are big companies doing, financially and otherwise, as a result of the system? | U.S. EPA, 1983; Porter, 1991; Rabino, 1994; OTA, 1995; Jaffe & Palmer, 1997; Siegrist, 2000; Ogus, 2002; Newell, 2003; Macoubrie, 2005, 2006 |
o15 | Effects on small- to medium-sized enterprises (SMEs) | Are SMEs disadvantaged as a result of the oversight system? Are they suffering? | U.S. EPA, 1983; Porter, 1991; Rabino, 1994; OTA, 1995; Jaffe & Palmer, 1997; Siegrist, 2000; Ogus, 2002; Newell, 2003; Macoubrie, 2005, 2006 |
o16 | Economic development | Does the approval of the products or trials improve the overall economic situation of the nation? | U.S. EPA, 1983; Porter, 1991; Rabino, 1994; OTA, 1995; Jaffe & Palmer, 1997; Siegrist, 2000; Ogus, 2002; Macoubrie, 2005, 2006 |
o17 | Global competitiveness for the United States | Does the oversight system disadvantage the United States in the global marketplace? | U.S. EPA, 1983; OTA, 1995; Newell, 2003 |
o18 | Distributional economic impacts | Does the approval of the products or trials improve the economic situation of rural or developing world citizens? | U.S. EPA, 1983; Porter, 1991; OTA, 1995; Jaffe & Palmer, 1997; Ogus, 2002 |
o19 | Proposals for change | Have there been proposals for change resulting from the oversight system? | OTA, 1995 |
- Note: Examples of supporting literature are listed for most of the initial 66 criteria. References to authors as supporting literature reflect our interpretation of that literature. Although the authors of the literature referenced may not have explicitly stated that particular criterion for evaluation of an oversight system, their work supports the importance of that criterion.
Economic criteria for evaluating oversight systems are prominent in the literature. For example, in the United States, the federal government often formally evaluates oversight systems based on costs and benefits through regulatory impact assessment (RIA) and economic analyses (U.S. EPA, 1983, 2000). Oversight systems often originate with statutory systems that are then detailed and implemented by regulatory agencies through formal notice-and-comment rule-making. RIA and economic analyses focus on the benefits and costs of proposed rules and decisions made under those regulatory systems (U.S. EPA, 1983, 2000). Proposed rules often are key aspects or implementations of oversight systems. Thus, cost-effectiveness is embedded in several of our criteria, particularly those related to the attributes and outcomes of the system (Table I and Appendix A,3 e.g., a2, a13, a25, o2, o9, o13, o14, o15, o16, o17, o18). Executive Order 12,866 suggests somewhat broader criteria, requiring not only that every new regulation be subjected to a cost-benefit test, but also that the analysis include: (1) evaluation of the adverse side effects of regulations on health and the environment, (2) qualitative assessment of distributional impacts, and (3) assurances of transparency (White House, 1993, as amended, 2007). Based on these government documents, we included transparency in both the development and execution of oversight systems (d10, a22), the consideration of distributional health impacts (o11), and health and environmental impacts (o10, o12).
Other criteria for the analysis of oversight systems have been described as part of MCDA (Morgan & Henrion, 1990a; Linkov et al., 2006, 2007a, 2007b), but deal more broadly with the system as a whole as opposed to particular decisions (e.g., OTA, 1995; Davies, 2007). The Office of Technology Assessment used seven criteria to assess regulatory and nonregulatory environmental policy tools, and these appear in similar forms in our criteria list (OTA, 1995): cost-effectiveness and fairness (a21, a25, o2, o11, o12, o13, o14, o15, o18), minimal demands on government (d11, a9, a16, a25, o18), assurance to the public that environmental goals will be met (e5, o1, o5, o12), prevention of hazards and exposure when possible (a4, a6, d3), consideration of environmental equity and justice issues (a21, o7, o11, o12), adaptation to change (a15, a10, e1, e3, o19), and encouragement of technology innovation and diffusion (o1, o3, o8, o9, o14, o15, o16, o17, o18).
The criteria we chose also overlap with many of the criteria that Davies (2007) suggests for oversight of nanotechnology by the EPA: incentives for industry to do long-term testing (a1, a3, a4, a5, a7, a12, a26), monitoring capabilities (a10, a12), legal authority (d3, a1), empirical basis (d13, a5), resources for agencies (d4, d11, d12, a16, a27), clarity of materials to be regulated (d2, a1), recall authority (a4, a10, a12, o3), burden of proof on manufacturers (a2, a6, a26), data requirements (a2), prohibition of marketing (a1, a6), timely adverse events reporting (a10, a12), transparency in safety review (a8, a22), incentives for risk research (a2, a7, o8), proper institutional structures (d6, a9, a14), power relationships and impacts on oversight (a14, a17, a23), and political will (d1).
Our oversight criteria also address the fact that oversight systems can affect the competitiveness of the nation, particularly in the context of trade and World Trade Organization (WTO) agreements (Newell, 2003). Trade can be affected in positive or negative ways due to different standards or testing requirements. For example, U.S. grain exporters have lost hundreds of millions of dollars in trade with the European Union (EU) because U.S. varieties of genetically engineered food crops not approved in the EU are not segregated from other varieties in the United States (Paarlberg, 2002). These broader economic considerations and international harmonization of oversight were also included in our initial list of 66 criteria (a29, o14, o15, o17). Some analysts have hypothesized that mandatory regulatory systems with clear standards can foster innovation, ultimately improving the economic performance of firms (Porter, 1991; Jaffe & Palmer, 1997; Ogus, 2002). Yet, other studies indicate that regulations can decrease research productivity, particularly for smaller firms (Thomas, 1990). In our criteria, the legal grounding (d3), stringency of the system (a4), institutional structure (a9), economic impacts (o14, o15, o16, o18), and effects on research and innovation (o8, o9) were included to explore these relationships in our historical case studies. There is also evidence that mandatory systems lead to better attributes and outcomes with respect to compliance, innovation, and environmental impacts (e.g., Greene, 1990; Cole & Grossman, 1999). Criteria (a1, a2, a4, a6, o9, o12) were included to explore these relationships as well.
Several criteria relating to what oversight features citizens believe to be important were derived from the public engagement and risk perception literature. That literature shows that citizens may appreciate transparency (d10, a22), exercising rights to know and choose (a22, a28), opportunities for meaningful input not limited to the quantitative risk (d8, d9, a18, a19, a20), and mandatory requirements for safety testing and regulation (d3, d4, a1, a2, a3, a4, a5, a6) (Fischoff et al., 1981; Slovic, 1987; Jasanoff, 1990, 2005; Frewer et al., 1996; NRC, 1996; Siegrist, 2000; Cobb & Macoubrie, 2004; Frewer et al., 2004; Macoubrie, 2005; Siegrist et al., 2007a, 2007b). Rigorous oversight can foster consumer or public confidence and trust (o5) and ultimately the success of beneficial technologies (o9, o14, o15, o16) (Porter, 1991; Rabino, 1994; Siegrist, 2000; Macoubrie, 2005, 2006).
Other criteria were based on the social science, ethics, and science and technology studies literature. For example, impacts from oversight systems for genetically engineered organisms have been documented to include changes in industry structure, farmer relationships, and cultural systems (o6, o7, o13) (Jasanoff, 2005). Power relationships and trust are influenced by the treatment of intellectual property (a8), the involvement of industry in decisionmaking and safety testing (d5, a23), and whether there are opportunities for wider public input (d8, d7, d9, a18, a19, a20). These factors may affect the public legitimacy of decisions made about new technological products (e.g., Einsiedel & Goldenberg, 2004; Stewart & McLean, 2004; Thompson, 2007). Confidential business information (CBI) and treatment of intellectual property (a8) have affected the ability of stakeholders and scientists outside of industry to access information about technological products before, during, and after regulatory review; thus, we include transparency among the criteria (a22) (PIFB, 2003a; NRC, 2000). Transparency has been proposed as a precondition to public trust and confidence (o5), although it is not itself sufficient for trust (Frewer et al., 1996). Also, transparency and public consultation (d7–d9; a18–a20) enhance the credibility of oversight systems if input is considered carefully by decisionmakers and not ignored (e.g., a21) (Jasanoff, 1990). Cash et al. (2003) suggest that the salience, credibility, and legitimacy of information produced from systems for managing boundaries between knowledge creation and action (like oversight systems) are enhanced by communication, mediation, and translation among decisionmakers, the public, and stakeholders during decision making. Several of our criteria relate to their ideas, such as the inclusion of diverse stakeholders and opportunities for public input at key junctures in oversight systems (d7–d10, a14, a17–a22, a27–a28).
The ethics literature is reflected in principles of equity, justice, rights to know and choose, and beneficence or the minimization of harm (d9, a20, a21, a22, a28, o4, o7, o10, o11, o13) (Beauchamp & Walters, 1999). For clinical trial oversight systems, the abilities to address major ethical issues (d9, a20), do so in a timely manner (o2), manage conflicts of interest (a23), educate decisionmakers (d12, a27), provide sufficient resources (d11), report adverse events in a timely fashion (a12), and conduct formal assessments (a11) have been identified as key attributes (Emanuel et al., 2004).
Criteria within a category and among the four categories (i.e., development, attributes, evolution, and outcome) are not mutually exclusive. Given our approach to capture the evolution, operation, and adaptation of systems, there is some overlap in our list (Table I). For example, economic development outcomes (o16) cannot be separated from effects on large corporations (o14). Similarly, health (o10) and environmental impacts (o12) often cannot be fully distinguished (i.e., environment affects human health). A given criterion may be reflected in more than one category. For example, transparency appears both in the development and attributes category, reflecting its importance both in establishing oversight systems and in making particular decisions about products or applications of technologies (d10, a22).
4. EXPERT AND STAKEHOLDER ELICITATION FOR IDENTIFYING KEY CRITERIA
The 66 criteria described in the previous section were too numerous to be analytically tractable for future work on historical analysis of oversight for emerging technologies. Thus, we assembled a panel of experts and stakeholders as a Working Group to seek their input and consensus on criteria to be used in the six historical case studies (Fig. 1). The 12 Working Group members, by disciplinary background, expertise, and type of affiliation, respectively, included: cell biology, nanobiotechnology, academe; health policy, law, academe; medicine, biochemistry, small industry; business, food, large industry; applied economics, regulation, academe; environmental law, academe; regulatory policy, law, consumer organization; toxicology, public policy, nongovernmental organization (NGO); environmental policy, sociology, academe; science communication, sociology, academe; mechanical engineering, nanoparticles, academe; and engineering and public policy, environmental policy, academe. The Working Group agreed that it was necessary to refine the number of criteria to a manageable set.
The members of the Working Group all met several well-established conditions to qualify as “experts.” These conditions include substantive contributions to the scientific literature (Wolff et al., 1990), status in the scientific community, membership on editorial committees of key journals (Siegel et al., 1990; Evans et al., 1994), membership on advisory boards, and peer nomination (Hawkins & Evans, 1989). The Working Group provided a variety and balance of institutional perspectives. Some members represent stakeholder groups that are interested in or affected by historical models of oversight or nanotechnology oversight. Most members have had extensive experience with oversight systems and federal regulatory frameworks in one or more of the six areas and all have had some experience with them.
The Working Group was assembled, presented with the criteria list derived from the literature (Table I; Appendix A available online), and asked to arrive at consensus on what criteria were important for oversight assessment. The derivation of consensus among the panel members was approached from two complementary angles: behavioral and mathematical. Behavioral approaches generally rely on psychological factors and interactions among experts. Behavioral approaches are the dominant means of achieving consensus in bioethics, law, and public policy groups. In bioethics, for example, multidisciplinary dialogical consensus-building is standard, and federal and state committees and professional societies have long used this method to generate consensus. Moreno (2004) describes a consensus process that has worked successfully in addressing bioethical and policy problems in which members of the group approach the issues with openness, analyze the problem from a range of perspectives (usually ethical, legal, policy, scientific, and medical), articulate the arguments in favor of alternative positions, and work toward agreement. During the course of a two-day meeting with our Working Group, this process was used and aided substantially by our prior analysis of the literature and synthesis of candidate criteria.
Mathematical schemes designate a functional aggregation rule that accepts inputs from each expert and returns an arbitrated consensus (Winkler, 1968, 1986; Genest & Zidek, 1986). Expert elicitation has typically been used to estimate uncertain quantities (Morgan & Henrion, 1990b). There is not one “best” way to conduct an expert elicitation; however, attributes of good protocols include flexibility in approach, introduction of the expert to the general task of elicitation, focus on the subject matter to be judged, and good definition of the quantity (or in this case, the oversight criteria) that is to be elicited (Morgan & Henrion, 1990b). For our quantitative approach, we used a version of expert elicitation with the goal of gaining empirical information about what criteria are important for oversight assessment. We followed these principles by remaining flexible in incorporating feedback from the Working Group members up to the time of the elicitation; spending a day prior to the elicitation to give background on the subject matter (e.g., reviewing the six historical case studies of oversight and emerging issues in nanotechnology oversight); providing a primer on expert elicitation before the exercise; and defining each criterion with not only a description of what it is, but also an example interpretation of that criterion and a guiding question to help with the ranking of it (Appendix A).
Behaviorally derived agreements often suffer from problems of personality and group dynamics. Mathematical approaches avoid these problems but introduce their own set, as numerically dictated compromises may be universally unsatisfactory. We chose to use both types of approaches to strengthen the quality of the input from the Working Group. The behavioral approach was used to make adjustments to criteria, add or reword criteria, and glean general principles of good oversight. The mathematical approach involved a quantitative expert elicitation process whereby the Working Group members were asked to assign values, or probabilities, indicating how important each criterion was for the evaluation of oversight models (Appendix A).
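For intuition, one of the simplest functional aggregation rules of the kind discussed by Winkler (1968, 1986) and Genest and Zidek (1986) is a weighted linear opinion pool. The sketch below is a generic illustration with hypothetical scores and weights; it is not the selection rule we ultimately applied, which is described later in this section.

```python
# Weighted linear opinion pool: the consensus is a convex combination of
# individual expert judgments. Scores and weights here are hypothetical.
def linear_opinion_pool(judgments: list, weights: list) -> float:
    """Return the weighted consensus of individual expert judgments."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * x for w, x in zip(weights, judgments))

# Four experts rate one criterion's importance on the 0-100 scale.
scores = [80, 65, 90, 70]
equal_weights = [0.25] * 4          # equal weighting reduces to the mean
print(linear_opinion_pool(scores, equal_weights))  # 76.25
```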
For the elicitation, we asked each member to assess the importance of each criterion for oversight assessment. The question of “How important is it to consider this criterion in our oversight case studies?” was posed. The members were asked to rank the importance of each criterion for oversight assessment, based on their experience and knowledge, on a scale from 0 to 100, with the option of referring to qualitative descriptions of different probability levels. These levels included: Certain (100); Near Certain (80–99); Probable, Likely, We Believe (60–80); Even Chance (40–60); Less than an Even Chance (20–40); Improbable, Probably Not, Unlikely, Near Impossibility (1–20); Impossible (0). Twelve members of the Working Group participated in the exercise. STATA and Excel software were used to analyze the results from the elicitation. A subsequent data report included summaries of responses as histograms, as well as mean and median values and standard deviations for each criterion (Table II).
Table II. Expert elicitation results: mean, median, and standard deviation of importance ratings for each criterion
No. | Criterion | Mean | Median | SD |
---|---|---|---|---|
Development | ||||
d1 | Impetus | 80 | 83 | 12 |
d2 | Clarity of technological subject matter | 76 | 80 | 18 |
d3 | Legal grounding | 69 | 70 | 16 |
d4 | Federal authority | 70 | 77 | 19 |
d5 | Industry authority | 65 | 75 | 26 |
d6 | Loci | 62 | 60 | 27 |
d7 | Stakeholder input | 81 | 88 | 16 |
d8 | Breadth of input | 71 | 67 | 16 |
d9 | Opportunity for value discussions | 59 | 57 | 17 |
d10 | Transparency | 80 | 85 | 18 |
d11 | Financial resources | 69 | 75 | 20 |
d12 | Personnel education and training | 66 | 73 | 23 |
d13 | Empirical basis | 69 | 78 | 23 |
Attributes | ||||
a1 | Legal grounding | 74 | 78 | 17 |
a2 | Data requirement | 82 | 84 | 10 |
a3 | Treatment of uncertainty | 81 | 72 | 16 |
a4 | Stringency of system | 78 | 83 | 15 |
a5 | Empirical basis | 82 | 84 | 11 |
a6 | Compliance and enforcement | 83 | 90 | 13 |
a7 | Incentives | 73 | 84 | 21 |
a8 | Treatment of intellectual property and proprietary information | 74 | 80 | 22 |
a9 | Institutional structure | 62 | 68 | 23 |
a10 | Feedback loop | 69 | 75 | 22 |
a11 | Formal assessment | 63 | 70 | 24 |
a12 | Postmarket monitoring | 76 | 82 | 19 |
a13 | Industry navigation | 71 | 78 | 22 |
a14 | Actors involved | 70 | 75 | 17 |
a15 | Flexibility | 76 | 81 | 15 |
a16 | Capacity | 79 | 81 | 13 |
a17 | Relationship among actors | 63 | 65 | 24 |
a18 | Stakeholder input | 70 | 77 | 23 |
a19 | Breadth of input | 59 | 60 | 28 |
a20 | Opportunities for value discussion | 53 | 50 | 23 |
a21 | Consideration of fairness | 74 | 78 | 17 |
a22 | Transparency | 82 | 88 | 17 |
a23 | Conflict of interest | 86 | 90 | 9 |
a24 | Conflict of views | 70 | 72 | 20 |
a25 | Economic costs and benefits considered | 66 | 74 | 26 |
a26 | Accountability and liability | 72 | 72 | 13 |
a27 | Education of decisionmakers, stakeholders | 65 | 70 | 21 |
a28 | Informed consent | 82 | 88 | 14 |
a29 | International harmonization | 66 | 68 | 23 |
Evolution | ||||
e1 | Extent of change | 59 | 77 | 30 |
e2 | Distinguishable periods of change | 48 | 50 | 30 |
e3 | Extent of change in attributes | 58 | 62 | 21 |
e4 | Change in stakeholder satisfaction | 60 | 62 | 22 |
e5 | Public confidence | 68 | 70 | 15 |
Outcomes | ||||
o1 | Product safety | 71 | 78 | 26 |
o2 | Time and costs for market approval | 69 | 75 | 25 |
o3 | Recalls | 57 | 62 | 28 |
o4 | Stakeholder satisfaction | 70 | 70 | 15 |
o5 | Public confidence | 76 | 80 | 18 |
o6 | Effects on social groups | 57 | 60 | 18 |
o7 | Social, ethical, and cultural effects | 63 | 60 | 21 |
o8 | Research impacts | 86 | 90 | 13 |
o9 | Innovation | 74 | 85 | 28 |
o10 | Health | 85 | 88 | 12 |
o11 | Distributional health impacts | 80 | 82 | 14 |
o12 | Environmental impacts | 82 | 84 | 15 |
o13 | Nonindustry economic impacts | 75 | 78 | 18 |
o14 | Effects on big corporations | 56 | 62 | 28 |
o15 | Effects on small- to medium-sized enterprises | 58 | 68 | 29 |
o16 | Economic development | 61 | 68 | 25 |
o17 | Global competitiveness for the United States | 64 | 65 | 20 |
o18 | Distributional economic impacts | 61 | 62 | 20 |
o19 | Proposals for change | 72 | 70 | 16 |
- Note: Gray boxes refer to criteria that were eliminated because they did not meet the consensus cut-off score: more than eight experts (>70%) rating the criterion as 70 or higher.
Following the elicitation exercise, we used both the quantitative results from it and behavioral consensus approaches with the authors and Working Group to derive a streamlined set of key criteria for oversight assessment. Project staff and the Working Group had initially agreed upon a target number of approximately 20 criteria for evaluations of the six case studies. The Working Group believed that this number would reduce the list of criteria to a manageable level for analysis while retaining a good degree of breadth and coverage. Thus, we chose a cut-off score from the expert elicitation that would reduce the number of criteria to approximately 20. We selected criteria for which over eight of the members (>70%) gave a score of at least 70 (out of 100). This dropped 42 criteria from the list, with 24 remaining (Table II).
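The cut-off rule is easy to state precisely in code. The sketch below applies it to a fabricated 12-expert score matrix (the actual per-expert ratings are not reproduced here) and also computes the per-criterion summary statistics of the kind reported in Table II.

```python
import numpy as np

# Retention rule described above: keep a criterion when more than 8 of
# the 12 Working Group members (>70%) rate it 70 or higher. The score
# matrix is fabricated for illustration; real ratings are summarized in
# Table II.
rng = np.random.default_rng(0)
criteria = ["d1", "d2", "a1", "o5"]                       # placeholder IDs
scores = rng.integers(40, 101, size=(12, len(criteria)))  # 12 experts x 4

for j, crit in enumerate(criteria):
    col = scores[:, j]
    n_high = int((col >= 70).sum())
    kept = n_high > 8  # strictly more than eight experts
    print(f"{crit}: mean={col.mean():.0f}, median={np.median(col):.0f}, "
          f"SD={col.std(ddof=1):.0f}, raters>=70: {n_high} -> "
          f"{'retained' if kept else 'dropped'}")
```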
Results of the elicitation indicated that the mechanics of oversight were important to the Working Group: compliance and enforcement, incentives, flexibility, and capacity remained on the list of 24 criteria. Additionally, public confidence in oversight was rated highly by the Working Group as an important outcome of oversight systems. However, criteria associated with economic impacts on industry ranked lower than we expected from our expert and stakeholder group, which contained members from corporations and academic researchers who work with industry to develop new technological products or applications. While the literature reflects an emphasis on the need for oversight systems to reduce burdens on developers of products or applications (e.g., OTA, 1995; IRGC, 2006), this group rated economic outcomes of oversight lower than most other criteria (Table II). This result could reflect differences between the industry viewpoints represented in our Working Group and those of the industry community at large.
To derive a final list of criteria, we used a behavioral consensus approach to combine the quantitative elicitation results with our knowledge of the literature and qualitative Working Group input. We examined the quantitative results for each criterion carefully. Recognizing the imperfections of expert elicitation and mathematical approaches to consensus, we reinstated and combined a few criteria, and revised the descriptions of several based on feedback from the Working Group following the ranking exercise. Five criteria that did not make the consensus cutoff (70% of experts over 70), but that we felt were important and that had relatively high means and medians (Table II), were fully reinstated—legal grounding (d3), institutional structure (a9), postmarket monitoring (a12), stakeholder input (a18), and extent of change (e1). Impacts on innovation (o9) did not meet the consensus (70% over 70) cutoff, but it also had a relatively high mean and median (74 and 85, respectively) and has been viewed in the literature as an important outcome affected by oversight (e.g., OTA, 1995; Jaffe & Palmer, 1997). Therefore, it was reinstated and combined with impacts on research (o8) to address the overlap between the two. We also merged data requirements (a2) with stringency of the system (a4) to address Working Group input about the similarity between these two.
Twenty-eight criteria remained in our final set (Table III). Seven development criteria were retained that apply to the formal process of developing laws, rules, standards, guidance documents, programs, and policies that relate to the overall process for considering individual products, processes, or clinical trials: impetus, clarity of technological subject matter, legal grounding, public input, transparency, financial resources, and empirical basis. Fifteen attributes criteria were retained that apply to the process, whether formal or informal, of making decisions about specific products, subcategories of products, clinical trials, or other ways in which the framework is implemented: legal basis (formerly legal grounding), data requirements and stringency, postmarket monitoring, treatment of uncertainty, empirical basis, compliance and enforcement, incentives, treatment of intellectual property and proprietary information, institutional structure, flexibility, capacity, public input, transparency, conflicts of interest, and informed consent. One evolution criterion was retained to capture how the system has changed over time and why: extent of change in attributes. Five outcome criteria were retained that apply to assessing the impacts of decisions stemming from the oversight framework: public confidence, research and innovation, health and safety, distributional health impacts, and environmental impacts.
Criterion | Description and Guiding Question(s) |
---|---|
Development: 7 Criteria (D) | |
1. Impetus | Historical context and driving forces behind the system of oversight, or the reasons for developing the basic framework for the oversight system. The guiding question here is: “What were the driving forces?” Examples could include development in response to intense public or legal pressure, key technological developments, or an adverse event (reactive), or development driven by emerging concerns about potential risks and benefits prior to any legal or public pressure, technological development, release, or adverse event (proactive). |
2. Clarity of technological subject matter | Clarity or definition of the technologies, processes, and products to be overseen. The guiding question here is: “Are the technologies, processes, and products to be overseen well-defined?” Examples could include that the technologies, processes, and products subject to oversight are unclear or ill-defined (not clear) or that they are well defined and it is clear what falls under oversight (clear). |
3. Legal grounding | Basis for development and the clarity of the statutes or rules for implementing the newly developed framework and achieving its goals. The guiding questions are: “How explicit are the statutes or rules on which the oversight framework is based? Is it clear that the decisionmakers in the framework have legal authority for the actions they propose? Is there grounding in existing laws?” Examples could be that there is considerable ambiguity and room for interpretation in executing the policies in the oversight framework (weak) or that there is little ambiguity about what the agencies can or cannot do (strong). |
4. Public input | Inputs shaping the creation of the oversight system, and the extent of opportunities for engaged stakeholders, including nongovernmental organizations, trade associations, academics, industry, and other affected groups, to provide input into the development of the initial framework. This includes input on scientific questions as well as values questions (social, cultural, and ethical). The guiding question is: “Was there a process or opportunities for stakeholders to contribute to discussions or decisions about the basis of the system, how it operates, or how it is structured?” Examples could be that the government or overseers (in the case of a voluntary system) made decisions largely in the absence of stakeholder input (minimal) or that the overseers made decisions based on a formal process, above and beyond Federal Register notices, for soliciting input from stakeholders (significant). |
5. Transparency | Extent to which interested parties could obtain information about decisions during the development of the framework. The guiding questions are: “Were options that the agencies or other decision-making bodies were considering known to the public? Were studies about the pros and cons of these options available?” Examples include that decisionmakers presented options, and studies comparing and contrasting them, to the public prior to completing the framework (high) or that the framework was published as a draft in the Federal Register but the process for arriving at it remained unknown to interested and affected parties (low). |
6. Financial resources | Funding and resources allocated to the development of the oversight system. The guiding question is: “How sufficient were the funds provided to the developers of the framework?” Examples include that no money was set aside for oversight development (not at all) or that ample funds were available for oversight development (sufficient). |
7. Empirical basis | Empirical basis for development of the oversight system, including the amount and quality of evidence (scientific, risk, benefit, or social impact studies) used. The guiding question is: “To what extent was scientific or other objective evidence used in designing the review or oversight process central to the framework?” Examples include that a body of evidence was used to determine the important features of data submissions, clinical studies, etc., that would be required in the general framework (strong basis) or that little to no information on the nature or extent of the risks, benefits, or other impacts of products or processes was available during development, such that qualitative speculation or predictions were used to generate the framework for oversight (weak basis). |
Attributes: 15 Criteria (A) | |
8. Legal basis | Legal and policy structure, that is, the clarity of the statutes or rules for implementing the specific decisions for processes, trials, research, or products within the oversight framework and achieving its goals. The guiding questions are: “How explicit are the statutes or rules on which specific decisions within the oversight framework are based? Is it clear that the decisionmakers in the framework have legal authority for the actions they propose?” Examples include that there is little ambiguity about what the agencies can or cannot do in the context of a specific application or product (strong) or there is considerable ambiguity and room for interpretation in executing specific decisions (weak). |
9. Data requirements and stringency | Extent to which empirical studies are submitted prior to market approval, release, or clinical trials, and whether there is adequate legal authority to require data. The guiding questions are: “How comprehensive are the safety and other studies required for submittal to authorities? If the system is voluntary, how comprehensive are data generated and are they available for review prior to decisions about release or approval? How much regulatory authority is there for requesting new data?” Examples include that a letter is submitted that describes the composition of the product and there is little authority to request more data and assure compliance (weak) or that a battery of safety studies that extensively address environmental and human health risks are required and backed up with adequate regulatory authority to assure compliance (strong). |
10. Postmarket monitoring | Systematic monitoring for adverse or beneficial events after the product is released or trials begin. The guiding question is: “Is there a science-based and systematic process for detecting risks and benefits after commercial release, or field or clinical trials?” Examples include that once the product is released or trials begin, there is no monitoring for adverse events except anecdotally (little) or that there is an extensive system for reporting, consolidating, and evaluating potential adverse events after trials begin or a product is released (extensive). |
11. Treatment of uncertainty | Reporting of ranges of possible values in submitted data and studies, whether subpopulations are considered, acknowledgment of areas for which little scientific information is available, and recognition that the hazards may not be well characterized. The guiding question is: “Is uncertainty accounted for qualitatively or quantitatively in data and study submissions?” Examples include that the risk analyses on which decisions are based use uncertainty modeling, account for subpopulations, and qualitatively describe what is unknown (extensive) or that point estimates based on population averages are used and narratives of sources of uncertainty are omitted (limited). |
12. Empirical basis | Amount and quality of evidence (scientific, risk, benefit) used for particular decisions. The guiding question is: “To what extent was scientific or other objective evidence used in making decisions about specific products, processes, or trials?” Examples include that high-quality, extensive evidence on safety is required for product submissions, clinical studies, or field trials (strong basis) or low-quality minimal evidence is required for making decisions (weak basis). |
13. Compliance and enforcement | Programs and procedures in place to ensure compliance with the oversight process and, in cases of noncompliance, to ensure that consequences and corrections result. The guiding question is: “To what extent does the system ensure compliance with legal and other requirements, and can it prosecute or penalize noncompliance?” Examples include that there is little compliance with, or enforcement of, requirements (weak) or that there is substantial compliance and enforcement (strong). |
14. Incentives | Incentives, financial or otherwise, for compliance with system requirements. The guiding question is: “Are the stakeholders in the system encouraged to abide by the requirements of the system?” Examples include that there is no incentive structure for compliance (few) or there are many incentives for compliance in the oversight system, beyond product or trial approval (many). |
15. Treatment of intellectual property and proprietary information | Treatment of intellectual property and confidential business information. The guiding questions include: “How does confidential information get treated in applications for approval? How does intellectual property factor in?” Examples include that decisionmakers share business information with the public and intellectual property is dealt with adequately (high) or that business information is considered confidential and not shared with the public and intellectual property is not dealt with adequately (low). |
16. Institutional structure | Type of structure of the framework with regard to the number and/or complexity of the actors involved, most notably federal agencies. The guiding question is: “How many agencies or entities with legal authority are involved in the process of decision making within the framework?” Examples include that there is a single authority with a simple and concentrated procedure (simple) or that there are multiple authorities, with a complexity of overlap or potential gaps (complex). |
17. Flexibility | Ability of the framework to be flexible in unique or urgent situations or when new information is obtained. Guiding questions are: “Can products or trials undergo expedited review when appropriate? Can products be withdrawn or trials easily stopped when information on potential risks is presented?” Examples include that the system is rigid in the sense that there is only one option or path and it is difficult to change this pattern with new information (low) or that the system provides numerous ways to account for unique and emerging situations (high). |
18. Capacity | Resources of the system, whether expertise, personnel, or financial, to appropriately handle decisions. The guiding question is: “Is the system well prepared and equipped to deal with the approvals of trials, products, or processes?” Examples include that agency staff are stretched thin and do not have time to do a good job with specific decisions (inadequate) or that agency staff are provided with the resources, expertise, and time to give proper, high-quality attention to the process (adequate). |
19. Public input | Extent of opportunities for engaged stakeholders (nongovernmental organizations, trade associations, academics, industry, citizen groups, and other affected groups) to provide input into specific or categories of decisions before they are made or during the process. The guiding question is: “Is there a process or opportunities for stakeholders to contribute to discussions or decisions about whether certain products, processes, or trials should be approved?” Examples include that the government or overseers (in the case of a voluntary system) make decisions largely in the absence of stakeholders (minimal) or that the overseers make decisions based on a formal process, above and beyond notice of and comments on rule-making, for soliciting input from stakeholders (significant). |
20. Transparency | Extent to which interested parties can obtain information about decisions during particular decisions that are being made within the oversight framework. The guiding questions are: “Are options that agencies or other decision-making bodies are considering known to the public? Are studies about the pros and cons of these options available? Is the process for how decisions are made clearly articulated to interested parties?” Examples include that decisionmakers divulge the processes and authorities for review, options for, and studies about particular products or events as they are being considered and it is easy for citizens to track the process for and basis of decisions (high) or decisions are published in the Federal Register, but it is difficult to figure out how, when, and by what criteria products, processes, or trials are reviewed (low). |
21. Conflicts of interest | Ability of the system to ensure that conflicts of interest do not affect judgment. Guiding questions are: “Do independent experts conduct or review safety studies? Are conflicts of interest disclosed routinely?” Examples include that there is no disclosure of conflicts of interest and industry largely conducts studies on its own products without external review, except by agency staff (prominent), or that every possible effort is made to avoid or disclose conflicts of interest (avoided). |
22. Informed consent | Stakeholders', patients', research participants', or the public's ability to know, understand, and choose their exposure or the level of risk they accept. The guiding question is: “To what extent does the system supply the amount and type of information so that people can make informed decisions about what they will accept?” Examples include that the public has little information about whether it is exposed, consuming the product, or subject to certain risks (little) or the public is meaningfully informed about its exposure and the risks (extensive). |
Evolution: 1 Criterion (E) | |
23. Extent of change in attributes | Extent of change to the system over time. The guiding question is: “To what extent has the system changed over time?” Examples include that there was no change at all (none) or that there were significant structural changes (extensive). Change can indicate appropriate evolution of the system based on new information or in response to adverse events. |
Outcomes: 5 Criteria (O) | |
24. Public confidence | Public confidence in the system, including views about product or trial safety and trust in actors. Guiding question is: “What do diverse citizens and stakeholders think about the system, including disadvantaged, special, or susceptible populations?” Examples include that there is widespread fear and mistrust among the public (low) or that there is a general feeling that the oversight systems and decisionmakers are doing a good job serving individual and multiple interests and society at large (high). |
25. Research & innovation | Impacts on science, research, and innovation and whether the oversight system encourages research and innovation. The guiding question is: “Has the system led to more research and innovation in the field or stifled it?” Examples include that the oversight system does not stifle research and innovation, and in fact increases it (positive) or the oversight system stifles research and innovation in many ways, perhaps due to time delays or cost of approvals (negative). |
26. Health and safety | Health impacts and whether oversight of the products, processes, or trials leads to impacts on global, national, or local health and safety. The guiding question is: “Does the oversight system impact health and safety in positive ways?” Examples include that the oversight of products or processes is leading to negative health and safety impacts either through delays in approvals (e.g., life-saving drugs) or through approvals of unsafe products (negative) or that the oversight of products, processes, or trials is leading to positive health impacts and increased safety (positive). |
27. Distributional health impacts | How the health risks and benefits resulting from the system are distributed. Guiding questions are: “Are the health impacts equitably distributed? Is there an inequitable impact on specific social or disadvantaged groups?” Examples include that health impacts are not justly and equitably distributed (inequitable) or health impacts are justly and equitably distributed (equitable). |
28. Environmental impacts | Whether oversight of the products or processes leads to impacts on the environment. The guiding question is: “Does the oversight system impact the environment in positive ways?” Examples include that the oversight system has resulted in negative impacts on the environment (negative) or that there have been beneficial impacts on the environment from oversight (positive). |
- Note: This final set of 28 criteria is being used to evaluate the historical oversight models.
5. SYSTEMS APPROACH FOR RELATIONSHIPS AMONG CRITERIA
Oversight systems are complex, and relationships among their attributes, outcomes, and how they develop and change are intricate, dynamic, and involve feedback. As such, hypotheses about what criteria are important for good oversight will be formulated and tested across historical models using a systems approach (Fig. 2) and the final set of criteria (Table III). Systems approaches are useful in cases where mental models (people's understandings of systems) are crucial for analysis given high degrees of complexity, limited empirical information, and multiple types of parameters (Forrester, 1993). It has been suggested that effective methods for learning about complex, dynamic systems include elicitation of participants in the system for their perceptions, creation of maps of the feedback structure of a system from those perceptions, and stronger group processes (Sterman, 1994). We are employing these strategies through our work to better understand oversight systems for emerging technologies (Fig. 1).

Fig. 2. Systems approach: types of and relationships among criteria. Criteria were placed into categories of development, attributes, or outcomes of oversight systems, as well as how systems change over time. Relationships among criteria will be explored in future work through cross-comparisons of historical oversight systems. A systems model with complex interactions among criteria and feedback is depicted. Solid arrows indicate relationships in which outcome criteria are the dependent variables used for evaluating oversight systems. Dotted arrows indicate relationships between other categories of criteria, which may include independent or dependent variables and evaluative or descriptive criteria. Striped arrows indicate feedback from outcomes to features of oversight systems; in these cases, outcomes impact dependent variables in other categories of criteria.
However, in our effort to avoid oversimplifying oversight systems into linear models, we struggled at the outset of our work with whether to place our criteria into categories of “evaluative” versus “descriptive,” or “independent” versus “dependent,” variables. Initially, we will consider the outcomes that most people would agree upon as results of good oversight to be key dependent variables and evaluative criteria (e.g., the five remaining outcome criteria of public confidence, positive and justly distributed health and environmental impacts, and increased research and innovation). A central question of our approach to assessing oversight systems is whether criteria in the attributes, evolution, and development categories (initially considered independent variables) positively or negatively impact those key outcome criteria (initially the dependent variables) (Fig. 2, solid arrows). For example, transparency in the development or operation of oversight systems (Table III, D5 or A20) is thought to promote public confidence (Table III, O24). In this case, transparency would be considered the independent or descriptive variable and public confidence the dependent or evaluative one.
However, other relationships among criteria will be explored. Several attributes and development criteria are normatively considered good features of oversight, and these can be used on their own to judge an oversight system. Transparency is thought to be a good feature of oversight (D5, A20) in that it promotes ethical principles of autonomy and “rights to know” (Beauchamp & Walters, 1999). Regarded this way, transparency is an evaluative and independent criterion. Yet, other criteria in development or attributes categories, such as institutional structure (A16), can impact transparency, making transparency a dependent and evaluative variable (Fig. 2, dotted arrows). Furthermore, with feedback, transparency could become a dependent and evaluative variable based upon an outcome criterion (Fig. 2, striped arrows). Therefore, transparency can be placed into multiple categories depending on the relationship being explored.
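One way to make this multiplicity of roles concrete is to represent the criteria as nodes in a small directed graph whose edge types mirror the arrows in Fig. 2. The sketch below uses a hypothetical edge list (the relationships shown are illustrative assumptions, not results of our analysis) and shows how a criterion's status as independent or dependent follows from the relationship being examined rather than from the criterion itself.

```python
# A minimal sketch of the systems view in Fig. 2, using a hypothetical edge
# list rather than relationships established by our analysis. Edge types
# mirror the figure: "solid" (features -> outcomes), "dotted" (relationships
# among non-outcome categories), "striped" (feedback from outcomes back to
# system features).
from collections import defaultdict

edges = [
    ("A20_transparency", "O24_public_confidence", "solid"),        # attribute -> outcome
    ("A16_institutional_structure", "A20_transparency", "dotted"), # attribute -> attribute
    ("O24_public_confidence", "A20_transparency", "striped"),      # feedback from outcome
]

# A criterion's role (independent vs. dependent) is a property of the
# relationship being examined, not of the criterion itself.
roles = defaultdict(set)
for source, target, _kind in edges:
    roles[source].add("independent")
    roles[target].add("dependent")

for criterion, r in sorted(roles.items()):
    print(f"{criterion}: {', '.join(sorted(r))}")
# A20_transparency appears as both independent and dependent, matching the
# discussion of transparency's multiple roles above.
```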
Additionally, some criteria that seem purely descriptive at this point might turn out to be evaluative after historical cross-comparisons of oversight models (Fig. 1, future work). For example, institutional structure (A16) seems to be a description of an oversight system, and there currently is not sufficient evidence in the literature to determine what type of institutional structure is best for oversight of emerging technologies. However, this criterion might turn out to be correlated with positive outcomes or other evaluative criteria, such as transparency, after our cross-comparisons. If so, a hypothesis about institutional structure and its contributions to good oversight can be generated.
As a result of these complexities and in consultation with the Working Group, we chose not to categorize our initial or final criteria with more resolution than the four categories of development, attributes, evolution, and outcomes at this point. There is precedent in the literature for blending multiple types of criteria in analysis of decisions (Morgan & Henrion, 1990a, p. 51). However, our methodology is unique in the application of this approach to oversight systems and consideration of complexities and feedback in oversight.
6. CONCLUSIONS AND FUTURE WORK
We have developed a broad set of criteria to describe and assess oversight systems for emerging technologies and their applications. We derived these criteria using multidisciplinary methods with careful attention to the multidimensional nature of oversight. As discussed, the criteria are both descriptive and evaluative, addressing principles and features of the system, the evolution and adaptability of the system over time, and outcomes of the system. Our work incorporates a diversity of perspectives on oversight and combines quantitative and qualitative methods. From qualitative analysis of the multidisciplinary literature on oversight, we incorporated what many groups, including experts, stakeholders, and citizens, believe to be important for good oversight. Through the use of quantitative elicitation and consensus methods with our Working Group, we have directly included what those familiar with oversight systems believe to be important for oversight assessment.
The resulting criteria reflect this broad consideration of perspectives and literature. In the final set, criteria range in subject matter from the importance of “sound science” in oversight to the extent of opportunities for public input into the design and execution of oversight. The current outcomes criteria (Table III) are heavily weighted toward health and environmental impacts, which reflects the importance that multiple experts and stakeholders place on the ethical principle of maximizing benefits and minimizing harm (Beauchamp & Walters, 1999). Impacts on research and innovation are also included, and these, in turn, are believed to have wider economic impacts on industry and society (NRC, 2006).
Our work is based on the idea that the design and implementation of oversight for nanotechnology should be informed by the past successes and failures of oversight systems for related technologies. In our future work aimed at deriving lessons for the oversight of nanotechnology, quantitative expert elicitation and application of the final criteria will continue to complement other prongs of the IOA approach, which include literature reviews of the performance of historical systems, assessment of public opinion about oversight systems from the literature, semi-structured interviews with stakeholders and experts to evaluate oversight systems, and behavioral consensus methods to discuss and debate attributes and outcomes of systems (Fig. 1). We are now using IOA to analyze and evaluate six historical case studies related to the application of nanotechnology to biological systems: gene therapy, genetically engineered organisms in the food supply, human drugs, medical devices, chemicals in the environment, and chemicals in the workplace. Through cross-comparisons of these historical oversight cases, hypotheses about which oversight features affect certain outcomes will be generated and tested in order to derive principles and lessons for oversight of related nanotechnology applications.
We propose that comparisons across case studies using a consistent set of criteria will result in defensible and evidence-supported lessons for future oversight systems for nanotechnology products (Fig. 1). The final set of criteria embedded within a broader IOA approach will be used to compare relationships among the development, attributes, evolution, and outcomes of oversight systems across historical case studies. Then, several criteria will likely progress from their descriptive role to being useful indicators of the quality of oversight systems and predictors of positive outcomes that satisfy a majority of citizens and stakeholders. For example, we may find that outcomes such as improved human health or environmental quality (outcome criteria in Table III) are consistently correlated with increased public input (attribute criteria in Table III) across the historical case studies.
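As a sketch of what such a cross-case check might look like computationally, the following example rank-correlates invented scores for one attribute criterion and one outcome criterion across the six case studies. The ratings, the 1-5 scales, and the choice of Spearman correlation are all assumptions for illustration, not our method or findings.

```python
# Hypothetical illustration of the cross-case comparison: scoring one
# attribute criterion (public input, A19) and one outcome criterion (health
# and safety, O26) for the six historical case studies, then checking for
# rank correlation. The scores below are invented placeholders, not findings.
from scipy.stats import spearmanr

cases = ["gene therapy", "GE food", "drugs", "devices",
         "env. chemicals", "workplace chemicals"]
public_input    = [3, 2, 4, 3, 2, 1]   # hypothetical 1-5 ratings per case
health_outcomes = [4, 2, 5, 4, 3, 2]   # hypothetical 1-5 ratings per case

rho, p_value = spearmanr(public_input, health_outcomes)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.2f}) across {len(cases)} cases")
# With only six cases, any such statistic is suggestive at best; in practice
# it would generate hypotheses for further testing rather than confirm them.
```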
In summary, IOA blends theory, methods, and ideas from legal, bioethics, and public policy approaches with the practical goals of providing guidance to policymakers, decisionmakers, researchers, industry, patients, research subjects, consumers, and the public at large. Integrating multiple methods and criteria for oversight assessment will appeal to a wide range of stakeholders bringing a range of perspectives to bear. As we begin to apply the criteria to historical models of oversight, we will also be able to assess the degree of agreement and polarization of expert and stakeholder opinion on historical oversight systems, which will be instructive for diagnosing controversy and how it impacts features and outcomes of oversight.
We expect that our multidisciplinary IOA approach could be widely applicable to other emerging technologies, facilitating assessment of current regulatory oversight systems, the identification of possible changes to existing systems, and the design of new ones. We anticipate that this approach will be a valuable tool for analyzing multiple perspectives, features, outcomes, and tradeoffs of oversight systems. Such an approach that incorporates the viewpoints of key disciplines and the perspectives of multiple stakeholders could help to ameliorate controversy and conflict as new technologies emerge and oversight systems for them are considered and deployed.
ACKNOWLEDGMENTS
This work was supported in part by National Science Foundation NIRT Grant SES-0608791 (Wolf, PI; Kokkoli, Kuzma, Paradise, Ramachandran, Co-PIs). Any opinions, findings, and conclusions or recommendations expressed in this article are those of the authors and do not necessarily reflect the views of the National Science Foundation. The authors would like to thank the Working Group participants: Dan Burk, J.D., M.S.; Steve Ekker, Ph.D.; Susan Foote, J.D.; Robert Hall, J.D.; Robert Hoerr, M.D., Ph.D.; Susanna Hornig Priest, Ph.D.; Terrance Hurley, Ph.D.; Robbin Johnson; Bradley Karkkainen, J.D.; George Kimbrell, J.D.; Andrew Maynard, Ph.D.; Kristen Nelson, Ph.D.; David Norris, Ph.D.; David Y. H. Pui, Ph.D.; T. Andrew Taton, Ph.D.; and Elizabeth J. Wilson, Ph.D., as well as collaborators: Efrosini Kokkoli, Ph.D.; Alison W. Tisdale; Rishi Gupta, M.S., J.D.; Pouya Najmaie, M.S.; Gail Mattey Diliberto, J.D.; Peter Kohlhepp; Jae Young Choi; and Joel Larson for their valuable input on the project. Additional contributors to refinement of project methodology include Dave Chittenden; Judy Crane, Ph.D.; Linda Hogle, Ph.D.; William D. Kay, Ph.D.; Maria Powell, Ph.D.; and Michael Tsapatsis, Ph.D. The authors would also like to thank Audrey Boyle for her project management.
Appendix
APPENDIX A: ELICITATION SURVEY INSTRUMENT
Please see the online appendix.