Note to reader: This is Chapter 15 of Personal Privacy in an Information Society: The Report of the Privacy Protection Study Commission transmitted to President Jimmy Carter on July 12, 1977. The full Table of Contents is listed below.
The variety of research and statistical studies that require the collection of information in individually identifiable form is limited only by the interests and concerns of society for human wants and needs, and by the assumptions of researchers and statisticians as to the topics that merit exploration. This chapter reports on the Commission's examination of these activities and recommends action by the Congress and agencies of the Federal government to protect the interests of individuals who are the subjects of research and statistical records developed under Federal authority or with Federal funds.
The Commission's examination of the collection, maintenance, use, and dissemination of information and records about individuals for research or statistical purposes was premised on the following observations.
First, research and statistical activities generally do not lead to an immediate or direct benefit for the individual subject as such. The researcher asks for the individual's participation or for information about him, but society as a whole, rather than the individual, is the ultimate beneficiary.
Second, research and statistical activities depend heavily upon the voluntary cooperation of the individual in providing accurate and reliable information. On the theory that responses will be more candid and complete if individuals are convinced that the information they provide will not come back to haunt them, researchers who directly question subjects usually assure them of confidentiality and, when the study design calls mainly for observation, the observer usually promises anonymity.
Third, assuring that information will not be disclosed to third parties in individually identifiable form is especially important in research on deviant behavior, such as drug and alcohol abuse, gambling, and prostitution; in studies of topics such as abortion and institutionalized discrimination; and in probes of public attitudes on controversial social issues, such as busing and welfare.
Fourth, both government agencies and research institutions outside government are undertaking more and more of the kinds of studies that require assurances of confidentiality or anonymity. The vast banks of records on individuals built up by the Federal government in the course of performing its legitimate functions constitute a valuable data resource for research and statistical activities. Some of these records are currently released in anonymous form for general public use. Because careful removal of the elements of individual identification is a complex and expensive process, however, the rich lode of agency data has barely been tapped.
Fifth, different research and statistical projects use widely differing methods of collecting information about individuals. These differences affect the relationship between researcher and subject, which, in turn, affects the individual's ability to comprehend and control the way information about him is used. In a laboratory setting there is likely to be a close working relationship between researcher and subject. Surveys based on personal interviews similarly involve a direct, if somewhat more transient, relationship. Telephone interviewing, of course, weakens the relationship considerably, and mail surveys can be conducted without any personal contact. When information is extracted from program records or data archives, the individual subject is seldom even aware that information about him is being used.
After examining the standards and procedures for the protection of personal privacy in a number of research and statistical activities, the Commission reached three main conclusions:
The Commission's principal objective is to strike a proper balance between the individual's interest in personal privacy and society's need for knowledge. In research and statistical activities, the threat to personal privacy comes mainly from information and records collected and maintained in individually identifiable form. Thus, the Commission believes that the first and fundamental step toward achieving the desired balance is to establish a clear boundary between the use of such information (regardless of source) that is collected, maintained, or disseminated for a research or statistical purpose, and the use of information that is collected, maintained, or disseminated for other purposes. Assuming that such a functional boundary can be established, the Commission proposes policy and rules for the transfer of individually identifiable information or records within and across the boundary, and identifies the role it believes an individual should play in such transfers.
The Commission's public-policy objectives here are, as in other areas of its inquiry, to minimize intrusiveness, to maximize fairness, and to create a legitimate, enforceable expectation of confidentiality. The recommendations in this chapter aim mainly at achieving the third goal; that is, at strengthening and systematizing the confidential status of individually identifiable information used for research and statistical purposes. A clearly marked boundary between the use of information for such a purpose and its use for administrative or other purposes is an essential first step in eliminating the possibility that the information an individual contributes directly or indirectly to a research or statistical activity will be used to his detriment.
Nevertheless, minimizing intrusiveness and maximizing fairness are also of concern to the Commission here. The close dependence of research and statistical activities on public cooperation acts as a natural brake on intrusiveness in the nature of the questions asked of research subjects. The notice and consent requirements specified in Recommendations (10), (11) and (12), below, would reduce intrusiveness by reinforcing the individual's right to refuse to participate in the data-collection process. They also promote fairness in collection practices by specifying the ground rules for use and disclosure of the data collected. Recommendation (13) promotes fairness by assuring the individual an opportunity to see and copy any record about himself that is disclosed unless the record keeper can guarantee that the record itself, or the individually identifiable information it contains, will not be used to his detriment.
The Commission's study focused on federally controlled or assisted research and statistical activities and thus its recommendations are confined to research and statistical activities in that category. This limitation should not be interpreted as a judgment by the Commission that the protection of individually identifiable data is of concern only when there is some Federal involvement. Rather, it recognizes that most of the country's organized research and statistical activities are at least partially dependent on Federal funding, and that where there is Federal involvement, some means of protecting record confidentiality already exist and are being used to at least some degree. (1)
The Commission considers its general principles valid as guidelines for research and statistical activities beyond the reach of Federal involvement. The Commission does not have enough evidence to judge whether these guidelines will need modification to make them generally applicable, or to suggest policy mechanisms for implementing them where research and statistical activities are independent of the Federal government. The Commission does believe, however, that the recommendations in this chapter can serve as a paradigm for the guidance of all research and statistical activities.
RESEARCH AND STATISTICAL ACTIVITIES
The term research will be used in this chapter to refer to any systematic, objective process designed to obtain new knowledge, regardless of whether it is "pure" (aimed at deriving general principles) or "applied" (aimed at solving a specific problem or at determining policy). Statistics refers both to the data obtained through enumeration and measurement and to the use of mathematical methods for dealing with data so obtained. Statistical methods can be descriptive, that is, any treatment designed to summarize or describe important features of data, or inferential, that is, techniques for arriving at generalizations that go beyond the sample being analyzed.
The research and statistical activities that use individually identifiable information draw huge quantities of it from Federal administrative records, both for routine production of statistical reports and for the performance of statistical analysis or other research tasks. Researchers draw other information directly from individuals as part of the research process. As to Federal agencies, some conduct the bulk of their research themselves. For example, the Bureau of the Census not only conducts all its own surveys but also performs data-collection services for other agencies on a reimbursable basis. A great deal of Federal agency research, however, is contracted out to private and semi-public research organizations and Federal grants support numerous research projects at other levels of government and in the private sector.
A typical research project starts with a hypothesis and proceeds through four stages: data collection; data processing; data analysis and interpretation; and finally, publication or dissemination of findings. Before data collection can begin, assumptions must be made about what information is relevant to the hypothesis and what kind of individuals are appropriate data subjects or respondents. Processing may involve anything from simple arranging and manual tabulations to complex coding and sophisticated computer analysis. Data storage and retrieval may rely on anything from handwritten notes and human memory to punched cards, magnetic tapes and discs, films, and computer memory. Data can be analyzed and interpreted in terms of the original hypothesis, or-when the research design less closely approximates the canons of the scientific method-in the light of less clearly articulated assumptions. Statistical manipulation may or may not be required. For some studies, a simple tabulation or descriptive case study may be the result. The final step is a research report to make the findings available to others.
In most studies, the researcher or statistician is interested in the individual primarily as a carrier of attributes or characteristics of groups or distributions. Individual data are often used as major building blocks during the analytical process, but in the final stage both research findings and statistical data are characteristically presented in aggregate form. In research, the purpose is to discover and analyze relationships among variables; in statistics, the purpose is to define average characteristics or discover their distribution or both. Individual data are therefore grouped according to characteristics and reported in the aggregate.
To illustrate, suppose the Department of Labor, for its own policymaking purposes, sponsors a study comparing and contrasting two manpower training programs. The project design requires extensive questioning and observation of two groups of trainees over a two-year period during which at least three series of interviews are conducted. Despite the research team's close, long-term involvement with the research participants, no information supplied by the respondents is released until the final report and then the information is in statistical summary form. If the final report contains quotations from respondents for illustrative purposes, they are presented anonymously, not as individual data with identifiers attached. The bulk of the data are presented in tables according to categories, such as training program A or B, sex, extent of formal education previously received, training, occupation, attitudes toward training programs, and whether participation was mandatory or voluntary.
In most cases, omitting identifiers, such as name, address, telephone number, or subject identification number, is enough to protect the participants' anonymity. In certain cases, however, other information can identify the respondents, as when the study is about people in a relatively unusual occupation such as network TV anchorwomen, or is limited to people in a specific geographic area or income bracket. In such cases, characteristics such as occupation, age, or income may have to be suppressed to preserve the participants' anonymity.
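The two steps described above, dropping direct identifiers and then suppressing characteristics that would still single someone out, can be sketched roughly as follows. This is only an illustration under stated assumptions, not a procedure the Commission prescribes: the field names, the function names, and the minimum group size are all hypothetical, and a real release would need a threshold chosen for the study at hand.

```python
from collections import Counter

# Hypothetical field names for the direct identifiers named in the text.
DIRECT_IDENTIFIERS = {"name", "address", "telephone", "subject_id"}


def strip_identifiers(record):
    """Return a copy of the record without the direct identifiers."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def suppress_rare(records, quasi_fields, min_group_size):
    """Drop direct identifiers, then blank out the chosen characteristics
    (for example occupation or income bracket) for any respondent whose
    combination of values is shared by fewer than min_group_size records.

    min_group_size is an assumed, study-specific threshold.
    """
    stripped = [strip_identifiers(r) for r in records]
    # Count how many respondents share each combination of characteristics.
    counts = Counter(tuple(r.get(f) for f in quasi_fields) for r in stripped)
    released = []
    for r in stripped:
        key = tuple(r.get(f) for f in quasi_fields)
        if counts[key] < min_group_size:
            # Too few people share this combination: suppress those fields.
            r = {k: (None if k in quasi_fields else v) for k, v in r.items()}
        released.append(r)
    return released
```

With a minimum group size of 2, a lone respondent in an unusual occupation would have that occupation suppressed, while respondents in common occupations would keep theirs. The sketch does not address the harder problem noted in the next paragraph, namely deciding in advance which fields are identifying.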
It is often difficult to decide in advance which information beyond the standard items of name, address, or telephone number will or will not constitute identifying information. It must be emphasized, however, that research and statistical activities are undertaken not in the investigative sense of discovering what there is to know about identified individuals, but in pursuit of systematic knowledge about human beings in groups. A distinction should also be drawn between the use of information for research and statistical purposes and the methods employed for information gathering and analysis. The methods researchers and statisticians use in data collection and analysis may also be useful for purposes wholly unrelated to research and statistics, notably for law enforcement, evaluating compliance with program requirements, assessing performance, and even for commercial exploitation. Thus, the duties and safeguards recommended in this chapter do not apply to all information about individuals collected or used according to what may be considered research or statistical methods.
In the discussion of the Commission's recommendations, the following definitions apply:
Individual: any citizen or permanent resident of the United States.
Individually Identifiable Form: any material that could reasonably be uniquely associated with the identity of the individual to whom it pertains.
Research and Statistical Information: any information about an individual, obtained from any source, used for a research or statistical purpose.
Research and Statistical Record: any item, collection or grouping of information maintained in any form of record solely for a research or statistical purpose.
Research and Statistical Purposes: the developing and reporting of aggregate or anonymous information not intended to be used, in whole or in part, for making a decision about an individual that is not an integral part of the particular research project.
Functional Separation: separating the use of information about an individual for a research or statistical purpose from its use in arriving at an administrative or other decision about that individual.
THE PRINCIPLE OF FUNCTIONAL SEPARATION
Federal agency research and statistical activities tend to be reasonably well defined and performed by organizational components functionally separated from policy and decision-making units. This is also generally characteristic of research conducted by academic institutions and by organizations specializing in research, but less likely to be true of research and statistical activities conducted by State or local governments. Even where organizational separation exists, however, individually identifiable information and records used for research or statistical purposes can be commingled with information and records used for administrative purposes. This can occur by design, as well as by chance, as when a continuing study of a program serves not only as a source of statistical summaries but also as an element in determining the eligibility of particular individuals for benefits under the program. Furthermore, in some social experiments the same individual may be both a beneficiary under the program and a research subject. Thus the flow of information from researchers to program personnel who make decisions about the individual may be loosely restricted or not restricted at all.
Existing law does not clearly discourage such commingling. Neither does it clearly restrict the exchange of information between research or statistical components and administrative units of an organization, nor necessarily preclude access to individually identifiable data maintained by researchers for investigative, legislative, or judicial purposes. The Federal Reports Act [44 U.S.C. 3501-3511], which prescribes the central structure for Federal agencies' data management practices, was in fact framed to facilitate data sharing among agencies in order to reduce the reporting burden on business by eliminating redundant data collection. It does not, however, license unrestricted flows of information among agencies and there are laws that extend confidentiality protection to data collected or maintained by certain agencies, or to some particular types of information under the control of any agency.
The Commission believes that existing law and practice do not adequately protect the interests of the individual data subject. It perceives two main deficiencies. First, the individual needs more protection from inadvertent exposure to an administrative action as a consequence of supplying information for a research or statistical purpose. The individual is entitled to protection when he supplies information indirectly by way of applying for benefits under an agency program that uses client information for research or statistical purposes, just as when he volunteers information directly to a researcher. Second, public confidence in the integrity of research and statistical activities and in the collection and use of the data on which they depend needs strengthening. Research and statistical results are too important to the common welfare to risk eroding public trust in the activities and processes that produce them.
The Commission believes both needs will be met if the data collected and maintained for research or statistical use cannot be used or disclosed in individually identifiable form for any other purpose. To erect such a barrier, however, there must be a clear functional separation between research and statistical uses and all other uses. The separation cannot be absolute in practice but the principle must be established that individually identifiable information collected or compiled for research or statistical purposes may enter into administrative and policy decision making only in aggregate or anonymous form. The reverse flow of individually identifiable information from records maintained by administrators and decision makers to researchers or statisticians can be permitted, but only on the basis of demonstrated need and under stringent safeguards.
There are two classes of exceptions to the principle of functional separation. One is when the data subjects directly receive the benefits of the research findings, as in experimental medical treatment or testing, or experimental housing or education projects. The other is when societal imperatives outweigh the individual's claim to protection.
The Commission recognizes that it is not always easy to decide whether a particular investigative purpose can properly be considered research or statistical. Program evaluation, for instance, is considered evaluation research in some cases, but in others it is considered a standard operational component of an agency's mission. For functional separation to protect the individual's interest, the criteria for determining what is a research or statistical activity must be consistently applied.
The threshold policy question is the extent to which innovative administrative and program management practices, including quality control, are to be considered research and statistical activities. The answer lies in what functional separation is meant to achieve. The aim of functional separation is to prevent individually identifiable research or statistical information from affecting or modifying decisions about the individual to whom the information pertains. Consequently, if a given activity can gain nothing from identifying particular individuals-if, for example, its interest is only in uncovering underlying principles of good management practice-the investigation can safely be considered research, and respondents informed accordingly. If, however, the reverse is true, the investigation cannot be considered a research or statistical activity and the respondents cannot be promised confidentiality.
The Commission sees the need for a specific set of standards and guidelines for organizational information practices to limit the exposure to risk of the individual who contributes information, either directly or indirectly, to a research or statistical activity. The standards and guidelines should also strengthen the ability of the individual to protect himself. The Commission believes that standards and guidelines can do this without discouraging the rigorous research and statistical activities that society needs, provided that clear functional separation is accepted as a basic principle. Accordingly, the Commission recommends:
That the Congress provide by statute that no record or information contained therein collected or maintained for a research or statistical purpose under Federal authority or with Federal funds may be used in individually identifiable form to make any decision or take any action directly affecting the individual to whom the record pertains, except within the context of the research plan or protocol, or with the specific authorization of such individual.
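The logic of this recommendation can be read as a simple decision rule. The sketch below is one possible rendering of that rule, offered only as an aid to reading: the class, its field names, and the function are hypothetical, and the sketch models this recommendation alone, not the further limits on disclosure discussed later in the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedUse:
    """A proposed use of a research or statistical record (hypothetical model)."""
    identifiable: bool        # record is used in individually identifiable form
    affects_individual: bool  # use is a decision or action directly affecting the subject
    within_protocol: bool     # use falls within the research plan or protocol
    subject_authorized: bool  # the individual has specifically authorized the use


def use_permitted(use: ProposedUse) -> bool:
    """Apply the rule stated in the recommendation above."""
    # Aggregate or anonymous information is outside the prohibition.
    if not use.identifiable:
        return True
    # The prohibition reaches only decisions or actions directly
    # affecting the individual to whom the record pertains.
    if not use.affects_individual:
        return True
    # The two stated exceptions: research protocol or specific authorization.
    return use.within_protocol or use.subject_authorized
```

Under this reading, an eligibility determination based on an identifiable research record would be barred unless it forms part of the research protocol or the individual has specifically authorized it.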
ASSURING COMPLIANCE WITH THE SEPARATION PRINCIPLE
Establishing the principle of functional separation within a Federal agency requires three preliminary steps:
Because these three steps are interrelated, the measures recommended for implementing each of them are prescribed below in order of dependence. Thus, acceptance of each recommendation assumes acceptance of those that precede it.
DECIDING WHAT USES AND DISCLOSURES ARE PROPER
There would be little chance that the principle of functional separation would be violated if research and statistical findings could not enter into decision-making processes in individually identifiable form. Strictly applied functional separation would eliminate the disclosure of individually identifiable research and statistical data for any purpose other than a research or statistical one.
Most researchers regard the pledge of confidentiality as the sine qua non of voluntary participation in research for the reasons noted earlier. Recognition of its necessity underlies the statutory protections for the confidentiality of data collected by the Bureau of the Census [13 U.S.C. 8, 9] and the National Center for Health Statistics [42 U.S.C. 242m], as well as the confidentiality protection that special legislation provides to particular research projects using alcohol and drug addiction treatment records. [42 U.S.C. 4582; 21 U.S.C. 1175]
Nor is it only protection of information collected directly from individuals for a research or statistical purpose that demands consideration. The confidentiality of research data obtained from other sources, including administrative records, without the immediate knowledge of the research subject is equally significant to both the researcher and the policy maker. Public attitudes are volatile, and the public's willingness to participate, or ultimately to consent to researcher access to administrative records, is dependent on trust in the integrity of the process.
Conversely, there is the public's right to hold public agencies accountable for efficiency and economy in the discharge of their duties. Assuring compliance with laws and regulations in the conduct of government programs, carrying out criminal law enforcement and investigative functions, and assuring fairness in civil and criminal court proceedings can all create demands for access to individually identifiable research and statistical records that are difficult to deny as a matter of public policy. So far such value confrontations have not been common but there are signs that they will grow in number and thus must be taken into account. Because other justifiable goals may impinge on the individual's privacy interest and on the integrity of the relationship between researcher and data subject, the Commission believes that it would be unrealistic to deny all claims for disclosure of all types of research and statistical records under all circumstances. At present, the mechanisms agencies use to resolve confrontations over disclosure are largely ad hoc and tend to vary considerably depending on the source of the request and the class of information sought. Judicial demands for individually identifiable research or statistical information, for instance, may be resolved on Constitutional or common law grounds, or by invoking State or Federal protective statutes. In some instances, the newsman's privilege, established by State statute, has been invoked to apply to research information.
Section (3)(b)(1) of the Privacy Act of 1974 allows one component of a Federal agency to ask another component of the same agency for access to research or statistical records for use in administrative decision making. These requests, which the Privacy Act treats as internal need-to-know disclosures, are dealt with administratively, and the result may be governed or influenced by the existence or absence of independent statutory directives. For example, in the Commission's hearings on medical records (2), the Director of the National Center for Health Statistics (NCHS) described the kind of dilemma that can arise. Another component of the Department of Health, Education, and Welfare (DHEW), NCHS' parent agency, had requested individually identifiable data collected by the Center on family planning procedures for the purpose of checking whether the consent procedures for sterilization, as reported to NCHS, were adequate. NCHS declined to release the information, citing the confidentiality provisions in its own enabling statute. The Secretary of DHEW did not compel NCHS to make the data available to the other DHEW component although he had discretionary authority to do so under the NCHS statute. He might well have done so if the legislative history of the NCHS statute had not made clear that such information should be disclosed without individual consent only for research and statistical purposes.
There are now two basic mechanisms for limiting the use and disclosure of records maintained for research and statistical purposes. One is to protect the confidentiality of such records by statute, a method that can specify the criteria for disclosure with some precision. The other is for the agency maintaining such records to exercise its discretion in responding to requests for disclosure as they arise.
Sole reliance on agency discretion has serious shortcomings. An official with responsibilities for both research and administrative activities is not always the best fulcrum on which to balance competing claims, as the NCHS dilemma suggests. It may be particularly difficult for an administrative official to be entirely objective in weighing the pros and cons of a disclosure when the request is in support of a program within his own agency, and the disclosure is requested by agency personnel on a need-to-know basis. Relying on agency discretion alone may well dilute any pledge of confidentiality for both researcher and data subject.
Objective criteria and orderly determination of the propriety of voluntary disclosures are better safeguards for the integrity of the research process than ad hoc ones, and are easier to define and adhere to uniformly when they are established by statute rather than by administrative action. It is particularly important that the prospective data subject know of, and know that he can rely on, the limitations on use and disclosure as a basis for consenting to participate in any research project. It is equally important that users have no doubt about what uses are permitted and what disclosures they may make.
Two types of statutory exceptions to the principle of functional separation deserve special consideration: (1) disclosures in response to compulsory process; and (2) disclosures for auditing purposes.
The Commission recognizes that several statutes presently protect some specific types of data and all the records of some specific agencies, such as the Bureau of the Census, against compulsory process. No Federal statute, however, protects the confidentiality of individually identifiable research data in general. Consequently, they are subject to no uniform standard of protection.
As explained above, the Commission believes that when an individual is asked to reveal information about himself in confidence-less for his own than for society's benefit-the disadvantages of making the information available for purposes other than those stated to the individual usually outweigh the advantages accruing to other users. This is particularly true when the information is available through compulsory process because the disadvantages of this type of disclosure to researcher and subject far outweigh the advantage to any law enforcement investigation. If research and statistical data remain subject to compulsory process, regulatory agencies can seek access to research data for law enforcement and compliance control purposes, thereby immeasurably increasing the risk to the individual of participating in research. Furthermore, it is confusing for the research subject if some of the information he may provide is protected by law from compulsory disclosure but other information is not. For these reasons, the Commission believes the present legal protections are unacceptably ambiguous and far too limited.
There are at least three ways to give individually identifiable research information better protection from compulsory process: (1) by constitutional interpretation; (2) by statute; and (3) by administrative action. The DHEW regulations regarding research involving human subjects [45 C.F.R. 46] and the grant and contract instruments of several Federal sponsors of research illustrate the administrative approach. Better and wider use could certainly be made of this approach, but it is limited in that administrative regulation only governs the conduct of persons bound in one way or another to the agency issuing the rules. Furthermore, administrative regulations by themselves do not take precedence over legislative or judicial demands for information, and the courts are unlikely to recognize an administrative agency's plea that they do so.
A constitutional approach also has practical limitations. The experience of newspeople who have claimed the privilege of confidentiality as a First Amendment right does not encourage the hope that an analogous researcher's privilege would find judicial support. Although in civil litigation, where countervailing Fifth Amendment or other powerful rights are not asserted, the courts may be willing to recognize a privileged status for the researcher, (3) in criminal prosecutions where a researcher privilege might infringe upon the Sixth Amendment right to cross-examine, courts are not likely to heed assertions of Constitutional privilege based on nothing stronger than a generalized concern for the integrity of the research process.
Statutory protections from compulsory process can be provided either by general legislation to be interpreted by the courts with or without criteria; or by authority to grant immunity from compulsory process according to specified criteria that is delegated by statute to one or more administrative entities. Each method has its strengths and weaknesses. The former may be simpler but it is less predictable, because it would require the courts to adjudicate on a case-by-case basis. The latter, also subject to judicial review, can provide greater uniformity, but in turn has the disadvantage of interposing an additional level of administrative review.
Current Federal law, as noted above, provides examples of several types of protections of research data from compelled disclosure. The statute regarding disclosure of Bureau of the Census records [13 U.S.C. 8, 9] prohibits the use of individual census records for any purpose other than the statistical purposes for which they are created, and further prohibits anyone other than Bureau personnel from examining them. The National Center for Health Statistics has limited statutory protection [42 U.S.C. 242m] for all the individually identifiable research information it collects. Use for any purpose other than that for which the information was collected is prohibited except as authorized by DHEW regulations, and no publication or disclosure of information in individually identifiable form is permitted except with the consent of the individual to whom it pertains.
Patient records maintained in connection with any drug or alcohol abuse program or research activity conducted, regulated, or directly or indirectly assisted by a Federal agency or department, may only be disclosed for specified purposes, namely, to qualified personnel for specific research or audit provided that patient identities are not disclosed in any resulting reports, or pursuant to a court order issued for good cause. [42 U.S.C. 4582; 21 U.S.C. 1175]
The Secretary of DHEW may also authorize researchers engaged in mental health or alcohol or drug abuse research to withhold names or identifying characteristics of data subjects, and this immunity covers them in any Federal, State or local civil, criminal, administrative, legislative, or other proceeding. [42 U.S.C. 4582] The Law Enforcement Assistance Administration (LEAA) has both statutory immunity from compulsory process and a statutory prohibition against voluntary disclosure. [42 U.S.C. 3371] The LEAA statute prohibits any Federal employee or recipient of assistance under it from using or revealing individually identifiable research or statistical information for any purpose other than the one for which the information was obtained. In addition, copies of any individually identifiable information furnished under the statute are immune from legal process, and cannot be admitted as evidence without the individual data subject's consent, or used for any purpose in any action, suit, or other judicial or administrative proceeding.
Such statutory protections demonstrate that there are mechanisms that can effectively protect the subjects of research or statistical data from the hazards of compulsory disclosure and at the same time hold researchers accountable for unauthorized use or voluntary disclosure. Immunity can be provided to the researcher, protecting him from being compelled to disclose information; or to the research relationship, creating a privilege for the researcher-subject communication; or to the data subject, protecting his interests. Since researcher immunity would interfere with researcher accountability, the Commission considers it unacceptable. Similarly, a researcher testimonial privilege is deficient in that it expresses paramount concern for the research process, rather than for the individual data subject. It is the Commission's position that the relationship between the individual and the record-keeping organization is the one that needs to be controlled. The Commission, therefore, strongly favors statutory immunity which protects the rights and interests of the individual and also includes researcher accountability for voluntary disclosure.
The Commission has concluded that the individual's privacy interests as well as his right to refuse to testify against himself demand, albeit indirectly, that research and statistical records be generally immune to disclosure compelled by judicial order. Total immunity, however, is too broad. In part to protect research subjects, and in part to protect society's interest in assuring proper conduct by the researcher, access to research records ought to be permitted (though carefully controlled) when a researcher or research institution is under investigation for possible violation of law and confidential records constitute the only available source of information necessary for the investigation. If a research activity is suspected of having unnecessarily endangered research subjects, as in the infamous Central Intelligence Agency research on LSD, for example, or if a researcher is suspected of fraud, access to confidential research records may well be the only way to establish guilt or innocence.
There are also arguments for a statutory exception to the nondisclosure rule if disclosure is essential for auditing or evaluating Federal and federally funded research and statistical activities. Management and fiscal accountability are, after all, as fundamental to the integrity of the research process as is assuring the confidentiality of information. Nevertheless, the Commission believes that such access should be permitted only if the Congress has made a public-policy determination that audit or evaluation is necessary in the public interest; that is, if audit or evaluation has been authorized by statute. Even then, however, the Commission recommends stringent restrictions on disclosure.
There must also be an exception for transferring research and statistical information in individually identifiable form to archival storage. There are differences of opinion between the Bureau of the Census and the National Archives and Records Service as to how many years should elapse before census records are transferred to the Archives where they would become available to researchers. The Commission regards this as a matter of public policy that the Congress can resolve by redefining the statutory disclosure authority of both agencies.
There is also the researcher's moral and legal obligation to report acts of interpersonal violence he either witnesses or can reasonably anticipate. The Commission also believes that serious threats to the health and safety of an individual may, in some cases, justify violation of record confidentiality. Finally, the Commission believes that one of the surest ways to protect the interests of the individual is to give him a legal remedy when his rights have been violated. The Commission therefore considers it essential for the individual to have the legal capacity to challenge researcher users or record keepers he believes are violating his interest in, or are demanding unwarranted access to, information about him, or to obtain redress after his rights have been violated. Consequently, for areas in which such minimum protections do not now exist, the Commission recommends:
That the Congress provide by statute that any record or information contained therein collected or maintained for a research or statistical purpose under Federal authority or with Federal funds may be used or disclosed in individually identifiable form without the authorization of the individual to whom such record or information pertains only for a research or statistical purpose, except:
And further, that should information be disclosed under any other conditions, an individual research subject identified in the information disclosed shall have a legal right of action against the person, institution, or agency disclosing the information, the person, institution or agency seeking disclosure and, in the case of a court order, the person who applied for such an order.
CONDITIONS FOR STATUTORILY AUTHORIZED AUDITS
The legitimacy of access to individually identifiable research or statistical information for auditing purposes recognized in Recommendation (2) leaves some important issues to be resolved. The project manager who monitors an agency research project may have to examine individual data as an integral part of his responsibility, and the agency itself has an internal management obligation to audit. Also, the General Accounting Office (GAO) conducts external audits under its statutory authority to hold agencies and their research contractors accountable. While an audit is primarily financial, the audit team may occasionally need access to raw data about individual data subjects.
It is good information practice to incorporate safeguards for individually identifiable data as early as possible in the collection and processing stages of a research project, since every subsequent stage of research further exposes the data and increases the potential for breaches of confidentiality. The same reasoning applies to the auditing process. There are various ways for the auditing process itself to incorporate safeguards without losing efficiency.
First, the necessity for access to individually identifiable data can be minimized by inserting review mechanisms into the project plan and creating an audit trail. Second, if audit requirements cannot reasonably be met without access to individually identifiable data, the audit team can adopt procedures such as on-site inspection of the data or stripping the data of identifiers before they are removed from the research site. Such matters should be negotiated with the audit team to assure that the record subjects' interests are represented.
When auditors take individually identifiable data away from the research site, other safeguards will be needed. Inadvertent disclosure is obviously a concern in such instances. In addition, auditors may be less responsive to the assurances of confidentiality given when the data were collected than the researcher who collected them. This would be especially true for data open to compulsory process. The researcher would tend to resist access demands for law enforcement or other judicial and legislative purposes on principle, but the auditor might not. Unless the auditor is prohibited by law from disclosing individually identifiable data, some prior agreement between auditor and researcher covering compulsory demands would appear to be necessary. This is the reason the Commission recommends statutory restrictions on the disclosure of individually identifiable data.
Aside from the possibility of inadvertent or compulsory disclosure to third parties, there is the possibility that the auditor will use the information as the basis for a reinterview of the data subject. The issue is particularly sensitive from the research standpoint. If recontact occurs before the research study is completed, the experience may modify the research environment and bias the results. For longitudinal studies, recontact by the auditor may make it more difficult for the researcher to obtain the further cooperation of data subjects, biasing the results in ways difficult to compensate for statistically. If information provided for research purposes may be used by auditors for individual review and perhaps recontact, the data subject should be notified of the possibility in his initial interview with the researcher.
In addressing these issues, it is helpful to keep in mind the distinction made earlier between studies that do not benefit data subjects and those that do. Studies connected with experimental social programs are an example of the latter. In the usual survey situation, respondents cooperate from a simple desire to contribute to the general fund of knowledge or perhaps out of a sense of social duty. There is nothing in such transactions to suggest that the information solicited will go farther than the stated use; indeed, usually the data subject gets assurances, not necessarily legally binding, that the information will be reported anonymously and used only for research purposes. If this is the case, an agency would be hard put to defend any use of the information as a basis for any action, particularly adverse action, with respect to an individual respondent. On the other hand, a person who receives benefits from participating in a pilot or other experimental program has, it can be argued, entered into an implied contract with the agency and assumed the responsibilities concurrent with the benefits. The information so collected can be considered to have administrative as well as research implications.
In dealing with Federal programs of the latter sort, the General Accounting Office has taken the position that its fiscal obligations to Congress require it to audit all aspects of experimental programs. According to the GAO, proper performance of its duties may oblige it to recontact data subjects and, in some circumstances, to reinterview participants as a check on the information collected by the original interviewer. The GAO has a substantial statutory base for its claim to this authority, and the scope of the examination-of-records clause which is mandatory in Federal procurement contracts has been construed broadly for auditors. On the other hand, an audit may uncover fraud or some other actionable breach of conduct on the part of the interviewer or program official. If that happens, and if there are no restrictions on the auditor's access to or use of individually identifiable data, completely innocent data subjects can be drawn into the investigation and have individually identifiable information about them made part of the record. Such disclosure can harm both the data subject and the research process without any corresponding benefit to the audit process or the investigation. Clearly, the data subject in these circumstances deserves complete insulation from disclosures of information about him. Safeguards should be required to minimize any untoward consequences arising from his unfortunate connection with someone else's wrongdoing.
The audit may also uncover some reportable condition or unlawful behavior on the part of the data subject himself. Unless the researcher is legally obliged to report such a condition or offense, the auditor ought not to be permitted to do so. Information about the data subject should not be disclosable for law enforcement or other compliance purposes, nor should audit results be used for such purposes. Auditors who so disclose individually identifiable information should be subject to legal sanctions.
If compliance or law enforcement actions may result from the auditing of an experimental program, anyone asked to participate in the program should be so advised in advance. No data subject should be persuaded to participate in or provide information for a research project under assurances of confidentiality only to discover later that the data he has supplied have been used in ways he was not told in advance to expect.
In sum, the position of the Commission is that auditors should have as little access as possible to individually identifiable information obtained for research purposes, and that when audit access is necessary, there should be safeguards to protect individually identifiable data from inadvertent disclosure. In addition, an auditor should recontact research subjects only as a last resort. Research plans should be designed to include adequate monitoring and audit trails so as to minimize the need for recontact, and alternative methods of validation should be developed. To minimize harm to the individual or to the research results, any recontact should be negotiated, in advance, with the researcher.
Moreover, individually identifiable information obtained by the auditor should not be open to administrative use or compulsory process. Disclosure of individually identifiable data by the auditor should be governed by the restrictions applicable to the researcher. There should be sanctions for unauthorized use or disclosure of the information. The data subject should also be protected from disclosure of individually identifiable information about him when an audit involves the researcher in a civil suit or criminal proceeding. Finally, the prospective data subject should be adequately informed, in advance of participation, of the possibility of recontact for audit, if any, and of any compliance or law enforcement use of the information which could reasonably be expected to result from an audit.
The Commission, therefore, recommends:
That when a Federal statute expressly authorizes disclosure in individually identifiable form of a research or statistical record for the purpose of auditing or evaluating a Federal or federally funded program, such statute should prohibit the use or disclosure of such information to make any decision or take any action affecting the individual to whom it pertains, except as authorized by that individual, or as the Congress specifically permits by statute.
PROCEDURES TO PROTECT CONFIDENTIALITY
Given the basic principle of functional separation and the recommended standards of confidentiality, there remains the question of responsibility and procedures for maintaining those standards. The basic arguments for functional separation apply here as well: the public's trust in the confidentiality of research and statistical records needs strengthening, as do the legal safeguards protecting them.
Guidelines for analyzing the risk and establishing appropriate safeguards for individually identifiable information are essentially the same whether the information is used for research or for administrative purposes. Confidentiality safeguards for research and statistical data do not differ appreciably from those for other information about individuals, except that additional safeguards are needed in organizations which do more than conduct research and statistical activities. In those organizations, intramural transfers of information need monitoring in order to maintain functional separation and to prevent internal administrative or management uses of new information about individuals generated by a research or statistical activity.
The Commission believes that the single most important procedure for maintaining the confidentiality of research and statistical data is the prompt removal and destruction of identifiers. This procedure is already practiced in many research organizations. Ideally, identifiers should be removed or destroyed as soon as the data are collected and verified.
The Commission recognizes that identifiers must be retained in some kinds of research, most notably longitudinal and panel-survey studies which refer to the same respondents from time to time, but retention should be the exception, not the rule. The decision to retain identifiers should not be left solely to the discretion of researchers; it should be a matter of public policy, or a decision of agency administrators. Furthermore, the retention of identifiers should trigger special precautions, such as maintaining face-sheet information separate from the survey instrument, or recording personal identifiers in a separate file that is cross referenced to the rest of the data.
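The last precaution mentioned above, recording personal identifiers in a separate file that is cross-referenced to the rest of the data, might be sketched as follows. This is a minimal illustration only, assuming records held as simple field-to-value mappings; the function names, the field names, and the use of a random study key are hypothetical and are not drawn from any agency's actual practice.

```python
import secrets


def split_identifiers(records, id_fields):
    """Separate identifying fields from substantive data.

    Returns (link_file, data_file). The two share only a random study
    key, so the data file alone carries no names or addresses.
    All names here are hypothetical, chosen for illustration.
    """
    link_file = {}   # study key -> identifying fields (kept under tighter control)
    data_file = []   # de-identified records, each carrying only the study key
    for record in records:
        key = secrets.token_hex(8)
        while key in link_file:          # guard against an unlikely key collision
            key = secrets.token_hex(8)
        link_file[key] = {f: record[f] for f in id_fields}
        data = {f: v for f, v in record.items() if f not in id_fields}
        data["study_key"] = key
        data_file.append(data)
    return link_file, data_file


def destroy_identifiers(link_file):
    """Discard the cross-reference once identifiers are no longer needed."""
    link_file.clear()
```

In a longitudinal study the link file would be retained, separately from the data, until the final wave of interviews; in a one-time survey it could be discarded as soon as the data are collected and verified.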
Accordingly, the Commission recommends:
That any Federal agency that collects or maintains any record or information contained therein in individually identifiable form for a research or statistical purpose should be permitted to maintain such records or information in individually identifiable form only so long as it is necessary to fulfill the research or statistical purpose for which the record or information was collected, unless retention of the ability to identify the individual to whom the record or information pertains is required by Federal statute or agency regulation.
The Commission believes that the legal requirements for confidentiality should extend to all the research and statistical activities conducted under Federal sponsorship, the only question being whether Federal agencies should require them as a condition of funding. Federal agencies contract out much of their research and statistical work, and through grants support much private and academic research on human subjects. The relationship of contractors and grantees to their funding agency varies widely, especially in the degree of Federal control. In theory, the contractor works to the agency's specifications and is required to deliver a defined product, whereas the grantee is funded to study a stated question as it sees fit and report its findings, whatever they may be. In actuality, however, these differences are more differences in form than substance, since the grantee is often required to develop and follow a detailed, exacting protocol. If the agency influences a grantee's data collection methods, the Office of Management and Budget (OMB) must approve the reporting form used by the grantee just as it does those of contractors.
From the standpoint of fair information practice, there is no compelling reason to differentiate between grantees and contractors, or between different classes of contractors if they all collect essentially the same sort of data and perform similar activities for similar research purposes. The important question is: does the Federal agency have the responsibility for the confidentiality of information disclosed to and collected by its contractors and grantees, or is the researcher solely responsible? The Commission's answer is that agencies have de facto responsibility for monitoring the performance of their contractors and grantees and that this makes them responsible for record confidentiality as well. To fix responsibility explicitly, however, the Commission recommends:
That whenever a Federal agency provides, by contract or research grant, for the performance of any activity that results in the collection or maintenance of any record or information contained therein in individually identifiable form for a research or statistical purpose, the terms of such contract or research grant should:
Federal agencies have several alternative mechanisms for implementing Recommendation (5). Contracts can specify safeguards or require agency approval of the contractor's safeguard procedures. An agency could simply require applicants to certify that they would protect the confidentiality of individually identifiable data and be liable if their performance fell below the agency's statutory standards. These alternatives obviously entail different levels of responsibility that the Commission is not prepared to assess. The Commission's concern is that Federal agencies take care to see that proper procedures are established and that grantees and contractors, in turn, are given clear responsibility for safeguarding the data under their control. The Commission is also concerned that research organizations not be overburdened with a multiplicity of different implementation requirements, and urges that agencies standardize the safeguard procedures they require. The Office of Management and Budget should take the lead in seeing that this is done.
In addition to procedural safeguards, individually identifiable data need technical and administrative safeguards. When an agency publishes research and statistical data as anonymous microdata (that is, data in the form of individual records stripped of identification), it publishes detailed information about the characteristics of individuals and must take care to avoid publishing details that can identify individuals on the basis of unique characteristics or as members of an identifiable group.
There are various techniques for avoiding this which an agency can further develop and apply. Scholars in the field and professional associations, like the American Statistical Association, are paying considerable attention to the problem, as are agency task forces. An OMB task force, for example, is currently working on methodologies for protecting the identity of individuals in statistical reports (4).
Techniques for minimizing identifiers or separating identifiers from responses include collecting the responses without names or under aliases, randomizing responses, (5) and using face-sheets to be detached by a third party. After they are collected, information and records can be protected during maintenance and retrieval by techniques such as deleting identifiers, injecting random error, (6) and microaggregation. (7) When data sets are interlinked, link-file brokerages (8) can be used; direct linkage can be reestablished under statistical safeguards such as error injection or microaggregation, or by statistical matching; or file linkages can be mutually insulated. These techniques can be particularly useful in longitudinal studies, (9) where the data must include identifiers.
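To make the idea of randomizing responses concrete, one well-known design is Warner's: each respondent privately uses a chance device and, with a probability p known to the researcher, answers the sensitive question truthfully, and otherwise gives the opposite answer. No single answer reveals the respondent's true status, yet the population proportion can still be estimated. The sketch below is offered only as an illustration of that one design, which is not necessarily the variant cited in the note above; the function names and the value of p are assumptions.

```python
import random


def randomized_answer(truth, p, rng):
    """Report the true answer with probability p, the opposite otherwise.

    The interviewer sees only the reported answer, never the outcome of
    the chance device, so an individual "yes" is not incriminating.
    """
    return truth if rng.random() < p else not truth


def estimate_proportion(answers, p):
    """Estimate the true share of "yes" from randomized answers.

    If pi is the true share, the expected share of reported "yes" is
    lam = p*pi + (1 - p)*(1 - pi), so pi = (lam - (1 - p)) / (2p - 1).
    The estimate is undefined when p == 0.5.
    """
    if p == 0.5:
        raise ValueError("p must differ from 0.5")
    lam = sum(answers) / len(answers)
    return (lam - (1 - p)) / (2 * p - 1)
```

The price of this protection is statistical: the closer p is to one half, the stronger the protection for the respondent and the larger the sample needed for a given precision.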
Suppression and contamination techniques include eliminating small cells, collapsing classifications, and injecting random error. Many statistical agencies routinely use these techniques in screening data for publication.
When the Bureau of the Census, for example, prepares tabulations and tapes for public use, it employs elaborate screening procedures, including the suppression of geographical identifiers and limiting the detail of small samples, to prevent disclosure of individual identities by way of cross-classifications.
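Two of the suppression techniques named above, eliminating small cells and collapsing classifications, can be illustrated with a short sketch. It assumes records held as field-to-value mappings and a hypothetical minimum cell size; it is not a description of the Bureau's actual screening procedures, and the function names are invented for illustration.

```python
from collections import Counter


def tabulate(records, fields):
    """Count records in each cell of a cross-classification."""
    return Counter(tuple(r[f] for f in fields) for r in records)


def suppress_small_cells(table, threshold):
    """Withhold (as None) any cell whose count falls below the threshold.

    The threshold is a policy choice; the value used by a caller is an
    assumption, not a figure taken from the report.
    """
    return {cell: (n if n >= threshold else None) for cell, n in table.items()}


def collapse(records, field, mapping):
    """Recode one field into coarser categories, leaving other fields as-is."""
    return [{**r, field: mapping.get(r[field], r[field])} for r in records]
```

A single judge in a small county, for example, would appear as a cell of one in a county-by-occupation table; the cell can be withheld, or the occupation recoded into a broader class before tabulation. Suppressing a cell is not always sufficient by itself, since a withheld count can sometimes be recovered by subtraction from published row and column totals.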
All agencies and organizations that collect individually identifiable data for research and statistical purposes should continually strive to improve their techniques for minimizing the amount of identifiable information collected, removing identifiers as soon as possible after the data have been processed, and protecting the links between personal identifiers and the data in their files. To assist them, the Commission recommends:
That the National Academy of Sciences, in conjunction with the relevant Federal agencies and scientific and professional organizations, be asked to develop and promote the use of statistical and procedural techniques to protect the anonymity of an individual who is the subject of any information or record collected or maintained for a research or statistical purpose.
CONDITIONS FOR USE OF INDIVIDUALLY IDENTIFIABLE RECORDS
The growing practice of making individually identifiable information available to the research community increases the risk of unauthorized use or inadvertent disclosure. To block this flow of information would paralyze a great many socially valuable research and statistical activities and increase the cost of the others. It would quickly increase the reporting burden on the public to intolerable proportions. The Commission's concern is neither to augment nor hinder the flow of individually identifiable information, but to establish safe limits to it.
Given the basic principle of functional separation and the recommended standards and procedural safeguards, the next issue is how to protect the individually identifiable information that was collected for other purposes when it is used for research and statistical purposes. When administrative records are made available for research and statistical uses, the principle of functional separation is as basic as in other flows of individually identifiable information and adequate safeguards and mechanisms for assuring accountability for the maintenance of confidentiality are equally essential.
Researchers and statisticians often request access to administrative records, and less often request access to previously collected or compiled research and statistical data. The two types of access requests must be considered separately because of the difference in the assumptions about confidentiality under which each is collected. The recommendations in this section are for modification in Federal agency disclosure practice with respect to these two kinds of requests, and for making contractors and grantees more accountable for data security in disclosures both to and by them.
USE OF ADMINISTRATIVE RECORDS
Researchers and statisticians use administrative records in a variety of ways. One of the Bureau of the Census' duties is to study revenue sharing and voting rights, and for this it draws information from sources such as records of automobile registrations and births and deaths. Because administrative or program records cover all the individuals in a defined population, another important use of them is for drawing statistical samples from groups such as participants in a manpower training program, military personnel, hospital patients, veterans, Medicaid recipients, retired persons, taxpayers, or students. The Bureau of the Census, for example, draws from Internal Revenue Service (IRS) records the names and addresses of all taxpayers who report farm income. It uses this list to conduct its Census of Agriculture, mailing survey forms to everyone on the list, thus creating a new sampling frame. (10) The Department of Defense draws samples from its own personnel records for surveys of military personnel characteristics and Armed Forces manpower potential. (11)
Researchers also use the records of other programs to enrich the information in their own records. Different programs record different kinds of information about clients, so that matching records of different programs about a given sample made up of individuals who participate in all or some of them gives a more complete picture of the sample. The Department of Labor, for example, may give the Social Security Administration (SSA) its records on a sample of individuals who have completed manpower training programs to help the Department find out how manpower training affects the individuals' earning capacity.
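The matching of records from two programs can be sketched in a few lines. The example assumes, purely for illustration, that both files carry a common identifier and that the identifier is dropped from the merged file once the match is made; the function and field names are hypothetical.

```python
def match_records(sample, other_program, key, fields):
    """Enrich each sample record with selected fields from another program.

    Records are matched on a shared identifier (`key`). The identifier
    is omitted from the output, and sample members with no counterpart
    in the other program are left out of the merged file.
    """
    index = {r[key]: r for r in other_program}
    merged = []
    for record in sample:
        other = index.get(record[key])
        if other is None:
            continue                      # no match found in the other program
        row = {k: v for k, v in record.items() if k != key}
        row.update({f: other[f] for f in fields})
        merged.append(row)
    return merged
```

Dropping the identifier removes the most direct means of identification, though a merged record may remain identifiable through an unusual combination of characteristics.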
A third research use of administrative records is in secondary analysis of research data. For example, the Bureau of the Census bases its studies of population migration on research records produced by its own surveys and censuses. If it decides to study commuting patterns of persons living in particular metropolitan areas, it would reanalyze its own research records to extract the necessary information for a new sample of individuals, but would need to update some particular items of information, such as residence or employer ZIP codes, from the more current records of some other agency. It might, therefore, request access to the administrative records on the individuals in its sample held by perhaps the IRS or the SSA.
REUSE OF RESEARCH AND STATISTICAL RECORDS
The arguments in favor of reusing research and statistical information and records in individually identifiable form for research and statistical purposes are analogous to those for the use of administrative records for the same purposes. Secondary analysis of data sets can not only reduce the reporting burden on the public, but can also add new knowledge about social processes. It is also cumbersome, expensive, and in many cases impossible to replicate an already existing body of data. Moreover, secondary analysis of data can be a valuable verifier of findings originally derived from them. Finally, research and statistical information in individually identifiable form can be reused to match two or more data sets to gain more information than each singly provides, to draw samples, and to recontact individuals in the original sample for a longitudinal study of physical, social, or attitudinal change over time.
Nevertheless, the Commission urges caution. Here, as elsewhere, the flow of individually identifiable information can erode the public's willingness to cooperate voluntarily in data collection, and here, more than elsewhere, it is easy for researchers to forget the promises of confidentiality made to data subjects at the time information was collected.
The Commission believes that unless it is essential to recontact the individuals who took part in an earlier study, disclosure of information in individually identifiable form should be strictly limited to cases where public need clearly overrides private rights and then only after careful policy review of each case. This constraint need not unduly hamper research and statistical activities because, for the great bulk of them, anonymous microdata are as useful as individually identifiable data.
CONDITIONS FOR USE AND DISCLOSURE
A mosaic of statutory rules governs access to and disclosure of Federal agency records at present. The Federal Reports Act, designed primarily to minimize respondent burden, permits an agency to share information collected by other agencies, subject to certain constraints. It also minimizes duplication of effort by requiring the Office of Management and Budget (OMB) to review and approve the forms used in collecting information from individuals, whether the information is for program administration purposes or for research and statistical purposes. The OMB is also authorized to designate one agency as the sole collection agent for some types of information. The OMB has seldom exercised this authority, but recently used it to designate the Bureau of the Census as the sole collector of the population statistics needed for allocating some benefits such as revenue sharing.
Release of an agency's records is also governed by its own confidentiality statutes, if any, and by general statutes, such as the Privacy Act and the Freedom of Information Act. As noted above, however, a Federal agency can have substantial freedom to set its own threshold conditions for disclosing individually identifiable data from its records. Subsection 3(b)(1) of the Privacy Act permits disclosures within an agency on a need-to-know basis without reference to the original purpose of collection, and more than one executive department has arbitrarily broadened the resulting potential for circulation of data by defining itself as a single entity. Subsection 3(b)(3) allows disclosures for "routine uses" outside even those organizational boundaries. In general, an agency may not disclose records outside the agency except " . . . for a purpose which is compatible with the purpose for which . . . [they were] originally collected." [5 U.S.C. 552a(a)(7)] However, since the Act does not define "compatible," the office, bureau, center, or institute within the agency which actually maintains the records has substantial latitude in deciding what purposes meet the test. In addition, the Freedom of Information Act may require an agency to comply with requests to disclose that it would refuse if it could.
In practice, agencies do not generally allow researchers and statisticians unrestricted access to their records. Some are tightly restrained from doing so by statute. The Bureau of the Census confidentiality statute, for example, permits only Bureau officials and employees to examine individual records. Other agencies interpret the Privacy Act compatibility test narrowly, while still others that have statutory discretion to do so often release data to their own contractors under the "routine use" provision of the Privacy Act. Only one agency appears to release individually identifiable data not only to its own grantees, but also to other researchers. Thus, the pool of researchers and statisticians who have in fact received individually identifiable information from Federal agencies is composed almost entirely of Federal agency employees, contractors, some grantees, and a relatively small number of people who have neither contracts nor grants. Even these groups do not ordinarily get full access to records. The typical disclosure is a sample list of names and addresses. An agency's disclosure problems are complicated by the fact that it can sponsor research by contract or by grant. (12) Most agencies follow different disclosure policies depending upon whether the request for information comes from a contractor or a grantee but, as noted above, that simple distinction is not always valid as far as fair information practices are concerned.
Whenever an agency discloses individually identifiable information under the Privacy Act's "routine use" provision to a researcher whose procedures it controls, the funding instrument should contain safeguards for the released information, and these should be the same whether the request is from a contractor or a grantee. In practice, agencies differ in the safeguards they require in these cases. At the completion of the research project, for example, some contracts explicitly require that all identifiers be expunged from the records retained, others require that all data be returned to the agency, and some remain silent on this point. There are, in short, numerous ambiguities in the way disclosures of individually identifiable information are now regulated. If the research community is to have access to already existing individually identifiable information without endangering the privacy of data subjects, these ambiguities will have to be cleared up and the conditions of disclosure made more explicit.
The Commission is well aware that opinions vary widely on how important given research endeavors are to society, and hence, how much disclosure is warranted in any instance. Nonetheless, there are minimum conditions that should be met before any disclosure or use of individually identifiable records for research and statistical purposes is permitted, and these conditions should be set by statute. First, the applicant must demonstrate a vital need for individually identifiable data to achieve its proposed research or statistical purpose. Second, in assuring responsibility for maintaining proper safeguards, a responsibility obviously shared by both the applicant for and the provider of information, the Commission believes that the provider of information has the prime obligation for assuring that the conditions of disclosure are met by the receiving body. The provider can meet this obligation by stipulating the conditions for releasing the information and requiring the receiver to agree in writing to honor them, subject to criminal or other sanctions. In all cases, the user should be accountable to the agency responsible for the collection of the data. In addition, if the research purposes include recontact of data subjects, protection of the expectation of confidentiality under which the information was originally collected should be made a condition of disclosure.
Therefore, the Commission recommends:
That unless prohibited by Federal statute, a Federal agency may be permitted to use or disclose in individually identifiable form for a research or statistical purpose any record or information it collects or maintains without the authorization of the individual to whom such record or information pertains only when the agency:
The above recommendation holds the disclosing agency accountable for assuring that the individually identifiable information it releases for research and statistical purposes is used responsibly. This presents no problem when an agency discloses information to a contractor or grantee supported by the agency itself. The situation is more complex when a contractor or grantee funded by one agency needs access to information maintained by another agency. To clarify the chain of accountability in these instances, disclosure should be contingent on an agreement by the agency funding the research or statistical activity to take prime responsibility for assuring that the user satisfies the conditions under which the information is released. Therefore, the Commission recommends:
That when disclosure pursuant to Recommendation (7) is made to a Federal contractor or grantee, the written agreement should be between the disclosing agency and funding agency, with the latter responsible for assuring that the terms of the agreement are met.
Recommendations (7) and (8) are designed to regulate disclosure by Federal agencies, but can be applied as well to Federal contractors and grantees when they are asked to disclose individually identifiable information. Under existing law, individually identifiable information collected by grantees is not subject to the provisions of the Privacy Act. The granting agency may require safeguards for such information, but any obligation to do so is a matter of administrative policy or regulation.
Individually identifiable information collected by contractors in performance of their work for an agency is also not necessarily subject to the Privacy Act. The Privacy Act states:
When an agency provides by a contract for the operation by or on behalf of the agency of a system of records to accomplish an agency function, the agency shall, consistent with its authority, cause the requirements of this section to be applied to such system . . . . [5 U.S.C. 552a(m)]
Some agencies interpret this clause to mean that contractors may not collect information about individuals under conditions that are less confidential than the conditions applying to records maintained by the agency itself. In a May 1976 memorandum, however, the General Counsel of DHEW interpreted it to mean that, in performing this kind of work, contractors are comparable to grantees, and that
. . . the requirements of the Privacy Act of 1974 are not applicable to HEW research and other contracts which call for the contractor merely to furnish to the HEW contracting agency statistical or other reports, even though it is necessary for the contractor to establish a system of records to perform the contract. (13)
Although contractor records compiled under these conditions are not thought to be subject to the Privacy Act, the memo advises DHEW contracting officers to incorporate into contracts, where appropriate, " . . . the provisions designed to protect the confidentiality of the records and the privacy of individual identifiers in the records."
These differences in agency interpretation of obligations under the Privacy Act, and the lack of any explicit policy concerning the protection of individually identifiable data collected under a grant, make for a less than satisfactory disclosure situation. This is especially true since, as noted before, the agencies' existing disclosure policies are complicated by differing confidentiality provisions in statutes and the several methods of procuring research. Yet, where individually identifiable information is collected and maintained as a consequence of Federal funding, there is a corresponding obligation on the part of the Federal government to maintain accountability to the individual. Individually identifiable information held by a contractor or grantee should be disclosed only when the recipient can be held accountable for any violation of the individual's right to have identifiable data about him shielded from improper use or disclosure. For this, an additional accountability mechanism is necessary and the Commission, therefore, recommends:
That any person, who under Federal contract or grant collects or maintains any record or information contained therein for a research or statistical purpose, be prohibited from disclosing such record or information in individually identifiable form for another research or statistical purpose, except pursuant to a written agreement that meets the specifications of Recommendations (7) and (8) above, and has been approved by the Federal funding agency.
PROTECTIONS TO BE INVOKED BY THE INDIVIDUAL
Once individually identifiable research and statistical data are insulated from all other types of use by adequate safeguards and standards of accountability, there remains only the question of what the individual can do for his own protection, or have done on his behalf. Specifically, the following questions about the role of the individual need to be answered.
These questions reveal important differences between information collected directly from the individual for a research or statistical purpose and information extracted from administrative records for the same purpose. It may be useful to begin by examining the individual's role as prescribed by the Privacy Act of 1974, and then to consider whether the same role for the individual will suffice when the information is for research or statistical use.
A goal of the Privacy Act is to permit an individual to monitor an agency's collection, use, and dissemination practices with respect to information about him in its possession by giving him access to the information about him in agency files, and an opportunity to challenge errors. In addition, the Privacy Act specifies that the individual has the right to learn of disclosures to others, with a few significant exceptions (i.e., internal agency uses, disclosures made pursuant to the requirements of the Freedom of Information Act, and disclosures to law enforcement agencies). The Privacy Act's main mechanisms for giving the individual some control over government use of the information it collects about him are notice and authorization. At the time information is collected from an individual he must be notified of the authority under which it is being collected and whether his response is mandatory or voluntary, and also of the purposes for collecting the information and the uses to which it will be put. He must also be told the consequences of not supplying the information. [5 U.S.C. 552a(e)(3)] The agencies must give public notice annually of their existing record systems, [5 U.S.C. 552a(e)(4)] although the OMB guidelines have modified this requirement so that after an agency has published its initial list of record systems it need only report any new ones or any changes in existing systems. An agency must advise an individual, on request, if it maintains any records on him and, on request, allow him to examine the records on him and challenge their accuracy. [5 U.S.C. 552a(d)] The record of an individual may not be used for purposes other than those for which it was collected without the individual's consent except as expressly authorized by any one of the 11 exceptions in the Privacy Act. [5 U.S.C. 552a(b)] One such exception, as noted earlier, is for a "routine use," which, being open to interpretation, diminishes the efficacy of the Act's authorization requirements.
Notice and authorization work together, but not in the same way. Generally speaking, the intent of the Privacy Act notice requirements is to assure that the individual, knowing in advance the purpose for which the information is collected and what uses may be made of it, can refuse to cooperate if his participation is voluntary, or challenge the questioner's authority if it is not. The Privacy Act assumes that any reasonable individual will expect that the information he contributes may be disclosed for auditing use, or in response to court order or legislative inquiry, or for tax and other law enforcement purposes. The assumption, however, is seldom justified when information is collected for a research or statistical purpose. It is not reasonable to expect the subject to know that the data he supplies for a research or statistical purpose may be disclosed in response to compulsory process or may enter into an administrative decision pertaining to him. Thus, unless there is functional separation of research and statistical uses of information from administrative uses, the notice requirements of the Privacy Act are clearly inadequate.
Assuming that functional separation can be established as recommended by the Commission, the question of how much control the individual should have over how information pertaining to him is used in research or statistical activities remains, as well as the question of how that control should be exercised.
INFORMATION COLLECTED FOR RESEARCH AND STATISTICAL PURPOSES
Individually identifiable data about individuals flow into research and statistical record systems, as into other record systems, from several sources. Some is obtained directly from individuals by means of questionnaires, interviews, and other methods of systematic inquiry, such as controlled experiments, sometimes with the individual's full knowledge and sometimes without it. For example, data are sometimes collected from persons not fully competent to understand the collection process, while other information is extracted from administrative or program records, or supplied by third parties.
In considering an individual's control over information about him when it is used for research and statistical purposes, the important distinction is between the information researchers and statisticians get from him directly and that which they get from him indirectly by culling it from administrative files. The distinction is important because of the difference in the individual's expectation of confidentiality. When asked to contribute information for administrative purposes, he can reasonably expect that his contribution will enter into administrative decisions about him and act accordingly, but when asked to contribute it for research or statistical purposes, he is not likely to anticipate any uses other than the ones stated by the questioner.
When an individual is asked to provide information for a research or statistical purpose, he should, in all fairness, have a reasonable idea of the consequences to him of agreeing or of refusing to answer. Minimally, this means he must be told that he can refuse if he chooses, and informed of the purpose and nature of the data collection, and the extent to which the information he supplies will be disclosed further in individually identifiable form. Accordingly, as a supplement to notice requirements already embodied in the Privacy Act (14) the Commission recommends:
That absent an explicit statutory requirement to the contrary, any Federal agency that collects or supports the collection of individually identifiable information from an individual for a research or statistical purpose be required by Federal statute to notify such individual:
Some research involves children or people of diminished mental competence; other research involves population groups, such as prisoners, whose circumstances compromise their freedom to choose whether or not to participate. There are also research experiments so designed that the validity of the findings depends on the participants' ignorance of some aspects of the research, and sometimes even of the fact that they are participating in research. To create special protection for such data subjects, the Commission recommends an institutional review process. The Commission recognizes the difficulty of creating institutional review boards where they do not now exist, and holding a Federal agency accountable for the actions of those collecting information for research or statistical activities on its behalf as well as for its own actions. The Commission's intent is not to specify how institutional review is to be established, but rather to make the point that the safeguards that enable an individual to protect himself must be applied to the individual who, for one reason or another, cannot take advantage of them on his own initiative. Accordingly, the Commission recommends:
That Congress provide by statute that when information about an individual is to be collected in individually identifiable form for a research or statistical purpose by a Federal agency or with Federal funding, an institutional review process be required to apply the principles enunciated in Recommendation (10) in order to protect the individual:
In this context, the Commission observes that although its mandate is confined to protecting the interests of research subjects with respect to information and records about them generated by research and statistical methods, its broad concern is with protecting the more general rights and welfare of human research subjects. The Department of Health, Education, and Welfare, as the Federal agency sponsoring the bulk of such research, has, since 1966, taken the lead in this area by issuing guidelines and regulations setting conditions designed to control research on human subjects. Recent action by the Congress, furthermore, portends even wider ramifications. For example, the National Research Act of 1974 [P.L. 93-348] establishes a National Commission for the Protection of Human Subjects in Biomedical and Behavioral Science Research (NCPHS) with a mandate to define the ethical principles of such research and recommend policies for assuring that the research does not violate ethical principles in practice. Among other things, the National Research Act provides for making the NCPHS recommendations applicable to all Federal agencies, and for establishing a National Advisory Council to monitor the protection of human subjects after the NCPHS completes its task.
The DHEW regulatory activities to protect human research subjects have focused on the institutional responsibility of the organization that actually conducts the research. Under current policy, no DHEW extramural research involving human subjects may be undertaken unless a committee known as an institutional ethical review board has assured DHEW that it has reviewed the proposed research design and determined whether human subjects will be placed at risk, and if so, that: the risks are outweighed by the sum of the benefits to the individual and the importance of the knowledge to be gained; the rights and welfare of the subjects will be adequately protected; legally effective informed consent will be obtained from each participant; and the conduct of the research will be reviewed at timely intervals.
The NCPHS does not expect to issue its final recommendations until the end of 1977. It is already clear, however, that institutional review committees will continue to have the prime responsibility for protecting human subjects. Consequently, where institutional ethical review boards do not already exist pursuant to DHEW regulations, there is every likelihood that they will soon be established pursuant to recommendations of the NCPHS. When this happens, the existing boards and the newly created ones will provide a suitable vehicle for carrying out Recommendation (11).
INFORMATION COLLECTED FROM ADMINISTRATIVE RECORDS
From the standpoint of protecting the individual, the two most pertinent questions about the use for a research or statistical purpose of individually identifiable information drawn from administrative or other records are: whether information ostensibly collected for administrative purposes is actually being collected for research and statistical purposes without the individual's authorization; and whether delivery of program or other benefits should be contingent on the individual's willingness to have administrative information about him also used for a research or statistical purpose.
With respect to the first question, the Commission believes that while research and statistical "piggy-backing" on administrative data collections does perhaps occur more often than necessary, the measures recommended by the Commission will protect the individual from having additional information generated by research activities used to his detriment. The first question will then be less important than it is now.
The second question presents a more difficult problem, since the answer depends on balancing the individual's right to control the collection and use of information pertaining to him against the society's need for knowledge. The preceding recommendations recognize the societal utility of information generated by research and statistical activities, and the extent to which the continuing productivity of these activities depends on access to administrative records by allowing individually identifiable data in administrative records to be disclosed for research or statistical purposes under appropriate safeguards.
Additional protections for the individual about whom information in administrative records is used for a research or statistical purpose are not necessary because the measures in the preceding recommendations will be adequate. Research and statistical activities can safely be spared the costly burden of obtaining the authorization of each individual if adequate notice is given when the information is collected for administrative records in the first place. The individual will then realize that the information he supplies for administrative purposes may also be used for research or statistical purposes, and that he may be contacted by a researcher. Accordingly, the Commission recommends:
That Congress provide by statute that when individually identifiable information is collected from an individual by a Federal agency or with Federal funding for a purpose other than a research or statistical one, the individual be informed that:
INDIVIDUAL ACCESS TO RESEARCH AND STATISTICAL RECORDS
The right given an individual by the Privacy Act to see and copy a record maintained about him and to challenge the information in the record recognizes that the individual has a role to play in decisionmaking processes that affect him. Records that are dedicated by statute solely to research or statistical use may be exempted from the general right of access and challenge because, unlike administrative records, they are not used for making decisions about individuals. If information in research or statistical records cannot be disclosed in individually identifiable form for any other purpose, the individual need have no great concern about it. Unless such records can be totally protected against the possibility that individually identifiable information in them will be disclosed for any other purpose, the individual's concern is obvious and his access right highly relevant.
Two points are important. First, it is important for the individual to retain a measure of control over individually identifiable research or statistical information pertaining to him because he needs some way of finding out who else gets the information. Second, whether an individual needs to have access to records maintained about him for research or statistical purposes depends on how well these records can be kept separate from other uses. If separation is not maintained, and the information is in fact disclosed in individually identifiable form for other than a research or statistical use without a guarantee that the disclosure will not affect the individual, fairness demands that the individual be informed of the disclosure and to whom it was made, and be given a right of access to the record. Accordingly, the Commission recommends:
That Congress provide by statute that if any record or information contained therein collected or maintained by a Federal agency or with Federal funding for a research or statistical purpose is disclosed in individually identifiable form without an assurance that such record or information will not be used to make any decision or take an action directly affecting the individual to whom it pertains (e.g., to a court or an audit agency), or without a prohibition on further use or disclosure, the individual should be notified of the disclosure and of his right of access both to his record and to any accounting of its disclosure. (15)
No single vehicle is adequate to carry out all of the Commission's recommendations in this chapter. Thus, the Commission has chosen a strategy which encompasses amendments to the Privacy Act of 1974, other legislative action, and voluntary compliance on the part of national study organizations.
The Commission feels that the principle of functional separation (Recommendation (1)) can be established by amending the Privacy Act. (16) The first set of steps necessary to apply that principle to Federal and federally assisted research, namely, establishing appropriate uses and disclosures for research and statistical records (Recommendations (2) and (3)) can best be implemented through a new Federal statute to provide a common line of minimum protection for the confidentiality of Federal or federally assisted research and statistical records.
The second set of steps necessary to apply the principle of functional separation, namely, establishing procedures for protecting the confidentiality of individually identifiable data, seeks to establish a consistent set of safeguards among Federal agencies and their contractors and grantees. Recommendations (4) and (5), which would achieve this objective, can be implemented through amendment of the Privacy Act. In addition, the Commission believes that new techniques for collecting, maintaining, and using records about individuals in ways that avoid personal identification ought to be developed and promulgated, and, therefore, recommends that the National Academy of Sciences voluntarily take the lead in doing so.
The third set of recommended steps, establishing the conditions of disclosure for individually identifiable information to be used for a research or statistical purpose, seeks to assure that a common set of conditions is met in a consistent and accountable way by Federal agencies and their contractors and grantees (Recommendations (7), (8), and (9)). These recommendations can be implemented through amendments to the Privacy Act of 1974, which currently sets minimum conditions for the use and disclosure of Federal records.
Recommendations (10) through (13) address the role of the individual in protecting himself and focus on notice and access. Recommendations (10), (11), and (12), which deal with notice, and Recommendation (13), which deals with access, can best be implemented through amendment to the Privacy Act. As pointed out in the earlier discussion of Recommendation (11), however, the Commission did not specify how the institutional review the recommendation would require should be established or what the required steps in the review process should be. The Commission urges that the National Commission for the Protection of Human Subjects in Biomedical and Behavioral Research incorporate Recommendation (11) into the mandate of the institutional review process it will recommend for all Federal agencies and also that Federal agency regulations implementing the Privacy Act incorporate the National Commission's recommendations.
The 13 recommendations in this chapter collectively provide a means of protecting personal privacy in research and statistical activities conducted or sponsored by the Federal government. The Commission's findings lead it to present for consideration to other research communities the following nine policy guidelines which it hopes will be voluntarily adopted by all those who conduct research and statistical activities. The Commission also believes that they could help to shape any State legislation in the field. The fundamental principle for the guidelines, as for the recommendations in the previous sections of this chapter, is that of functional separation: insulating the use of individually identifiable information for research and statistical purposes from all other uses. These guidelines follow the precepts in the Commission's recommendations.
Any record or information contained therein collected or maintained for a research or statistical purpose should not be used in individually identifiable form to make any decision or take any action directly affecting the individual to whom the record pertains, except within the context of the research plan or protocol, or with the specific authorization of such individual; and
That based on the foregoing principle, a special set of information practice requirements should be established for records and information contained therein collected or maintained in individually identifiable form for a research or statistical purpose.
Great care is needed to protect individually identifiable information from unauthorized or inadvertent disclosure. The Commission is persuaded not only that full technical, administrative, and physical safeguards must be established to protect confidentiality, but also that information should be rendered anonymous by being stripped of identifiers as soon after collection as possible.
Any entity that, for a research or statistical purpose, collects or maintains in individually identifiable form any record or information contained therein should be required:
Once the principle of functional separation is accepted, and adequate mechanisms for implementing it are in place, individually identifiable information can safely be disclosed for research and statistical purposes provided certain minimal conditions are met.
Except where specifically prohibited by law, an entity that collects or maintains a record or information may use or disclose in individually identifiable form either the record or the information contained therein for a research or statistical purpose without the consent of the individual to whom the record pertains, provided that the entity:
The remaining six guidelines are for the further protection of individual data subjects from unfair collection practices, and to assure individual access whenever the principle of functional separation cannot be upheld.
The Commission believes it advisable that the fair information practice principles established by the Privacy Act of 1974, and supplemented by Recommendation (10) above, be extended to include individuals who supply information for research and statistical activities that are independent of the Federal government.
Absent an explicit statutory requirement to the contrary, no individual should be required to divulge information about himself for a research or statistical purpose. To assure that there is no coercion or deception, the individual should be informed:
Individuals whose consent to participate in a research or statistical project cannot be given because of youth or disability or because the research design precludes it, and individuals whose circumstances coerce their participation need extra protection.
When information about an individual is to be collected in individually identifiable form for a research or statistical purpose, an institutional review process or responsible representative should be required to apply the principles enunciated in Guideline (4) in order to protect the individual:
When individually identifiable information collected in the first instance for some other purpose is used for research and statistical purposes, it needs special attention.
When individually identifiable information is collected for a purpose other than a research or statistical purpose the individual should be informed:
So long as all individually identifiable information used for research and statistical purposes is kept separate from use for any other purpose, the individual data subject does not need access to the record. When the information cannot be protected from use for other purposes, the individual should have a right of access.
When research or statistical records or information are collected and maintained in conformity with all the foregoing policy recommendations, an individual should have a right of access to a record or information which pertains to him if such record or information is used or disclosed in individually identifiable form for any purpose other than a research or statistical one (e.g., an inadvertent unauthorized disclosure).
Fairness demands that individuals have a way of finding out, if they wish, what disclosures of individually identifiable information about them have been made.
Any entity that collects or maintains a record or information for a research or statistical purpose should be required to keep an accurate accounting of all disclosures in individually identifiable form of such record or information contained therein such that an individual who is the subject of such record or information can find out that the disclosure has been made and to whom.
The importance to an individual of access to information used for research and statistical purposes depends on the extent to which the information can be kept separate from use for other purposes.
If any record or information contained therein collected or maintained for a research or statistical purpose is disclosed in individually identifiable form without an assurance that such record or information will not be used to make any decision or take an action directly affecting the individual to whom it pertains, or without a prohibition on further use or disclosure (e.g., to a court or an audit agency), the individual should be notified of the disclosure, and of his right of access to the record and to the accounting for its disclosure, as provided by Guidelines (7) and (8) above.
1. For example, the Federal Reports Act [44 U.S.C. 3501-3511] provides central structure for the disclosure of Federal agency records. Disclosure is also regulated by the Privacy Act [5 U.S.C. 552a], the Freedom of Information Act [5 U.S.C. 552], and by specific confidentiality statutes regarding alcohol and drug abuse treatment records [42 U.S.C. 582 and 21 U.S.C. 1175].
2. Testimony of the National Center for Health Statistics, Medical Records, Hearings before the Privacy Protection Study Commission, pp. 54-56.
3. See Richards of Rockford, Inc. v. Pacific Gas & Electric Company, 71 F.R.D. 388 (N.D. Cal. 1976); also, Branzburg v. Hayes, 408 U.S. 665 (1972), where the Supreme Court held that a Grand Jury, given its unique powers of inquiry and pledge of secrecy, could compel a reporter to disclose information that was "relevant and material to a good-faith grand jury investigation."
4. Federal Committee on Statistical Methodology.
5. "Randomizing responses" is a process whereby the respondent is given two questions of which he selects one to answer on a random basis without revealing which question he answered. The researcher can estimate proportions through statistical methods that reflect the incidence of the response to the sensitive question in the population.
6. "Random error injection" is the inoculation of error into a report on a random basis, but where the general character of the error is controlled by the researcher so that it is possible to estimate statistical parameters from a large sample even though it is not possible to tell if any given response is accurate.
7. "Microaggregation" is the process of creating many synthetic average persons and releasing data on those rather than on real individuals.
8. A "link-file system" is a system that maintains subject identifications in a file separate from the individual data file and which allows the linkage of subject identities and data about the individual in one or more files through codes that carry no individual identification. Brokerage refers to the maintenance of the link file by an unrelated third party whose sole function is to keep the identity of the record subject anonymous to the record collector and user.
9. A "longitudinal study" involves tracking a group of individuals over time to establish how the state of that group varies and the average relation between an individual's state in one point of time and his state at another point in time.
10. This exchange of information is described in detail in Appendix B to a U.S. Department of Commerce report entitled "The Use of Tax Data in the Structuring of Basic Economic Tools," November 4, 1974.
11. Submission of Department of Defense, Research and Statistics, Hearings before the Privacy Protection Study Commission, January 5, 1977.
12. It must be noted that the grants discussed here are Federal discretionary grants, not formula or block grants, which involve Federal-State issues beyond the scope of these recommendations.
13. Memorandum from General Counsel William H. Taft III to John Ottina, Assistant Secretary for Administration and Management, Department of Health, Education, and Welfare, regarding the application of the Privacy Act to DHEW contractors, May 14, 1976.
14. The Privacy Act already requires that an individual be told whether his participation is mandatory or voluntary and the purposes and nature of the data collection. [5 U.S.C. 552a(e)(3)]
15. The Privacy Act already requires an accounting of such disclosures. [5 U.S.C. 552a(c)]
16. See Note 2, Chapter 13.
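Illustrative sketch for Note 5. The "randomizing responses" technique described in Note 5 can be made concrete with a small simulation. The sketch below is not drawn from the Commission's report; it assumes one common variant, in which the second question is an innocuous one whose rate of "yes" answers in the population is known in advance. The function names, the probability of drawing the sensitive question (p_sensitive), and the rates used are all hypothetical and chosen only for illustration. Because a "yes" may be an answer to either question, no single response reveals anything certain about the respondent, yet the proportion for the sensitive question can still be estimated from the overall share of "yes" answers.

```python
import random


def simulate_answers(n, true_rate, p_sensitive, unrelated_yes_rate, seed=0):
    """Simulate n respondents under a two-question randomized response design.

    Each respondent privately draws which question to answer: the sensitive
    one with probability p_sensitive, otherwise the innocuous one. Only the
    yes/no answer is recorded, never which question was answered.
    All parameters are hypothetical values for illustration.
    """
    rng = random.Random(seed)
    answers = []
    for _ in range(n):
        if rng.random() < p_sensitive:
            # Respondent answers the sensitive question truthfully.
            answers.append(rng.random() < true_rate)
        else:
            # Respondent answers the innocuous question truthfully.
            answers.append(rng.random() < unrelated_yes_rate)
    return answers


def estimate_sensitive_rate(answers, p_sensitive, unrelated_yes_rate):
    """Estimate the population proportion for the sensitive question.

    The observed share of "yes" answers is a mixture:
        yes_rate = p_sensitive * sensitive_rate
                   + (1 - p_sensitive) * unrelated_yes_rate
    Solving for sensitive_rate gives the estimator below.
    """
    yes_rate = sum(answers) / len(answers)
    return (yes_rate - (1 - p_sensitive) * unrelated_yes_rate) / p_sensitive


if __name__ == "__main__":
    recorded = simulate_answers(
        n=20000, true_rate=0.2, p_sensitive=0.7, unrelated_yes_rate=0.5, seed=1
    )
    estimate = estimate_sensitive_rate(
        recorded, p_sensitive=0.7, unrelated_yes_rate=0.5
    )
    print(f"Estimated proportion for the sensitive question: {estimate:.3f}")
```

The estimate grows more precise as the sample grows, which is the trade the technique makes: protection for each respondent in exchange for a larger sample than direct questioning would need.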