Open Data in the US: Navigating the Privacy Minefield

 

Open Data and Privacy Policies: North Carolina
How can we be rigorous and thorough and at the same time protect citizen privacy?
This is not about Raleigh's open data portal, currently under construction. This is about the fractured and ad hoc privacy policies of open data initiatives in the United States. Privacy is complicated. In my conversations with policy folks at the Open Knowledge Foundation, several topics around privacy came up. While none of them really solves the privacy issue for open data efforts in the US, perhaps there can be a workaround. We as open data advocates in the US can begin to develop a set of principles. We can police our own initiatives in the hope that privacy policy will one day catch up.

Open data initiatives in the United States are not, for the most part, governed by federal law. In a few cases, federal law does have remedies for certain kinds of data (think HIPAA). Most law affecting privacy is made at the state level, which makes a coherent national model of privacy difficult.

In my own initiative I will be crafting what I hope can be at least a regional privacy model. But again, this is still fraught with complexity. Even if we completely redact personally identifiable information from our open data portal, how difficult is it for someone to identify an individual through cross-linked data sets?
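To make the cross-linking worry concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the two data sets, the field names, and the choice of ZIP code, birth year, and sex as linkable fields are assumptions for illustration, not data from any real portal. It shows how a data set with names removed can still be matched against a second list that carries names.

```python
from collections import defaultdict

# Hypothetical, made-up records. No real people or real data sets.
permits = [  # "redacted" open data set: names already removed
    {"zip": "27601", "birth_year": 1975, "sex": "F", "permit": "home business"},
    {"zip": "27601", "birth_year": 1982, "sex": "M", "permit": "fence"},
    {"zip": "27603", "birth_year": 1982, "sex": "M", "permit": "deck"},
]
named_list = [  # a second public list that still carries names
    {"name": "Resident A", "zip": "27601", "birth_year": 1975, "sex": "F"},
    {"name": "Resident B", "zip": "27601", "birth_year": 1982, "sex": "M"},
    {"name": "Resident C", "zip": "27601", "birth_year": 1982, "sex": "M"},
]

# Assumed linkable fields shared by both data sets.
LINK_FIELDS = ("zip", "birth_year", "sex")


def link_key(row):
    """Build the tuple of values used to match rows across data sets."""
    return tuple(row[field] for field in LINK_FIELDS)


def link(redacted_rows, named_rows):
    """Return (name, redacted_row) pairs where exactly one name matches."""
    names_by_key = defaultdict(list)
    for row in named_rows:
        names_by_key[link_key(row)].append(row["name"])

    matches = []
    for row in redacted_rows:
        names = names_by_key.get(link_key(row), [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches.append((names[0], row))
    return matches


reidentified = link(permits, named_list)
for name, row in reidentified:
    print(name, "->", row["permit"])
```

In this toy data only one permit record links to exactly one name; the other two either match two people or match no one. Real data sets would need a far more careful analysis, but the mechanism is the same.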
The North Carolina Law and the Open Government Data Definition
Requests for Information, in Raleigh and North Carolina, can come in any form and from any citizen on just about anything. This makes following some of the principles in the Open Government Data Definition (OGD) somewhat difficult, given that we must publish authoritative and complete data sets. In addition, our European open data colleagues often take a dim view of the inconsistent approach to privacy in American open data initiatives. I would like to work toward a European model of privacy. My reasoning is that most EU initiatives seem to be past the point of privacy as a policy and are now focusing on privacy in practice. I would like to see privacy policies for open data move past the privacy disclaimers on most US government websites and develop into a privacy template. A template would make it easier for open data practitioners here to implement initiatives consistently until a US privacy standard emerges.

Below I have included some information regarding North Carolina state law governing confidential records:
North Carolina General Statutes Chapter 132. Public Records Law
https://www.scio.nc.gov/library/pdf/LawsRelatingToUseOfStateComputerSystems/N.C.G.S.%20132-1.%20Public%20Records%20Law.pdf

§ 132-1. "Public records" defined

(a) "Public record" or "public records" shall mean all documents, papers, letters, maps, books, photographs, films, sound recordings, magnetic or other tapes, electronic data processing records, artifacts, or other documentary material, regardless of physical form or characteristics, made or received pursuant to law or ordinance in connection with the transaction of public business by any agency of North Carolina government or its subdivisions. Agency of North Carolina government or its subdivisions shall mean and include every public office, public officer or official (State or local, elected or appointed), institution, board, commission, bureau, council, department, authority or other unit of government of the State or of any county, unit, special district or other political subdivision of government.

(b) The public records and public information compiled by the agencies of North Carolina government or its subdivisions are the property of the people. Therefore, it is the policy of this State that the people may obtain copies of their public records and public information free or at minimal cost unless otherwise specifically provided by law. As used herein, "minimal cost" shall mean the actual cost of reproducing the public record or public information.


§ 132-6.2. Provisions for copies of public records; fees


(a) Persons requesting copies of public records may elect to obtain them in any and all media in which the public agency is capable of providing them. No request for copies of public records in a particular medium shall be denied on the grounds that the custodian has made or prefers to make the public records available in another medium. The public agency may assess different fees for different media as prescribed by law.


(b) Persons requesting copies of public records may request that the copies be certified or uncertified. The fees for certifying copies of public records shall be as provided by law. Except as otherwise provided by law, no public agency shall charge a fee for an uncertified copy of a public record that exceeds the actual cost to the public agency of making the copy. For purposes of this subsection, "actual cost" is limited to direct, chargeable costs related to the reproduction of a public record as determined by generally accepted accounting principles and does not include costs that would have been incurred by the public agency if a request to reproduce a public record had not been made. Notwithstanding the provisions of this subsection, if the request is such as to require extensive use of information technology resources or extensive clerical or supervisory assistance by personnel of the agency involved, or if producing the record in the medium requested results in a greater use of information technology resources than that established by the agency for reproduction of the volume of information requested, then the agency may charge, in addition to the actual cost of duplication, a special service charge, which shall be reasonable and shall be based on the actual cost incurred for such extensive use of information technology resources or the labor costs of the personnel providing the services, or for a greater use of information technology resources that is actually incurred by the agency or attributable to the agency. If anyone requesting public information from any public agency is charged a fee that the requester believes to be unfair or unreasonable, the requester may ask the State Chief Information Officer or his designee to mediate the dispute.


(c) Persons requesting copies of computer databases may be required to make or submit such requests in writing. Custodians of public records shall respond to all such requests as promptly as possible. If the request is granted, the copies shall be provided as soon as reasonably possible. If the request is denied, the denial shall be accompanied by an explanation of the basis for the denial. If asked to do so, the person denying the request shall, as promptly as possible, reduce the explanation for the denial to writing.


(d) Nothing in this section shall be construed to require a public agency to respond to requests for copies of public records outside of its usual business hours.


(e) Nothing in this section shall be construed to require a public agency to respond to a request for a copy of a public record by creating or compiling a record that does not exist. If a public agency, as a service to the requester, voluntarily elects to create or compile a record, it may negotiate a reasonable charge for the service with the requester. Nothing in this section shall be construed to require a public agency to put into electronic medium a record that is not kept in electronic medium.
My commentary on "Public records" in NC 132
Some commentary on what a record is: it is basically anything kept or curated by any department. Such a record is subject to disclosure laws, with some interesting exceptions. A record may be examined by a state official from the Department of Cultural Resources at any time with no reason given. This is to safeguard the keeping of records within state and municipal agencies.


Public records that are not machine readable really don't help an open data initiative. Physical media may be the province of open government folks, but open data is mostly concerned with re-use and readability.


Certain types of public records are exempt from any type of public disclosure. These types of records include: confidential communications by legal counsel to a public board or agency; State tax information; public enterprise billing information; and Address Confidentiality Program information.

My Point on Public records and Requests for Information in NC 132
So it would seem that any record not specifically exempted as a category by the state has to be reproduced (not created) in any manner convenient to the citizen. This release of information does have to be initiated by the citizen.

Why does this matter? I am looking for a mechanism to adhere to the practices and ideals of EU open data initiatives on citizen privacy while staying within the US law governing my own open data initiative. It is walking a fine line. The European Union (EU) and European cultural values differ from those of the US on privacy issues. In the EU, there is a resistance to and general distaste for the data-mining that happens in the US (this from a discussion with OKFN listserv members). Paradoxically, there is a rise in alarmist commentary about sensational "data-dumping" of public records such as the now famous "Gun Map". US citizens are worried about both data-mining and data-dumping.

What I want is an American model of open data that adheres to the emerging global norms and mores of what is and what is not found in an open data set.

Balancing Privacy with Public Records Law
There is no specific provision on how records are to be distributed to the people. The law requires that public records that do not fall into a prohibited category may be requested at any time by any person for any reason. No reason, in fact, has to be given. Where we may have some room to protect citizens from having their names and addresses published in an open data portal is in the trigger for the request for information. Note that a person has to request a record. The agency is under no obligation to format the request or formalize the request process. The law states the opposite. By definition, all requests originate with the citizen, and the agency can produce the record with whatever means are at the agency's disposal. Further, the agency is under no obligation to create a record that does not already exist, nor is it required to produce a record for which it is not the steward.

What we have here, then, is a workaround. Open data is not an agency unto itself but is rather a function of the agency in which it operates. Open data provides only machine readable data sets for public consumption. The department within which I operate is Information Technology. This department is only the steward of data sets concerning IT operations and IT finances. We cannot execute data requests on behalf of other departments.

Open data initiatives in North Carolina can therefore:

Redact any personally identifiable information from each individual data set.
Rigorously test whether a single individual can be identified by analyzing cross-linked data sets.
Follow the Open Government Data Definition (OGD), including its requirement that "Data Must be Complete", subject to valid privacy and/or security concerns.
Follow the cultural values implied in OKFN publications such as "The Open Data Handbook".
Follow the guidelines of the Open Rights Group on open data privacy.
Provide a link to our "Request for Information" page on the city portal.
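
The first two items in the list above could be sketched as a pre-publication check. This is only an illustration under stated assumptions: the field names, the sample rows, the set of fields treated as personally identifiable, and the minimum group size of 2 are all hypothetical and would need to be set by policy. The group-size test is a simple k-anonymity style count, which is one possible approach and not a method prescribed by NC law or the OGD.

```python
from collections import Counter

# Assumptions for illustration only; a real policy would define these.
PII_FIELDS = {"name", "address"}           # fields treated as direct identifiers
LINKABLE_FIELDS = ("zip", "birth_year")    # fields that could be cross-linked
MIN_GROUP_SIZE = 2                         # smallest acceptable group of look-alike rows


def redact(rows):
    """Return copies of the rows with the PII fields dropped."""
    return [
        {field: value for field, value in row.items() if field not in PII_FIELDS}
        for row in rows
    ]


def risky_groups(rows, min_size=MIN_GROUP_SIZE):
    """Return linkable-field combinations shared by fewer than min_size rows."""
    counts = Counter(
        tuple(row[field] for field in LINKABLE_FIELDS) for row in rows
    )
    return {combo: n for combo, n in counts.items() if n < min_size}


# Hypothetical, made-up source rows.
raw = [
    {"name": "Resident A", "address": "1 Example St", "zip": "27601",
     "birth_year": 1975, "request": "pothole"},
    {"name": "Resident B", "address": "2 Example St", "zip": "27601",
     "birth_year": 1975, "request": "streetlight"},
    {"name": "Resident C", "address": "3 Example St", "zip": "27603",
     "birth_year": 1982, "request": "pothole"},
]

published = redact(raw)
flagged = risky_groups(published)
print(flagged)
```

Any combination that comes back flagged describes a row that stands alone on its linkable fields, so it is a candidate for further generalization or suppression before the data set goes on the portal.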
