Mandated by the legislature as a transparency measure in the highly secret process of electronic surveillance, the annual California Electronic Interceptions Report is a wellspring of information for criminal justice research. But this year, the California Department of Justice (CADOJ) says that, from here on out, these reports—and potentially all of its criminal justice data—will only be issued as locked PDFs, significantly limiting the public’s ability to analyze the information in alternative formats.
California Attorney General Kamala Harris' new policy is a slap in the face to transparency and runs counter to the nationwide trend toward embracing open data.
The 2014 California Electronic Interceptions Report, released last month, clocks in at 168 pages, with data on electronic surveillance from around the state presented in a series of complex tables, some spanning more than 30 pages. For each wiretap, the document outlines how many people were affected, how many communications were intercepted, the costs of the surveillance, the number of arrests, and the amount of property and drugs seized as a result of the investigation.
Among the highlights:
- California law enforcement agencies filed 971 wiretap applications in 2014, an increase of more than 44 percent compared to 2013.
- Wiretap orders led to approximately 480 arrests, the largest portion of which were drug related. Only 41 people were convicted in 2014 as a result of that surveillance.
- Riverside County remains the leader in wiretaps in the state, with 624 orders filed in 2014. That’s far more than every other reporting county combined. That’s also more than four times the number of wiretaps applied for by Los Angeles County, the state’s most populous county.
- Wiretaps in California in 2014 cost a total of $31 million, of which $28 million was spent on personnel and $3.1 million was spent on equipment, supplies, and installation fees. This represents a 17 percent increase over 2013.
- 35 counties, including San Francisco, San Mateo, and Santa Cruz, reported filing no wiretap applications at all.
This information can be spotted with the naked eye, but much more information would be available if researchers could analyze the data in a machine-readable format.
CADOJ offers little explanation regarding the massive expansion of wiretaps in the state, providing only a single page of cheerleading for all the drug trafficking seizures and arrests reported by law enforcement. In this introduction to the report, CADOJ staff recommends that the sprawling tables “should be read in conjunction with one another to evaluate the impact intercepts have on public safety.” However, the department's decision to publish the document as a locked PDF impedes researchers’ ability to conduct exactly this type of impact analysis.
Last year, when EFF filed a California Public Records Act (CPRA) request for the raw electronic interceptions data, CADOJ anticipated it would be extremely time-consuming to export. Instead, EFF and CADOJ agreed on an expedient compromise: CADOJ would provide EFF with the Microsoft Word version of the reports, from which it would be much easier to extract the data.
This year, we filed a CPRA request with the CADOJ requesting the data either in a spreadsheet format or on the same terms as before. No deal, they said:
… our Office has changed its security protocol regarding reports and other documents that are made available electronically to members of the public on our public web site. Now, all such reports and documents appearing on our public website are only made available to members of the public in a locked PDF format. We have made this change in order to better protect the security and integrity of the data in our public records.
This new policy position will have significant ramifications for public access to criminal justice data across the board. The position also sets a precedent for local law enforcement around the state to make it more difficult for the public to access data.
It is also wrong as a matter of law. In California, state agencies are required to produce records in “any electronic format in which it holds the information.” But the CADOJ is citing a section of the law that says agencies don’t have to hand over records in an electronic format that would “jeopardize or compromise the security or integrity of the original record.”
We formally asked CADOJ to explain how, exactly, providing a Word document, spreadsheet, or other data file jeopardizes the security or integrity of the data any more than publishing a PDF. After all, a PDF can be as easily doctored as any other file.
A month later, CADOJ has yet to respond.
Right now, it would take significant expertise to scrape all the data in the electronic intercept reports out of a PDF while maintaining the accuracy of the information. When we asked Steven Rich, database editor for investigations at the Washington Post, for his evaluation, he wrote back:
It's possible to get the data out of the PDF but it's an amazing amount of work to get it in a usable form. This is an insanely difficult format, given that the file, based on the metadata, came out of Word. The only format worse than a PDF in this case is a scanned PDF.
If the California Attorney General were to release the data openly, it would provide the public with a variety of ways to view how wiretaps are conducted in California. For example, the public could learn:
- in aggregate, the number of people whose communications were intercepted across the state;
- in aggregate, the number of communications that were intercepted across the state;
- the total percentage of communications that were actually incriminating, versus communications that were irrelevant;
- the number of wiretaps in which the agency did not provide any information required by law; and
- trends in how wiretap use compares year over year, county by county.
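As a rough illustration of how little work these questions would take with machine-readable data, here is a minimal Python sketch. It assumes the report were released as a CSV with one row per wiretap order. The column names, the `summarize` function, and the sample rows are all hypothetical, invented for this example and not drawn from the report or from any CADOJ file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: invented rows and column names, NOT figures from the report.
SAMPLE_CSV = """county,year,persons_intercepted,communications_intercepted,incriminating_communications
Alpha,2013,10,1000,200
Alpha,2014,20,3000,300
Beta,2014,5,500,
"""

NUMERIC_FIELDS = (
    "persons_intercepted",
    "communications_intercepted",
    "incriminating_communications",
)


def summarize(rows):
    """Aggregate per-wiretap rows into statewide totals and per-county counts."""
    persons = communications = incriminating = incomplete = 0
    orders_by_county_year = Counter()

    for row in rows:
        # Count every order, complete or not, for year-over-year comparisons.
        orders_by_county_year[(row["county"], int(row["year"]))] += 1

        values = [(row.get(name) or "").strip() for name in NUMERIC_FIELDS]
        if not all(values):
            # Track orders where a required figure was left blank,
            # and keep them out of the totals.
            incomplete += 1
            continue

        p, c, i = (int(v) for v in values)
        persons += p
        communications += c
        incriminating += i

    pct = 100.0 * incriminating / communications if communications else None
    return {
        "persons": persons,
        "communications": communications,
        "percent_incriminating": pct,
        "incomplete_orders": incomplete,
        "orders_by_county_year": dict(orders_by_county_year),
    }


if __name__ == "__main__":
    summary = summarize(csv.DictReader(io.StringIO(SAMPLE_CSV)))
    for key, value in summary.items():
        print(key, value)
```

With a real spreadsheet, the same function could be pointed at the file by passing `csv.DictReader(open(path, newline=""))` instead of the in-memory sample. Extracting the same rows from a locked PDF would first require reconstructing every table by hand or with error-prone scraping tools.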
Open data would also allow for outside researchers and organizers to create interactive systems for searching and analyzing the data, which could uncover many more interesting trends and anomalies and create new opportunities for public oversight of the criminal justice system.
The California Attorney General’s office ought to rethink its policies immediately. The state legislature is currently considering new data collection powers for CADOJ regarding issues such as racial profiling and police use of force—most of which declare from the get-go that these records should be public.
Rather than worry about the integrity of the data, CADOJ should be worrying about its own integrity when it comes to transparency in California’s criminal justice system.