On July 1, 2016, a new California law took effect that requires all local government agencies across the state to publish an online inventory of "enterprise systems." In essence, the inventories are database catalogs, disclosing all the systems an agency uses as a primary source of records or to collect information on the public.
EFF, in partnership with the Data Foundation and the Sunlight Foundation, launched a crowd-sourcing effort to capture as many of these database catalogs as possible. Assisted by more than 40 volunteers on one Saturday afternoon in August, the coalition amassed more than 430 catalogs. In the days following our report, a dozen more agencies contacted us to provide links to their own database catalogs.
These catalogs are useful to the public in a number of ways. For open data advocates, the new law, S.B. 272, represents an important step toward releasing government datasets, since these catalogs also serve as a sort of menu of records that may be requested under the California Public Records Act, depending on the sensitivity of the data. From a privacy perspective, these catalogs also reveal the types of information that local governments are collecting on their systems, and potentially the surveillance equipment and software they use.
Under the new law, each enterprise system must be listed with the following pieces of information:
- Current system vendor.
- Current system product.
- A brief statement of the system’s purpose.
- A general description of categories or types of data.
- The department that serves as the system’s primary custodian.
- How frequently system data is collected.
- How frequently system data is updated.
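For researchers compiling these catalogs, the seven required items map naturally onto a single record per system. The following is a minimal sketch in Python, not an official schema: the class name, the field names, the `missing_fields` helper, and the sample values are all assumptions made for illustration, and S.B. 272 does not prescribe any particular format.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EnterpriseSystemEntry:
    """One catalog row. Field names are illustrative, not statutory."""

    vendor: str                # current system vendor
    product: str               # current system product
    purpose: str               # brief statement of the system's purpose
    data_categories: str       # general description of categories or types of data
    custodian_department: str  # department serving as primary custodian
    collection_frequency: str  # how frequently system data is collected
    update_frequency: str      # how frequently system data is updated


def missing_fields(entry: EnterpriseSystemEntry) -> list[str]:
    """Return the names of any fields left blank in a catalog entry."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


# A made-up entry for demonstration only; it does not describe any real agency.
example = EnterpriseSystemEntry(
    vendor="Example Vendor Inc.",
    product="Example Records Manager",
    purpose="Tracks permit applications.",
    data_categories="Applicant names, addresses, permit status",
    custodian_department="Planning",
    collection_frequency="Daily",
    update_frequency="",
)

print(missing_fields(example))
```

A check like this flags entries that omit one of the required items, but it cannot judge whether a description is detailed enough to be useful.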
What We Learned About Compliance
While more than 430 agencies had posted database catalogs as required by law, an analysis of the documents found several issues that legislators or other policymakers should address.
- Enterprise systems maintained by sheriffs, district attorneys, assessor-recorders, and other independently elected county-level officials were inconsistently included in county database catalogs.
- There was wide disparity between agencies, and even between departments within the same agency, in how much detail they provided in the statement of the system's purpose and the description of the types of data stored. Some agencies provided a full paragraph, while others used only a few words.
- Agencies used different keywords and titles for the documents, sometimes making it difficult for members of the public to locate the catalogs on agency websites.
- Agencies provided the documents in many different formats, including .pdf, .doc, .xls, and tables embedded in HTML pages. This makes it more difficult for open data researchers to compare and compile catalogs.
- Agencies seemed to adopt divergent criteria for which systems should be included in the catalog. Some agencies disclosed only a handful, while others disclosed hundreds.
EFF Investigative Researcher Dave Maass testified at a California Senate hearing on Oct. 7, 2016, on the results of the California Database Hunt. Watch here.
The full list of database catalog links is below. For a more robust spreadsheet, please see the links below.