December 3, 2009 | By Fred von Lohmann

Google Books Settlement 2.0: Evaluating Censorship

This is the fifth in a series of posts about the proposed Google Book Search settlement.

As we've explained in earlier posts, when it comes to evaluating the proposed Google Books settlement, the principal potential benefit to the public (increased access to books online) must be weighed against the potential drawbacks (impediments to competition, inadequate protection for privacy). Another potential downside for the public in the proposed settlement is the risk of censorship.

To understand the importance of this risk, keep two things in mind. First, while bookstores are entitled to pick and choose their inventory, Google Books hopes to be much more than a simple bookstore. In the words of Google's CEO Eric Schmidt: "Imagine one giant electronic card catalog that makes all the world's books discoverable with just a few keystrokes by anyone, anywhere, anytime." In other words, Google Books will have many characteristics that we associate more with the research libraries from which its books are drawn than with traditional bookstores. Second, as Prof. Geoffrey Nunberg reminds us: "This is almost certainly the Last Library, after all. There's no Moore's Law for capture, and nobody is ever going to scan most of these books again."

If Google's scans under the proposed settlement are likely to be the only chance millions of books will have for a digital life, then the potential for censorship is something to be taken very seriously indeed. If the books can't be found by researchers, it will be as though they were cast down the Memory Hole.

Censorship by Rightsholders

The biggest censorship risk created by the proposed settlement is from copyright owners. The proposed settlement gives rightsholders (until April 2011) the power to "Remove" their books from the Google Books corpus altogether. Once a book is removed, not only won't you be able to read it online, you won't even be able to find it using full-text search. In short, these books would simply cease to exist as far as users of Google Books are concerned, despite the fact that courts have ruled that indexing copyrighted works is a perfectly legal fair use. Moreover, even the libraries who contributed the book for scanning wouldn't have a digital "backup" in their collections, as these removed books would also vanish from the digital copies that Google gives back to the research libraries (the "Library Digital Copies" and the "Research Corpus," in the lingo of the settlement agreement).

Why would a rightsholder want to self-censor? First, remember that the author of a book is often not the rightsholder. As a result, the copyright in a book can be purchased and then used to suppress further publication (a trick Howard Hughes tried). Moreover, sometimes the author or author's heir (or corporate successor) wants to suppress a work (Prof. R. Anthony Reese describes a number of historical examples of post-publication suppression efforts by authors and rightsholders in this article).

In the world of research libraries, of course, this kind of censorship is impossible—no research library would pull cards from the catalog and destroy copies of published works at the behest of those who own the copyright in those books. Yet this is exactly what the proposed settlement would permit for the "Last Library." And most galling is that the settlement does not even require that a complete list of these "Removed" books ever be made publicly available (in Google's web search, in contrast, Google includes entries for results that would have appeared, but for DMCA takedown demands, and makes those demands publicly available through Chilling Effects).

At a minimum, books that are "removed" should remain in the database for full-text search, and Google should remain able to offer a "Library Link" (i.e., a link that directs a researcher to a library where the book can be found).

Even more troubling is the possibility of selective alterations of the texts of the books themselves. In Section 3.10(c)(i), the settlement forbids Google "except as expressly authorized by the Registered Rightsholder" from altering the text of scanned books when displayed to users. That's certainly a good thing, as far as it goes: we shouldn't want Google to be able to go in and selectively edit books. But Google is allowed to selectively edit if "authorized" by the copyright owner. Why is this permitted? And if the rightsholder "authorizes" Google to make changes, can Google refuse to do so? Will the fact of alteration be publicly visible to the reader? The answers are not clear. The better rule, though, is a prohibition on anyone making editorial alterations in the text of scanned books (again, no library would allow a copyright owner to selectively blackline books in the stacks). Any other option creates the chilling prospect of "revising history" as imagined in Orwell's 1984.

Censorship by Google

The proposed settlement also gives Google a troubling degree of discretion when it comes to choosing which books will be publicly accessible. For example, Section 3.7(e) makes it clear that Google can exclude any scanned book it likes from public access "for editorial or non-editorial reasons." If it excludes a book for "editorial reasons," it must notify the Registry (but not the public), and the Registry may look for an alternative partner ("Third-Party Required Library Service Provider") to host the book. There is nothing that requires the Registry to do so, nor any guarantee that such a partner will step forward.

In addition, in order to meet its obligations under Section 7.2(e) of the proposed settlement, Google need only make 85% of the books it scans from its library partners publicly accessible through full-text search, consumer purchase, or the institutional subscription database. Assuming that Google has already scanned approximately 8 million books that are in-copyright, that means Google can make roughly 1.2 million of these books (the remaining 15%) disappear from its publicly accessible services for any reason and still meet its obligations under the settlement. And, again, nothing in the settlement requires Google to make the list of omitted books available to the public.
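
For readers who want to check the arithmetic, here is a minimal sketch of the calculation behind that figure. The 8,000,000 count is the rough estimate assumed above, not a number taken from the settlement, and the function name is purely illustrative.

```python
# Illustrative arithmetic only. The book count is the post's rough estimate,
# and the 85% threshold is the Section 7.2(e) requirement as described above.
SCANNED_IN_COPYRIGHT = 8_000_000
REQUIRED_PERCENT = 85


def max_omittable(total_books: int, required_percent: int) -> int:
    """Return how many books could be withheld while the required share stays accessible."""
    # Round the required count up so the threshold is never undershot.
    required = (total_books * required_percent + 99) // 100
    return total_books - required


print(max_omittable(SCANNED_IN_COPYRIGHT, REQUIRED_PERCENT))  # prints 1200000
```

Because the count scales linearly, a larger corpus means a proportionally larger number of books that could be left out without breaching the threshold.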

Censorship by Government

Finally, it's worth noting that governments will doubtless exploit the leeway that the settlement gives to both rightsholders and Google to pull books off the digital shelves of Google Books. It's all too easy to imagine foreign governments pressuring their citizens to "remove" books from public access on Google. It's also likely that foreign governments will pressure Google to omit books from Google Books. If that comes to pass, neither Google nor the rightsholders will be able to say that they are legally constrained by the settlement from complying short of legal process. Had the settlement agreement been written to forbid this kind of censorship, both rightsholders and Google could have responded to censorship demands by saying "come back with a court order."

Remember, too, that Google may, under the settlement, sell off the entire Google Books project. So even if you believe that Google would never cave to foreign governments or engage in selective censorship, keep in mind that 10 years from now, Google Books might be owned by an entirely different corporate master.

