There is mounting concern in some quarters that the Google Book Search settlement (see previous posts here, here, and here) could have anticompetitive effects. Everyone (including Google) seems to agree that, all else being equal, we shouldn't want a world where Google is the only entity that is scanning and providing online access to books, particularly the majority of out-of-print books whose owners can't be found (i.e., "orphan works").

So what would be necessary to create a marketplace with an opportunity for real competition? Obviously, entities other than Google will have to be able to get the same kind of blanket copyright license on comparable terms. Unfortunately, the proposed settlement makes Google the only company that can get a blanket license that covers orphan works — that issue has received considerable attention.

But those who are worried about market entry and long-term competition in this arena should also be thinking about another thing competitors need: access to the scans themselves.

The raw scans themselves should not be subject to copyright protection. But if Google hoards the scans, preventing bulk copying (with either legal or technical measures), then competitors will be forced to spend millions to re-scan the very same books in order to compete with Google. This is not only a barrier to entry, but also an enormous long-term social waste. Do we really want a world where every book needs to be re-scanned, over and over, by anyone who wants to enter this market?

Of course, Google (or anyone else) that undertakes the expensive task of scanning books should be entitled to some opportunity to recoup its costs. But it's hard to see why that should translate into an eternal barrier to entry for others. It makes no sense to require the 5th or 10th or 99th newcomer to spend millions to scan books that have already been scanned multiple times.

One good compromise might be to require that anyone who takes a blanket license (whether under the Google Book Search settlement, or under any legislation that might expand the settlement to others) must deposit a copy of the raw scans they create with the Library of Congress or with the entity that administers the blanket license (e.g., the Book Rights Registry). After a period of years (say 14, the term of the Founders' Copyright), those scans should be made available at no cost to any others who take the relevant copyright licenses.

This would not only encourage market entry and competition in the online digital books arena, but would also foster innovation in the field. There's nothing that encourages digital innovation quite like access to an enormous dataset. After all, before Larry Page and Sergey Brin founded Google, they were graduate students at Stanford. They were able to build a new search engine by downloading their own copy of the web, messing around with it, and figuring out a better algorithm for querying it. New start-ups working with digital books should have the same kind of opportunity.

It makes no economic sense to force every future pair of graduate students who want to experiment with the book dataset to spend hundreds of millions of dollars on scanning before they can launch their new startup. On the other hand, Google deserves some fair reward for navigating the obstacles and getting the books scanned. A compromise like a 14-year escrow rule might be just the way to achieve that.