New Publication - Brian Lavoie and Lorcan Dempsey on Characteristics of Potentially In-copyright Print Books in Library Collections

This article, in the November/December 2009 issue of D-Lib Magazine, provides empirical context for the many discussions surrounding the digitization of in-copyright print books by examining the characteristics of US-published print books in the OCLC WorldCat database. Emphasis is placed on books that are likely in copyright according to US copyright law.

The analysis in this article is based on data from the WorldCat database, which represents the aggregated collections of more than 70,000 libraries worldwide. It focuses on three areas: the WorldCat aggregate collection of US-published print books; the subset of this collection published during or after 1923 (i.e., those potentially associated with copyright and/or orphan works issues); and the combined print book collection of three academic research library participants in Google Books.

Findings indicate that the collection of US-published print books in WorldCat is quite large, encompassing about 15.5 million print books. Nearly two-thirds of these, those published after 1963, have a high likelihood of being in copyright. Less than 15 percent, those published prior to 1923, are almost certainly in the public domain. The rest, those published between 1923 and 1963, are potentially in copyright if copyright was renewed. The post-1923 materials collectively account for more than 80 percent, or about 12.6 million, of the US-published print books in WorldCat.
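The date bands described above can be expressed as a simple lookup on publication year. The sketch below is illustrative only: the function name copyright_band and the label strings are assumptions introduced here, not part of the article, and publication year alone does not settle a book's actual copyright status (the middle band in particular depends on whether copyright was renewed). The final line simply checks the "more than 80 percent" figure against the rounded counts quoted above.

```python
def copyright_band(pub_year: int) -> str:
    """Map a US publication year to the rough status band described in the text.

    Hypothetical helper for illustration; the labels paraphrase the announcement
    and are not a legal determination.
    """
    if pub_year < 1923:
        # Published prior to 1923
        return "almost certainly public domain"
    if pub_year <= 1963:
        # Published between 1923 and 1963
        return "potentially in copyright if renewed"
    # Published after 1963
    return "likely in copyright"


# Share of post-1923 books, using the rounded figures quoted in the text
# (about 12.6 million of about 15.5 million).
post_1923_share = 12.6 / 15.5

if __name__ == "__main__":
    for year in (1910, 1923, 1963, 1964):
        print(year, copyright_band(year))
    print(f"post-1923 share: {post_1923_share:.1%}")
```

A rule like this only sorts records into bands. It cannot say whether a given 1923-1963 title was renewed, which is one reason bibliographic data alone falls short for assessing copyright status.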

Analysis of the post-1923 print books in WorldCat suggests significant limitations to automated assessment of copyright status using bibliographic data. Manual intervention will almost certainly be required in many cases, underscoring the importance of finding ways to reduce costs – for example by sharing the results of copyright investigations to reduce duplicative effort.

Another important finding from the analysis is the prominence of academic institutions as both suppliers and consumers of mass digitization activities such as Google Books.

OCLC Research also published an article in 2005 aimed at illuminating issues surrounding Google's then-new plan to digitize the print book collections of five major research libraries (cited below).

More Information

New article
Lavoie, Brian, and Lorcan Dempsey. 2009. "Beyond 1923: Characteristics of Potentially In-copyright Print Books in Library Collections." D-Lib Magazine, 15,11/12 (November/December).

Earlier article
Lavoie, Brian, Lynn Silipigni Connaway, and Lorcan Dempsey. 2005. "Anatomy of Aggregate Collections: The Example of Google Print for Libraries." D-Lib Magazine, 11,9 (September).

For more information:

Brian Lavoie
Research Scientist
OCLC Research

Bob Bolander
Senior Communications Officer
OCLC Research