Connaway's DLF presentation on Data Mining in Library Collection Silos now available

Lynn Silipigni Connaway's DLF Fall Forum presentation, "Data Mining Library Collection Silos: An Opportunity for Cooperative Collection Management of Print and Electronic Books," is now available on the OCLC Research Web site. The presentation was given at Breakout Session 12, 11:00 a.m.-12:00 p.m. Wednesday, November 19, at the Old Town Sheraton in Albuquerque, New Mexico (USA). OCLC Research scientists Edward T. O'Neill, Chandra Prabha, and Brian Lavoie co-wrote the paper.

The OCLC Online Computer Library Center WorldCat database is used to identify print books (p-books) that have an electronic book (e-book) edition and the libraries that hold these materials. An analysis of the bibliographic characteristics and geographic holdings of these materials provides empirical data for library decision-making.

Libraries are installing compact shelving, moving lesser-used and older collections to remote storage locations, and, increasingly, digitizing their materials. With digital collections come new challenges, such as usage and cost comparisons of print and electronic resources, digitization and preservation processes, organization, retrieval systems, services, and collection management. By analyzing collection data across institutions and within collections, library decision-makers can base collection decisions on empirical data. An aggregated database of library holdings is required for such an analysis.

This research draws on the OCLC Online Computer Library Center WorldCat database, which contains more than 50 million records. WorldCat has not only served as an aggregator of bibliographic data for thirty years, but also identifies almost a billion holding locations for library resources. WorldCat can be used to describe collections bibliographically as well as geographically. The researchers use WorldCat to identify print books (p-books) that have an electronic book (e-book) edition. Holding patterns are analyzed by type of library, publisher, date, and subject area (using the North American Title Count) for all p-books and e-books. A comparison of the characteristics of p-books and e-books documents the transition from the paper library to the digital library. The findings from this research will not only increase our understanding of the current e-book/p-book scenario, but could also be useful in seeking outside funding for a range of library operational issues, such as preservation and digitization of materials and cooperative and individual library collection development and management decisions.

More Information

Data Mining Library Collection Silos presentation: (PPT: 8.33 MB, 27 slides)

DLF Fall Forum Conference Web site:

For more information:

Lynn Silipigni Connaway, Ph.D.
Consulting Research Scientist
OCLC Research

Bob Bolander
Communications & Programs Manager
OCLC Research