A closer examination of 19th and early 20th century books

Andy Breeding

During the ALA conference this year we saw an interesting presentation about the Booktraces project, which focuses on capturing artifactual usage details from 19th- and early 20th-century books. Andrew Stauffer from the University of Virginia demonstrated compelling examples of notes and inscriptions found in library books held by the university, examples that shed light on the significance of these books to those who used them.

Professor Stauffer has encouraged crowdsourced contributions to the project web site in hopes of preserving these details before they are lost. With space at a premium in campus libraries, and with weeding efforts targeting low-use titles for discard or transfer to storage, such an effort seems timely indeed. Among the questions this effort raises is that of scarcity: how widely held are these titles?

From the standpoint of the Booktraces project, titles that are flagged as widely held are more likely to be found on a weeding list and thus more at risk of being discarded before anyone examines them for useful marginalia. This assumes that libraries are using WorldCat holdings levels to inform their weeding efforts, something that we, of course, encourage. Our clients typically regard a title as widely held when it has at least 50 U.S. holdings, though often this threshold is set at 100 holdings or higher.

To satisfy our curiosity on this matter, we looked at data from 120 client projects run over the past two years. All but two of them were for academic libraries, ranging from community colleges up to ARL libraries, with the majority being mid-sized institutions. Within this sample population of libraries we gathered information on 658,224 unique titles published between 1800 and 1923. Here is a graph showing the distribution of these titles by how widely held they are in the U.S.

graph representing U.S. holdings in WorldCat

Nineteen percent of these titles are held by more than 50 U.S. libraries; seven percent are held by more than 100 U.S. libraries. The median value is 18 U.S. holdings.
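For readers who want to run the same kind of summary on their own title list, here is a minimal sketch in Python. It assumes you already have one U.S. holdings count per title; the function name `holdings_summary` and the sample numbers are hypothetical illustrations, not the project data or the tooling used for the figures above.

```python
from statistics import median


def holdings_summary(holdings, thresholds=(50, 100)):
    """Return the median holdings level and the share of titles
    held by MORE than each threshold number of libraries."""
    n = len(holdings)
    share_above = {
        t: sum(1 for h in holdings if h > t) / n for t in thresholds
    }
    return {"median": median(holdings), "share_above": share_above}


# Made-up holdings counts for ten hypothetical titles (not project data).
sample = [3, 7, 12, 15, 18, 22, 40, 55, 90, 140]
summary = holdings_summary(sample)
print(summary)
```

Note that the comparison here is "more than" the threshold, matching the wording of the percentages above; a library that treats "at least 50" as widely held would use `>=` instead.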

This holdings level varies by decade of publication, with earlier works being less commonly held. The following set of boxplots shows the distribution of these titles by holdings level per decade.

graph showing the distribution of U.S. holdings by decade, from 1800-1923

The top and bottom of the blue boxes represent the 3rd and 1st quartiles, respectively, while the red line represents the median holdings level. The width of each blue box is proportional to the number of titles for that decade. Note that the 1920s are a partial decade (1920-23).
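The numbers behind a boxplot like this can be sketched as follows, assuming each record is a pair of publication year and U.S. holdings count. The helper names and sample records are hypothetical, and the "inclusive" quartile method is only one of several common conventions; the original chart may have used a different one.

```python
from collections import defaultdict
from statistics import quantiles


def decade_of(year):
    """Map a publication year to the start of its decade, e.g. 1923 -> 1920."""
    return (year // 10) * 10


def quartiles_by_decade(records):
    """records: iterable of (publication_year, us_holdings) pairs.
    Returns, per decade, the title count and the 1st quartile, median,
    and 3rd quartile of holdings. Each decade needs at least two titles."""
    groups = defaultdict(list)
    for year, holdings in records:
        groups[decade_of(year)].append(holdings)

    result = {}
    for decade in sorted(groups):
        values = groups[decade]
        q1, med, q3 = quantiles(values, n=4, method="inclusive")
        result[decade] = {"count": len(values), "q1": q1, "median": med, "q3": q3}
    return result


# Made-up records for illustration only (not project data).
records = [
    (1805, 2), (1808, 4), (1809, 6),
    (1920, 40), (1921, 10), (1922, 20), (1923, 30), (1923, 50),
]
result = quartiles_by_decade(records)
for decade, stats in result.items():
    print(decade, stats)
```

The `count` value corresponds to what the box width encodes in the chart, while `q1`, `median`, and `q3` correspond to the bottom of the box, the red line, and the top of the box.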

Also notable about these 19th- and early 20th-century titles is the fact that 44% of them are digitized as public domain titles and made available in the Hathi Trust Digital Library. As a secure, accessible digital surrogate, a Hathi Trust version can signal to a library that it is safe to deaccession that title. In practice, however, our clients rarely use this criterion to identify safe withdrawal candidates, preferring instead to rely on physical copies nearby or held by consortial partners.

The chart below shows the number and breakdown of these titles by decade of publication and Hathi Trust status.

graph showing titles and Hathi Trust status by decade, from 1800-1923