On Ireland, library data, and humanities research

Brian Lavoie


St. Patrick’s Day is nearly upon us, and our thoughts turn to Ireland and the Irish …

… and to the new OCLC Research report, An Exploration of the Irish Presence in the Published Record, in which we use library data to identify and explore materials by Irish authors, about Ireland, and/or published in Ireland. In this report, we map out the features of the Irish landscape in WorldCat, including the most popular Irish author, as measured by library holdings (Jonathan Swift); the most popular work by an Irish author (Gulliver’s Travels); and the most translated Irish author (Oscar Wilde). Did you know that Northern Ireland-born Eve Bunting is the most popular Irish author in 29 US states? Or that toddler favorite Guess How Much I Love You is the 13th most popular work by an Irish author (Sam McBratney)?

In addition to the light it sheds on the Irish presence in the published record, our report also underscores the crucial role library data plays in cultivating a deep understanding of the publications collected, stewarded, and described in library collections. The traditional library mission of collecting the new while preserving the old means that library collections, taken collectively, provide a good approximation of the published record as it unfolds over time.

Library data enables a distant reading of the published record

Digital humanities scholars talk about distant reading, the analysis of huge aggregations of digitized text, as a complement to the traditional practice of close reading, that is, reading individual books. Analysis of huge aggregations of library data (metadata about the publications in library collections) is another form of distant reading, allowing us to step back and explore vast collections of material, ranging from Irish-related publications to world literature to the entire published record.

Library data is especially useful for humanities research because libraries have deep historical collections that reveal publishing patterns over time, which in turn provide insight into evolving interest in particular authors or works.

In our study, we found interesting variations in publishing patterns for a number of Irish authors. For example, the works of LT Meade (pen name of Elizabeth Thomasina Meade Smith), a prolific author in the late 19th and early 20th centuries and, according to one commentator, the JK Rowling of her day, have since ebbed considerably in popularity. The Irish Gothic novelist Sheridan Le Fanu was active in the mid-19th century; interest in his work later declined, but he was “rediscovered” in the late 20th century. Bram Stoker, author of the classic horror tale Dracula, saw his popularity spike in the latter part of the 20th century, following (probably not coincidentally) the release of several big-budget films based on his work.
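A pattern like the ones described above can be approximated by counting an author's publications per decade. The following is only a minimal sketch under stated assumptions: the records, author labels, and the function name `publications_per_decade` are hypothetical placeholders, not WorldCat data or the method used in the report.

```python
from collections import Counter

# Hypothetical (author, publication year) pairs standing in for
# bibliographic records; these are made-up placeholders, not real data.
records = [
    ("Author A", 1885),
    ("Author A", 1889),
    ("Author A", 1893),
    ("Author A", 1978),
    ("Author B", 1992),
]

def publications_per_decade(records, author):
    """Count one author's publications, grouped by decade of publication."""
    decades = Counter(
        (year // 10) * 10          # e.g. 1885 -> 1880
        for name, year in records
        if name == author
    )
    # Return decades in chronological order so rises and falls are easy to read.
    return dict(sorted(decades.items()))

print(publications_per_decade(records, "Author A"))
```

Plotted over a long enough span, counts like these would show whether interest in an author fades, holds steady, or revives.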

Library data shows not just the what, but the where

One of the distinctive aspects of library data is the combination of bibliographic data (what libraries have collected) and holdings data (who has collected it). This gives us insight not only into the nature of the publications we are studying (say, the 1.6 million distinct publications constituting the Irish presence in the published record) but also into the patterns by which these publications have diffused around the world.

Based on WorldCat holdings data, we know, for example, that Germany is the non-English-speaking country with the largest concentration of Irish-related materials. And we can look within countries as well: that is how we know, as mentioned above, that Eve Bunting is the most popular Irish author in 29 US states, while Oliver Goldsmith is the most popular Irish author in only one state (Delaware).
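To illustrate how bibliographic and holdings data combine to answer a question like "who is the most popular author in each region?", here is a minimal sketch. Everything in it is an assumption for illustration: the publication ids, author and region labels, and the function name `most_held_author_by_region` are hypothetical, and popularity is simply taken to mean the number of holdings.

```python
from collections import Counter, defaultdict

# Hypothetical bibliographic data: publication id -> author (what was collected).
bib = {"pub1": "Author A", "pub2": "Author A", "pub3": "Author B"}

# Hypothetical holdings data: (publication id, region of the holding library),
# one entry per library that holds the publication (who collected it).
holdings = [
    ("pub1", "Region X"),
    ("pub2", "Region X"),
    ("pub3", "Region X"),
    ("pub3", "Region Y"),
    ("pub3", "Region Y"),
    ("pub1", "Region Y"),
]

def most_held_author_by_region(bib, holdings):
    """Return, for each region, the author with the most holdings there."""
    counts = defaultdict(Counter)
    for pub_id, region in holdings:
        author = bib.get(pub_id)
        if author is None:
            continue  # skip holdings with no matching bibliographic record
        counts[region][author] += 1
    return {region: c.most_common(1)[0][0] for region, c in counts.items()}

print(most_held_author_by_region(bib, holdings))
```

The join on publication id is the key step: neither data set alone can say where a given author's work is concentrated.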

In short, library data can be used to show where in the world interest is concentrated in various aspects of the published record, including particular works, authors, or subjects.

Library data can fuel humanities research

Library data is remarkably fertile ground for humanities research. We hope our report on Irish-related publications gives some indication of the kinds of analysis that can be done with it. Key to this is data at scale: the more data that is available about library collections, the richer the potential for exploring the contours and features of the published record. With more than 400 million bibliographic records and more than 2.6 billion holdings, WorldCat is the closest approximation of the published record available in a single data source.

Our report on the Irish presence in the published record follows earlier reports focusing on Scotland and New Zealand. Check out Lorcan Dempsey’s article in the Irish Times discussing some of our findings from the Irish report. You may also be interested in recent OCLC Next posts marking Jonathan Swift’s 350th birthday, and Polish contributions in the published record (inspired by the 2017 IFLA gathering in Wrocław, Poland).