Data Science & Metadata Research
To be discoverable by today’s online users, traditional library data must be transformed. OCLC Research analyzes bibliographic data to derive new meaning, insights, and services for use by libraries and information seekers. This work includes special projects, data science research, engagement with metadata communities, publications and presentations, and the creation of illustrative experimental applications.
The Ukrainian Kyrylytsia, Restored: An Automation Project for Adding the Cyrillic Fields to Ukrainian Records in OCLC WorldCat
18 October 2021
Jenny Toves, Roman Tashlitskyy, Lana Soglasnova
A report on the work to add Cyrillic text to 30,000 Ukrainian records in WorldCat. This is part of a continuing effort that has added Cyrillic text to 1.1M Russian records and 25,000 Bulgarian records in WorldCat, making the records more accessible to native speakers. The report discusses the obstacles involved when transliterating Latin text back to Cyrillic.
Transforming Metadata into Linked Data to Improve Digital Collection Discoverability: A CONTENTdm Pilot Project
21 January 2021
Greta Bahnemann, Michael Carroll, Paul Clough, Mario Einaudi, Chatham Ewing, Jeff Mixter, Jason Roy, Holly Tomren, Bruce Washburn, Elliot Williams
This report shares the CONTENTdm Linked Data Pilot project findings. In this pilot project, OCLC and five partner institutions investigated methods for—and the feasibility of—transforming metadata into linked data to improve the discoverability and management of digitized cultural materials.
21 July 2020
OCLC Research Archives and Special Collections Linked Data Review Group
This publication shares the findings from the Archives and Special Collections Linked Data Review Group, which explored key areas of concern and opportunities for archives and special collections in transitioning to a linked data environment.
9 July 2020
Andrew K. Pace
OCLC has been researching the use of linked data within libraries for more than a decade. It is sometimes difficult to know exactly where the value of linked data lies and what benefits we can derive from it. It is wise, therefore, to consider its usefulness from the point of view of library staff. What does "linked data productivity" mean? What would cataloging linked data change for library staff and end users? This article responds to these questions and provides some perspective on the linked data landscape for libraries.
Exploring Models for Shared Identity Management at a Global Scale: The Work of the PCC Task Group on Identity Management in NACO
9 December 2019
Erin Stalberg, John Riemer, Andrew MacEwan, Jennifer A. Liss, Violeta Ilik, Stephen Hearn, Jean Godby, Paul Frank, Michelle Durocher, Amber Billey
This paper discusses the efforts of the PCC Task Group on Identity Management in NACO to explore and advance identity management activities.
8 December 2019
Responsible Operations is intended to help chart library community engagement with data science, machine learning, and artificial intelligence (AI). It was developed in partnership with an advisory group and a landscape group composed of more than 70 librarians and professionals from universities, libraries, museums, archives, and other organizations.
5 August 2019
Jean Godby, Karen Smith-Yoshimura, Bruce Washburn, Kalan Knudson Davis, Karen Detling, Christine Fernsebner Eslao, Steven Folsom, Xiaoli Li, Marc McGee, Karen Miller, Honor Moody, Craig Thomas, Holly Tomren
“Project Passage” is an OCLC Research Wikibase prototype that explores using linked data in library cataloging workflows. The report gives an overview of the prototype’s development, its adaptation for library use, and eight librarians’ experiences using the editing interface to create metadata for resources.
8 November 2018
Using the 2018 International Linked Data Survey results, this article gives an overview of the linked data projects and services implemented by institutions: what data they publish or consume, why they implemented linked data, the challenges they faced, and their advice for institutions considering a linked data project or service.
21 October 2018
Hein van den Berg, Arianna Betti, Thom Castermans, Rob Koopman, Bettina Speckmann, Kevin Verbeek, Titia van der Werf, Shenghui Wang, Michel A. Westenberg
CatVis is an interdisciplinary digital humanities project that provides resources for librarians to manage large sets of bibliographic records as well as visualization tools for philosophical research. This paper describes the challenges encountered during the project.
13 August 2018
Thom Castermans, Kevin Verbeek, Bettina Speckmann, Michel A. Westenberg, Rob Koopman, Shenghui Wang, Hein van den Berg, Arianna Betti
This research proposes a novel type of low-distortion radial embedding that preserves near-exact distances to the focus entity and minimizes distortion between other entities. This data visualization method adapts SolarView to explore the high-dimensional metric space of bibliographic entity similarities.