Data Science & Metadata Research
To be discoverable by today’s online users, traditional library data must be transformed. OCLC Research analyzes bibliographic data to derive new meaning, insights, and services for use by libraries and information seekers. This work includes special projects in metadata enrichment, authorities & identities, linked data, subjects & classification, and data analysis.
Publications

Mining MARC's Hidden Treasures: Initial Investigations Into How Notes of the Past Might Shape Our Future
16 December 2016
Jay Weitz, Jenny Toves, Diane Vizine-Goetz, Nannette Naught, Robert Bremer
Finding, interpreting, and manipulating the rich trove of data already present in MARC bibliographic records to produce systematized forms is an invaluable step in moving MARC toward a post-MARC, Linked Data future. Name access points, especially those fields in a controlled form, are the obvious place to find relationship information, but bibliographic notes and statements of responsibility are relatively overlooked sources of that information, waiting to be parsed and used. The Online Computer Library Center has been investigating means by which to find names and their associated role phrases, to match those names to authorized forms, and to match role terms and phrases to controlled vocabularies.

Undercounting File Downloads from Institutional Repositories
11 October 2016
Patrick O'Brien, Kenning Arlitsch, Leila Sterman, Jeff Mixter, Jonathan Wheeler, Susan Borda

A Division of Labor: The Role of Schema.org in a Semantic Web Model of Library Resources
15 June 2016
Carol Jean Godby
This chapter describes some of OCLC’s experiments with Schema.org as the foundation for a linked data model of library resources.

Addressing the Challenges with Organizational Identifiers and ISNI
3 May 2016
Karen Smith-Yoshimura, Janifer Gatenby, Grace Agnew, et al.

Common Ground: Exploring Compatibilities Between the Linked Data Models of the Library of Congress and OCLC
3 December 2015
Carol Jean Godby, Ray Denenberg
Jointly released by OCLC and the Library of Congress, this white paper compares and contrasts the compatible linked data initiatives at both institutions. It is an executive summary of a more detailed technical analysis that will be released later this year.
The white paper summarizes the recent activity of the Bibliographic Framework Initiative at the Library of Congress, which proposes a data model for future data interchange in the linked data environment that takes into account interactions with search engines and current developments in bibliographic description. It also provides an overview of OCLC’s efforts to refine the technical infrastructure and data architecture for at-scale publication of linked data for library resources in the broader Web. In addition, it investigates the promise of Schema.org as a common ground between the language of the information-seeking public and professional stewards of bibliographic description.

Library Linked Data in the Cloud: OCLC's Experiments with New Models of Resource Description
4 April 2015
Carol Jean Godby, Shenghui Wang, Jeffrey K. Mixter

Registering Researchers in Authority Files
27 October 2014
Karen Smith-Yoshimura, Micah Altman, Michael Conlon, Ana Lupe Cristán, Laura Dawson, Joanne Dunham, Thom Hickey, Daniel Hook, Wolfram Horstmann, Andrew MacEwan, Philip Schreur, Laura Smart, Melanie Wacker, Saskia Woutersen

Describing Theses and Dissertations Using Schema.org
11 October 2014
Jeffrey K. Mixter, Patrick O'Brien, Kenning Arlitsch

Where Should I Publish? Detecting Journal Similarity Based on What Has Been Published There
12 September 2014