Data Science & Metadata Research

To be discoverable by today’s online users, traditional library data must be transformed. OCLC Research analyzes bibliographic data to derive new meaning, insights, and services for use by libraries and information seekers. This work includes special projects, data science research, engagement with metadata communities, publications and presentations, and the creation of illustrative experimental applications.

Metadata Enrichment

To be discoverable by today’s users, traditional library data must be transformed to derive new meaning, insights, and services for library staff and users.

ArchiveGrid

Created for experimental text mining, data analysis, and discovery applications, ArchiveGrid includes over seven million archival material descriptions, including MARC records from WorldCat and finding aids harvested from the web.

IIIF: Improving the Interoperability of Digital Materials

IIIF (the International Image Interoperability Framework) is an emerging set of standards for sharing structural metadata about digital materials. Focused around a set of five APIs, the IIIF standard provides ways to access, view, search, and share digital images, audio, and video.

Authorities & Identities

From traditional authority work to identity management, libraries are deeply engaged in the knowledge work that surrounds the establishment of authorized forms and the contextualization of the underlying entities (persons, organizations, works, concepts, places, and events). OCLC Research helps leverage this collective knowledge investment for re-use by the library community.

CONTENTdm Linked Data Pilot

Traditional models of item description have rendered libraries’ digital collections largely invisible on the internet and thus hidden from researchers. OCLC worked with libraries to leverage unique collections from CONTENTdm repositories to improve evaluation, description, and discovery.

VIAF (Virtual International Authority File)

VIAF (the Virtual International Authority File) combines multiple name authority files into a single OCLC-hosted name authority service.

WorldCat Identities

WorldCat Identities has a summary page for every name in WorldCat (currently some 30 million names) including named persons, organizations and fictitious characters.

Linked Data

Linked Data is about communities agreeing on the meaning of their data and sharing it in a massively networked information space. This vision is taking shape in many sectors, including e-commerce, medicine, scientific research, and government services. OCLC Research is a leader in driving this transformation in the library community.

CONTENTdm Linked Data Pilot

Traditional models of item description have rendered libraries’ digital collections largely invisible on the internet and thus hidden from researchers. OCLC worked with libraries to leverage unique collections from CONTENTdm repositories to improve evaluation, description, and discovery.

FAST as Linked Data

FAST linked data is available as a service and as downloadable data sets. Access to FAST linked data enables metadata specialists and software developers to enhance resource descriptions with terminology and relationships associated with FAST concepts.

IIIF: Improving the Interoperability of Digital Materials

IIIF (the International Image Interoperability Framework) is an emerging set of standards for sharing structural metadata about digital materials. Focused around a set of five APIs, the IIIF standard provides ways to access, view, search, and share digital images, audio, and video.

Project Passage

In 2017 and 2018, OCLC partnered with 16 libraries in Project Passage to demonstrate how linked data can improve resource-description workflows. This project used Wikibase and Wikidata to create a fully configurable environment for experimenting with linked data creation, including editing, crowdsourcing, contextual description, and native multilingual support.

Subjects & Classification

The Subjects & Classification area explores the development, maintenance, and application of controlled vocabularies and classification schemes used in libraries, archives and museums.

Classify

Classify is a FRBR-based prototype designed to support the assignment of classification numbers and subject headings for books, DVDs, CDs, and other types of materials.

FAST (Faceted Application of Subject Terminology)

FAST is a vocabulary of controlled terms that can be used to describe the subject content of any kind of intellectual or creative work. The terms used by FAST are derived from the Library of Congress Subject Headings system.