Data Science

The internet is the native environment of information seekers. OCLC Research recognizes that to be integrated into the internet, traditional library data must be transformed in various ways. We are analyzing the data in WorldCat and other sources to derive new meaning, insights, and services for use by libraries and others on the internet. Our work includes:

Presentations

OCLC Linked Data: Research, experimental applications, and shared infrastructure

By Andrew Pace, John Chapman

LD4 Conference 2020
Virtual

This presentation summarized OCLC's findings on the impact of new workflows in the ground-shifting transition from traditional cataloging to linked data platforms, highlighted the integral engagement, participation, and feedback from OCLC members, and attempted to chart a linked data research path for the decade to come.

Recording available on LD4 on YouTube.

File: pdf, 2.5MB   Topics: Linked Data

Lessons from Representing Library Metadata in OCLC Research’s Linked Data Wikibase Prototype (video)

By Karen Smith-Yoshimura

Semantic Web in Libraries (SWIB) 2019
Hamburg, Germany

This presentation highlights key lessons from OCLC Research’s Linked Data Wikibase Prototype (“Project Passage”), a 10-month pilot done in 2018 in collaboration with metadata specialists in 16 US libraries.

Additional Materials:
PowerPoint Slides (11MB)


File: video, 25 minutes   Topics: Linked Data

What are the entities that matter, and how much should we say about them?

By Jean Godby

NISO Webinar: Implementing Library Linked Data
Virtual

This presentation discusses the work of catalogers who participated in OCLC's Project Passage in 2018. It develops the theme of identifying "the entities that matter" and concludes with a brief update on OCLC's post-Passage activities involving resource description in Wikibase.

File: pptx, 8.8MB   Topics: Linked Data, Wikimedia

How IIIF standards improve search and discovery for Cultural Heritage collections

By Jeff Mixter

DLF Forum
Tampa, Florida, USA

IIIF is an emerging standard for sharing digital structural metadata. OCLC is an active member of the IIIF community and has been working to integrate the standard into its services and products. This talk discusses the experimental IIIF work being done by OCLC Research to help test evolving IIIF standards and integrate them into production services.

File: pptx, 49MB   Topics: IIIF, Linked Data

Introducing the CONTENTdm Linked Data Pilot Project

By Jeff Mixter, Bruce Washburn

CONTENTdm User Group Meeting
Indianapolis, IN, USA

The CONTENTdm Linked Data pilot explores how to convert CONTENTdm data into linked data, how to curate the data in the Wikibase infrastructure, and how to use the data to improve end-user experiences in CONTENTdm. This presentation covers the background research that led to the development of the pilot, the plans for the three phases of the pilot, and some early feedback from one of the pilot participants.

File: pptx, 22MB   Topics: Linked Data, IIIF

IIIF Change Discovery in Action: Findings from an OCLC Research Experiment

By Jeff Mixter

IIIF Annual Conference
Göttingen, Germany

OCLC Research is participating in the IIIF Discovery Working Group's ongoing effort to develop a "Change Discovery API." The Change Discovery API will provide the information needed to discover and subsequently make use of IIIF resources.

File: ppt, 68MB   Topics: IIIF, Linked Data

Fast and Discriminative Semantic Embedding

By Rob Koopman, Shenghui Wang, and Gwenn Englebienne

13th International Conference on Computational Semantics
Gothenburg, Sweden

We present a novel, effective, and efficient method for term and document embedding. Our experiments show that it outperforms state-of-the-art methods on the STS benchmark and on subject prediction when trained on the same datasets, while being computationally cheaper by orders of magnitude.

File: pptx, 4MB   Topics: Semantic Embedding

An Innovative Approach to Scalable Semantic Embedding

By Shenghui Wang, Rob Koopman

AIDR 2019: Artificial Intelligence for Data Discovery and Reuse
Pittsburgh, Pennsylvania, USA

Semantic search, in addition to keyword-based search, is a desirable feature for many digital library systems. Even in the largely structured world of library data, a lot of tacit information remains locked in free-text fields. Embedding words and texts in compact, semantically meaningful vector spaces makes semantic similarity and relatedness computable, which would make search more intelligent.

File: pptx, 4MB   Topics: Semantic Embedding
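
As a rough illustration of the idea in the abstract above (computing semantic similarity between embedded texts), the following minimal sketch ranks records by cosine similarity to a query vector. It is not the method from the presentation: the three-dimensional vectors, the record identifiers, and the function names are all hypothetical, and a real system would obtain its vectors from a trained embedding model.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real embeddings have many more dimensions.
doc_vecs = {
    "record-a": [0.9, 0.1, 0.0],
    "record-b": [0.1, 0.8, 0.3],
    "record-c": [0.7, 0.3, 0.1],
}
query_vec = [1.0, 0.2, 0.0]

ranking = rank_by_similarity(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Cosine similarity is used here only because it is a common choice for comparing embedding vectors; the presentation may rely on a different measure.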

Ideation to Prototype: Turning new ideas into useful services

By Andrew Pace

LD4 Conference on Linked Data in Libraries
Boston, Massachusetts, USA

Using the Wikibase Linked Data Prototype as an example, Pace will outline five simple steps for managing a complex project that will improve your chances of getting from an experiment to a production service.

File: pptx, 11MB   Topics: Linked Data

Taking Advantage of Multilingualism Support in Wikidata

By Karen Smith-Yoshimura and Xiaoli Li

LD4 Conference on Linked Data in Libraries
Boston, Massachusetts, USA

View highlights of some key lessons from the OCLC Research Linked Data Wikibase Prototype (“Project Passage”) regarding Wikidata’s multilingualism support.

File: pptx, 5MB   Topics: Wikimedia, Linked Data