Data Science

The internet is the native environment of information seekers. OCLC Research recognizes that to be integrated into the internet, traditional library data must be transformed in various ways. We are analyzing the data in WorldCat and other sources to derive new meaning, insights, and services for use by libraries and others on the internet. Our work includes:

Presentations

What are the entities that matter, and how much should we say about them?

By Jean Godby

NISO Webinar: Implementing Library Linked Data
Virtual

This presentation discusses the work of catalogers who participated in OCLC's Project Passage in 2018. It develops the theme of identification of "the entities that matter" and concludes with a brief update on OCLC's post-Passage activities involving resource description in Wikibase.

File: pptx, 8.8MB   Topics: Linked Data, Wikimedia

How IIIF standards improve search and discovery for Cultural Heritage collections

By Jeff Mixter

DLF Forum
Tampa, Florida, USA

IIIF is an emerging standard for sharing digital structural metadata. OCLC is an active member of the IIIF community and has been working to integrate the standard into its services and products. This talk discusses the experimental IIIF work being done by OCLC Research to help test evolving IIIF standards and integrate them into production services.

File: pptx, 49MB   Topics: IIIF, Linked Data

Introducing the CONTENTdm Linked Data Pilot Project

By Jeff Mixter, Bruce Washburn

CONTENTdm User Group Meeting
Indianapolis, IN, USA

The CONTENTdm Linked Data pilot explores how to convert CONTENTdm data into linked data, how to curate the data in the Wikibase infrastructure, and how to use the data to improve end-user experiences in CONTENTdm. This presentation covers the background research that led to the development of the pilot, the plans for the 3 phases of the pilot, and some early feedback from one of the pilot participants.

File: pptx, 22MB   Topics: Linked Data, IIIF

IIIF Change Discovery in Action: Findings from an OCLC Research Experiment

By Jeff Mixter

IIIF Annual Conference
Göttingen, Germany

OCLC Research is participating in the IIIF Discovery Working Group's ongoing effort to develop a "Change Discovery API." The Change Discovery API will provide the information needed to discover and subsequently make use of IIIF resources.

File: ppt, 68MB   Topics: IIIF, Linked Data

Fast and Discriminative Semantic Embedding

By Rob Koopman, Shenghui Wang, and Gwenn Englebienne

13th International Conference on Computational Semantics
Gothenburg, Sweden

We present a novel, effective, and efficient method for term and document embedding. Our experiments show that it outperforms state-of-the-art methods on the STS benchmark and on subject prediction when trained on the same datasets, while being computationally cheaper by orders of magnitude.

File: pptx, 4MB   Topics: Semantic Embedding

An Innovative Approach to Scalable Semantic Embedding

By Shenghui Wang, Rob Koopman

AIDR 2019: Artificial Intelligence for Data Discovery and Reuse
Pittsburgh, Pennsylvania, USA

Semantic search, in addition to keyword-based search, is a desirable feature for many digital library systems. Even in the largely structured world of library data, a lot of tacit information remains locked in free-text fields. Embedding words and texts in compact, semantically meaningful vector spaces makes semantic similarity and relatedness computable, which would make search more intelligent.

File: pptx, 4MB   Topics: Semantic Embedding

Ideation to Prototype: Turning new ideas into useful services

By Andrew Pace

LD4 Conference on Linked Data in Libraries
Boston, Massachusetts, USA

Using the Wikibase Linked Data Prototype as an example, Pace will outline five simple steps for managing a complex project that will improve your chances of getting from an experiment to a production service.

File: pptx, 11MB   Topics: Linked Data

Taking Advantage of Multilingualism Support in Wikidata

By Karen Smith-Yoshimura and Xiaoli Li

LD4 Conference on Linked Data in Libraries
Boston, Massachusetts, USA

View highlights of some key lessons from the OCLC Research Linked Data Wikibase Prototype (“Project Passage”) regarding Wikidata’s multilingualism support.

File: pptx, 5MB   Topics: Wikimedia, Linked Data

Adoption and Use of IIIF for Digital Resource Sharing in CONTENTdm

By Shane Huddleston, Jeff Mixter

Best Practices Exchange 2019 Conference
Columbus, Ohio, USA

Huddleston and Mixter provide an overview of IIIF Application Programming Interfaces (APIs), and how OCLC is using them across services, as well as our work in supporting standards with other organizations.

File: pptx, 16.8MB   Topics: IIIF