Linked Data Wikibase Prototype

OCLC Research Report

Creating Library Linked Data with Wikibase: Lessons Learned from Project Passage

In this final report for the project, participants provide an overview of the context in which the prototype was developed, explain how the Wikibase platform was adapted for use by librarians, and present eight use cases in which pilot participants (co-authors of this report) describe their experience of creating metadata for resources in various formats and languages using the Wikibase editing interface. They also share key issues, findings, reflections, and areas for future research.

In 2017 and 2018, OCLC partnered with 16 libraries in Project Passage to demonstrate the impact of linked data for improving resource-description workflows:

•    American University
•    Brigham Young University
•    Cleveland Public Library
•    Cornell University Library
•    Harvard University
•    Michigan State University
•    National Library of Medicine
•    North Carolina State University
•    Northwestern University
•    Princeton University
•    Smithsonian Library
•    Temple University
•    UC Davis Library
•    University of Minnesota
•    University of New Hampshire
•    Yale University

The partners worked with OCLC to refine needs assessment for services. They also provided feedback by reflecting on their use of the prototype systems, responding to engagement activities, and participating in virtual meetings. This collaboration built upon past efforts, such as the Person Lookup Pilot and the Metadata Refinery effort, to demonstrate the production value of linked data services.

The pilot is now complete, but follow-up investigation continues in the CONTENTdm Linked Data pilot.

In the Passage project, the OCLC Research team worked closely with colleagues in OCLC's Global Product Management and Global Technologies to create a metadata management environment built on the Wikibase platform. Project Passage took advantage of all the functionality in the Wikibase system. Shown here, from left to right, are the key components: data import, MediaWiki functions including the user interfaces, and the RDF triplestore.

Figure 5: Data import with Pywikibot and the Passage Retriever

The result was a fully configurable environment for experimentation, with features for editing and crowdsourcing, native multilingual support, and full support for linked data creation. These features were mostly hidden from human users so that metadata librarians could concentrate on the work they wanted to do, not the technical details of a linked data implementation.

The red arrows identify two functions that OCLC added in response to feedback from the Project Passage partners: a “Retriever” to import data from other sources, and an “Explorer” interface that enabled pilot participants to see the impact of the relationships they added as part of their workflow.
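To make the native multilingual support more concrete, the following is a minimal sketch of how a Wikibase-style item can carry one label per language. The `labels` map keyed by language code follows the general Wikibase JSON convention; the helper names `make_item` and `label_for` and the fallback behavior are illustrative assumptions, not part of the Passage prototype.

```python
def make_item(labels, descriptions=None):
    """Build a Wikibase-style item dict with one label per language code.

    Hypothetical helper: the dict shape mirrors the Wikibase JSON convention,
    but this is not code from the Passage prototype.
    """
    descriptions = descriptions or {}
    return {
        "type": "item",
        "labels": {
            lang: {"language": lang, "value": text}
            for lang, text in labels.items()
        },
        "descriptions": {
            lang: {"language": lang, "value": text}
            for lang, text in descriptions.items()
        },
    }


def label_for(item, preferred, fallback="en"):
    """Return the label in the preferred language, else the fallback, else None."""
    labels = item["labels"]
    for lang in (preferred, fallback):
        if lang in labels:
            return labels[lang]["value"]
    return None


# One entity, described once, displayed in the reader's language when available.
item = make_item({"en": "Being and Time", "de": "Sein und Zeit"})
print(label_for(item, "de"))
print(label_for(item, "fr"))
```

Keeping every language's label on a single entity, rather than in parallel records, is what lets an interface show the same item to users working in different languages.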

Methodology and Timeline

Partners were given access to a live prototype system whose features and functionality were fully documented and supported by a team of product managers, analysts, engineers, and architects. The goal of the project was to inform the Global Product Management roadmap for metadata applications and services.

November 2017: Project kickoff; discuss partnership and services with Phase 1 Partners

December 2017: Gather use cases from Phase 1 Partners

January 2018: Reconcile strings to identifiers

February 2018: Launch entity editor

March 2018: Gather enhancements and provide SPARQL endpoint; add five to ten new library partners

April 2018: Discussions with libraries resulted in a total of 16 institutions participating in the project going forward

May 2018: Launch the experimental “Explorer” UI to view entities and their relationships to other items; launch the OpenRefine API; gather feedback on the creation of creative works and prioritization of enhancements from partners

Explorer view of Being and Time with multiple translations

June-July 2018: Implement top enhancements suggested by library partners

  1. Improve indexing 
  2. Use the Wikibase UI to search by a non-prototype identifier
  3. Include dates for disambiguation in autosuggest results
  4. Offer property-based constraints
  5. Provide gadget-based taxonomy navigation

August-September 2018: Explore additional top enhancements

  1. Provide a data import tool
  2. Include WorldCat data in the Explorer
  3. Offer an input form for descriptive data
  4. Batchload entities provided by partner libraries
  5. Document when reference sources are required for statements

Summary

The project achieved goals in three major areas.

  1. Collaboration: OCLC staff and dozens of librarians from 16 institutions created use cases, created entities and made edits in the linked data ecosystem, used the OCLC Community Center to discuss workflows and ask questions, and participated in 28 monthly meetings and weekly “Office Hours” sessions.
  2. Reconciliation Services: participants experimented with cataloging workflows for entity reconciliation, using both a SPARQL endpoint and a user interface dubbed “The Explorer.”
  3. Editing: managed entities in the native Wikibase user interface, the Explorer, and another experimental application, “The Retriever.” 
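As a rough illustration of how a string might be reconciled to an entity identifier through a SPARQL endpoint, the sketch below builds an exact-match label query. It assumes labels are exposed as `rdfs:label`, as in the standard Wikibase RDF export; the function names are hypothetical, and the endpoint address is deliberately left out because the source does not give one.

```python
def escape_sparql_literal(text):
    """Escape characters that would break a double-quoted SPARQL string literal."""
    return (
        text.replace("\\", "\\\\")  # backslashes first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_label_query(label, lang="en", limit=10):
    """Return a SPARQL query for entities whose label exactly matches `label`.

    Hypothetical helper for illustration; it only builds the query text and
    does not contact any endpoint.
    """
    # Accept only simple language tags such as "en" or "zh-hant".
    if not lang or not all(part.isalnum() for part in lang.split("-")):
        raise ValueError(f"unexpected language tag: {lang!r}")
    literal = escape_sparql_literal(label)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?entity WHERE {\n"
        f'  ?entity rdfs:label "{literal}"@{lang} .\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


print(build_label_query("Sein und Zeit", lang="de", limit=5))
```

An exact-match query like this is only a first pass: in practice a cataloger would still review the candidates, which is why details such as dates for disambiguation appear among the requested enhancements.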

The simple prototype described at the beginning of the project matured over time into a robust set of third-party tools and home-grown applications to manage over a million Wikidata entities. The evolution of the project to this more comprehensive set of tools and applications was driven by project participants’ new ideas, requested features, and feedback on applications and prototype use guidelines.

A recording of the final meeting with library partners is available online.

Related Presentations

Lessons from Representing Library Metadata in OCLC Research’s Linked Data Wikibase Prototype (video)

By Karen Smith-Yoshimura

Semantic Web in Libraries (SWIB) 2019
Hamburg, Germany

This presentation highlights key lessons from OCLC Research’s Linked Data Wikibase Prototype (“Project Passage”), a 10-month pilot conducted in 2018 in collaboration with metadata specialists in 16 US libraries.

Additional Materials:
PowerPoint Slides (11MB)


File: video, 25 minutes   Topics: Linked Data

What are the entities that matter, and how much should we say about them?

By Jean Godby

NISO Webinar: Implementing Library Linked Data
Virtual

This presentation discusses the work of catalogers who participated in OCLC's Project Passage in 2018. It develops the theme of identification of "the entities that matter" and concludes with a brief update on OCLC's post-Passage activities involving resource description in Wikibase.

File: pptx, 8.8MB   Topics: Linked Data, Wikimedia

How IIIF standards improve search and discovery for Cultural Heritage collections

By Jeff Mixter

DLF Forum
Tampa, Florida, USA

IIIF is an emerging standard for sharing digital structural metadata. OCLC is an active member of the IIIF community and has been working to integrate the standard into its services and products. This talk discusses the experimental IIIF work being done by OCLC Research to help test evolving IIIF standards and help integrate them into production services.

File: pptx, 49MB   Topics: IIIF, Linked Data

Introducing the CONTENTdm Linked Data Pilot Project

By Jeff Mixter, Bruce Washburn

CONTENTdm User Group Meeting
Indianapolis, IN, USA

The CONTENTdm Linked Data pilot explores how to convert CONTENTdm data into linked data, how to curate the data in the Wikibase infrastructure, and how to use the data to improve end-user experiences in CONTENTdm. This presentation covers the background research that led to the development of the pilot, the plans for the three phases of the pilot, and some early feedback from one of the pilot participants.

File: pptx, 22MB   Topics: Linked Data, IIIF

IIIF Change Discovery in Action: Findings from an OCLC Research Experiment

By Jeff Mixter

IIIF Annual Conference
Göttingen, Germany

OCLC Research is participating in the IIIF Discovery Working Group's ongoing effort to develop a "Change Discovery API". The Change Discovery API will provide the information needed to discover and subsequently make use of IIIF resources.

File: ppt, 68MB   Topics: IIIF, Linked Data

Ideation to Prototype: Turning new ideas into useful services

By Andrew Pace

LD4 Conference on Linked Data in Libraries
Boston, Massachusetts, USA

Using the Wikibase Linked Data Prototype as an example, Pace will outline five simple steps for managing a complex project that will improve your chances of getting from an experiment to a production service.

File: pptx, 11MB   Topics: Linked Data

OCLC Project Passage User Interface: Assisting the cataloging workflow

By Bruce Washburn

LD4 Conference on Linked Data in Libraries
Boston, Massachusetts, USA

OCLC’s Project Passage evaluated a federated instance of Wikibase as a platform for cataloging bibliographic entities. This presentation will focus on applications and workflows that were developed during the project to speed up and improve the cataloging user experience.

File: pptx, 2MB   Topics: Linked Data, Data Science

Taking Advantage of Multilingualism Support in Wikidata

By Karen Smith-Yoshimura and Xiaoli Li

LD4 Conference on Linked Data in Libraries
Boston, Massachusetts, USA

View highlights of some key lessons from the OCLC Research Linked Data Wikibase Prototype (“Project Passage”) regarding Wikidata’s multilingualism support.

File: pptx, 5MB   Topics: Wikimedia, Linked Data

Team Lead

Andrew K. Pace
Executive Director, Technical Research
For more information, email pacea@oclc.org.

Project Team

John Chapman

Eric Childress

Jean Godby

Melissa Hess

Marti Heyman

Tod Matola

Jeff Mixter

Sara DeSmidt

Stephan Schindehette

Taylor Surface

Diane Vizine-Goetz

Bruce Washburn

Jeff Young