Linked Data Wikibase Prototype

OCLC Research Report

Creating Library Linked Data with Wikibase: Lessons Learned from Project Passage

In this final report on the project, participants provide an overview of the context in which the prototype was developed, explain how the Wikibase platform was adapted for use by librarians, and present eight use cases in which pilot participants (co-authors of this report) describe their experience of creating metadata for resources in various formats and languages using the Wikibase editing interface. They also share key issues, findings, reflections, and areas for future research.

In 2017 and 2018, OCLC partnered with 16 libraries in Project Passage to demonstrate the impact of linked data for improving resource-description workflows.

•    American University
•    Brigham Young University
•    Cleveland Public Library
•    Cornell University Library
•    Harvard University
•    Michigan State University
•    National Library of Medicine
•    North Carolina State University
•    Northwestern University
•    Princeton University
•    Smithsonian Library
•    Temple University
•    UC Davis Library
•    University of Minnesota
•    University of New Hampshire
•    Yale University

The partners worked with OCLC to refine needs assessment for services. They also provided feedback by reflecting on their use of the prototype systems, responding to engagement activities, and participating in virtual meetings. This collaboration built upon past efforts, such as the Person Lookup Pilot and the Metadata Refinery effort, to demonstrate the production value of linked data services.

The pilot is now complete, but follow-up investigation continues in the CONTENTdm Linked Data pilot.

In the Passage project, the OCLC Research team worked closely with colleagues in OCLC's Global Product Management and Global Technologies to create a metadata management environment built on the Wikibase platform. Project Passage took advantage of all the functionality in the Wikibase system. Shown here, from left to right, are the key components: data import, MediaWiki functions including the user interfaces, and the RDF triplestore.

Figure 5: Data import with Pywikibot and the Passage Retriever

The result was a fully configurable environment for experimentation, with features for editing and crowdsourcing, native multilingual support, and full support for linked data creation. These features were mostly hidden from human users so that metadata librarians could concentrate on the work they wanted to do, not the technical details of a linked data implementation.

The red arrows identify two functions that OCLC added in response to feedback from the Project Passage partners: a “Retriever” to import data from other sources, and an “Explorer” interface that enabled pilot participants to see the impact of the relationships they added as part of their workflow.

Methodology and Timeline

Partners were given access to a live prototype system. Its features and functionality were fully documented and supported by a team of product managers, analysts, engineers, and architects. The goal of the project was to inform the Global Product Management roadmap for metadata applications and services.

November 2017: Project kickoff; discuss partnership and services with Phase 1 Partners

December 2017: Gather use cases from Phase 1 Partners

January 2018: Reconcile strings to identifiers

February 2018: Launch entity editor

March 2018: Gather enhancements and provide SPARQL endpoint; add five to ten new library partners

April 2018: Discussions with libraries resulted in a total of 16 institutions participating in the project going forward

May 2018: Launch the experimental “Explorer” UI to view entities and their relationships to other items; launch the OpenRefine API; gather feedback on the creation of creative works and prioritization of enhancements from partners

Explorer view of Being and Time with multiple translations

June-July 2018: Implement top enhancements suggested by library partners

  1. Improve indexing 
  2. Use the Wikibase UI to search by a non-prototype identifier
  3. Include dates for disambiguation in autosuggest results
  4. Offer property-based constraints
  5. Provide gadget-based taxonomy navigation

August-September 2018: Explore additional top enhancements

  1. Provide a data import tool
  2. Include WorldCat data in the Explorer
  3. Offer an input form for descriptive data
  4. Batchload entities provided by partner libraries
  5. Document when reference sources are required for statements
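The timeline mentions reconciling strings to identifiers and including dates for disambiguation in autosuggest results. The sketch below illustrates the general idea only; it is not the prototype's implementation. The `Entity` class, the `Q1`/`Q2` identifiers, and the two "Smith, John" records are invented for illustration, and exact matching on a normalized label is an assumed, deliberately simple matching rule.

```python
import unicodedata
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    """A minimal stand-in for a Wikibase item (hypothetical structure)."""
    entity_id: str
    label: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in no_accents)
    return " ".join(cleaned.lower().split())


def reconcile(name: str, entities: List[Entity],
              birth_year: Optional[int] = None) -> List[Entity]:
    """Return entities whose normalized label equals the normalized name.

    If a birth year is supplied and at least one candidate has that year,
    only those candidates are returned; otherwise all label matches are kept.
    """
    key = normalize(name)
    matches = [e for e in entities if normalize(e.label) == key]
    if birth_year is not None:
        dated = [e for e in matches if e.birth_year == birth_year]
        if dated:
            return dated
    return matches


def suggestion(entity: Entity) -> str:
    """Format one autosuggest line, adding dates when they are known."""
    if entity.birth_year is None and entity.death_year is None:
        return f"{entity.label} [{entity.entity_id}]"
    born = "" if entity.birth_year is None else str(entity.birth_year)
    died = "" if entity.death_year is None else str(entity.death_year)
    return f"{entity.label} ({born}-{died}) [{entity.entity_id}]"


# Invented sample records: two people who share a heading.
SAMPLE_ENTITIES = [
    Entity("Q1", "Smith, John", 1850, 1920),
    Entity("Q2", "Smith, John", 1901, 1975),
]

if __name__ == "__main__":
    for candidate in reconcile("smith john", SAMPLE_ENTITIES):
        print(suggestion(candidate))
```

Without a date, both records match the string and the cataloger has to choose between them; showing the dates in each suggestion, or supplying a birth year to `reconcile`, is what separates the two.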

Summary

The project achieved goals in three major areas.

  1. Collaboration: the team of OCLC staff and dozens of librarians from 16 institutions created use cases, created entities and made edits in the linked data ecosystem, used the OCLC Community Center to discuss workflows and ask questions, and participated in 28 monthly meetings and weekly “Office Hours” sessions.
  2. Reconciliation Services: experimented with cataloging workflows for entity reconciliation, using both a SPARQL endpoint and a user interface dubbed “The Explorer.”
  3. Editing: managed entities in the native Wikibase user interface, the Explorer, and another experimental application, “The Retriever.” 
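To give a sense of how a SPARQL endpoint can surface the kind of relationship the Explorer displayed, such as a work and its translations, here is a minimal sketch. It builds a query string and reads a response in the standard SPARQL 1.1 JSON results format. The identifiers `Q100` and `P1`, the `example.org` URIs, and the sample response are placeholders, not values from the prototype, and the query assumes the endpoint predefines the `wd:`, `wdt:`, and `rdfs:` prefixes, as the Wikidata Query Service does.

```python
import re
from typing import Dict, List


def translations_query(work_id: str, translation_of: str) -> str:
    """Build a SPARQL query for items linked to a work by a given property.

    work_id and translation_of are local Wikibase identifiers; the values
    used in this sketch are hypothetical placeholders.
    """
    if not re.fullmatch(r"Q\d+", work_id):
        raise ValueError(f"not an item identifier: {work_id!r}")
    if not re.fullmatch(r"P\d+", translation_of):
        raise ValueError(f"not a property identifier: {translation_of!r}")
    return f"""SELECT ?edition ?label WHERE {{
  ?edition wdt:{translation_of} wd:{work_id} ;
           rdfs:label ?label .
}}"""


def labels_by_language(results: dict) -> Dict[str, List[str]]:
    """Group label values by language tag from a SPARQL JSON result set."""
    grouped: Dict[str, List[str]] = {}
    for binding in results["results"]["bindings"]:
        label = binding.get("label")
        if label is None:
            continue
        # Literals without a language tag are filed under "und" (undetermined).
        lang = label.get("xml:lang", "und")
        grouped.setdefault(lang, []).append(label["value"])
    return grouped


# Invented response in the SPARQL 1.1 Query Results JSON Format.
SAMPLE_RESPONSE = {
    "head": {"vars": ["edition", "label"]},
    "results": {
        "bindings": [
            {
                "edition": {"type": "uri", "value": "http://example.org/entity/Q101"},
                "label": {"type": "literal", "xml:lang": "en", "value": "Being and Time"},
            },
            {
                "edition": {"type": "uri", "value": "http://example.org/entity/Q102"},
                "label": {"type": "literal", "xml:lang": "fr", "value": "Être et Temps"},
            },
        ]
    },
}

if __name__ == "__main__":
    print(translations_query("Q100", "P1"))
    print(labels_by_language(SAMPLE_RESPONSE))
```

The query string would be sent to whatever endpoint URL a given Wikibase installation exposes. That step is omitted here because the prototype's endpoint address and property identifiers are not given in this report.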

The simple prototype described at the beginning of the project matured over time into a robust set of third-party tools and home-grown applications to manage over a million Wikidata entities. This evolution toward a more comprehensive set of tools and applications was driven by project participants’ new ideas, requested features, and feedback on applications and prototype use guidelines.

A recording of the final meeting with library partners is available online:

Related Presentations

See Full List of Presentations on Wikimedia

Team Lead

Andrew K. Pace
Executive Director, Technical Research
For more information, email pacea@oclc.org.

Project Team

John Chapman

Eric Childress

Jean Godby

Melissa Hess

Marti Heyman

Tod Matola

Jeff Mixter

Sara DeSmidt

Stephan Schindehette

Taylor Surface

Diane Vizine-Goetz

Bruce Washburn

Jeff Young