Linked Data Wikibase Prototype
OCLC Research Report
In this final report on the project, participants provide an overview of the context in which the prototype was developed, explain how the Wikibase platform was adapted for use by librarians, and present eight use cases in which pilot participants (co-authors of this report) describe their experience of creating metadata for resources in various formats and languages using the Wikibase editing interface. They also share key issues, findings, reflections, and areas for future research.
In 2017 and 2018, OCLC partnered with 16 libraries in Project Passage to demonstrate the impact of linked data for improving resource-description workflows.
• American University
• Brigham Young University
• Cleveland Public Library
• Cornell University Library
• Harvard University
• Michigan State University
• National Library of Medicine
• North Carolina State University
• Northwestern University
• Princeton University
• Smithsonian Library
• Temple University
• UC Davis Library
• University of Minnesota
• University of New Hampshire
• Yale University
The partners worked with OCLC to refine needs assessment for services. They also provided feedback by reflecting on their use of the prototype systems, responding to engagement activities, and participating in virtual meetings. This collaboration built upon past efforts, such as the Person Lookup Pilot and the Metadata Refinery effort, to demonstrate the production value of linked data services.
The pilot is now complete, but follow-up investigation continues in the CONTENTdm Linked Data pilot.
In the Passage project, the OCLC Research team worked closely with colleagues in OCLC's Global Product Management and Global Technologies to create a metadata management environment built on the Wikibase platform. Project Passage took advantage of all the functionality in the Wikibase system. Shown here, from left to right, are the key components: data import, MediaWiki functions including the user interfaces, and the RDF triplestore.
The result was a fully configurable environment for experimentation, with many features for editing, crowdsourcing, native multilingual support, and full support for linked data creation. These features were mostly hidden from human users so that metadata librarians could concentrate on the work they wanted to do, not the technical details of a linked data implementation.
The red arrows identify two functions that OCLC added in response to feedback from the Project Passage partners: a “Retriever” to import data from other sources, and an “Explorer” interface that enabled pilot participants to see the impact of the relationships they added as part of their workflow.
Methodology and Timeline
Partners were given access to a live prototype system. The features and functionality are fully documented and supported by a team of product managers, analysts, engineers, and architects. The goal of the project was to inform the Global Product Management roadmap for metadata applications and services.
November 2017: Project kickoff; Discuss partnership and services with Phase 1 Partners
December 2017: Gather use cases from Phase 1 Partners
January 2018: Reconcile strings to identifiers
February 2018: Launch entity editor
March 2018: Gather enhancements and provide SPARQL endpoint; add five to ten new library partners
April 2018: Discussions with libraries resulted in a total of 16 institutions participating in the project going forward
May 2018: Launch the experimental “Explorer” UI to view entities and their relationships to other items; launch the OpenRefine API; gather feedback on the creation of creative works and prioritization of enhancements from partners
Explorer view of Being and Time with multiple translations
June-July 2018: Implement top enhancements suggested by library partners
- Improve indexing
- Use the Wikibase UI to search by a non-prototype identifier
- Include dates for disambiguation in autosuggest results
- Offer property-based constraints
- Provide gadget-based taxonomy navigation
August-September 2018: Explore additional top enhancements
- Provide a data import tool
- Include WorldCat data in the Explorer
- Offer an input form for descriptive data
- Batchload entities provided by partner libraries
- Document when reference sources are required for statements
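One of the enhancements listed above, including dates for disambiguation in autosuggest results, can be illustrated with a minimal Python sketch. It is not the prototype's implementation: the `Suggestion` record, the `display_label` and `autosuggest` functions, the entity identifiers, and the sample names and years are all hypothetical, chosen only to show how life dates can separate entities that share a label.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Suggestion:
    """A hypothetical autosuggest candidate; not the prototype's data model."""
    entity_id: str
    label: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None


def display_label(s: Suggestion) -> str:
    """Append life dates to the label when at least one date is known."""
    if s.birth_year is None and s.death_year is None:
        return s.label
    birth = str(s.birth_year) if s.birth_year is not None else ""
    death = str(s.death_year) if s.death_year is not None else ""
    return f"{s.label} ({birth}-{death})"


def autosuggest(prefix: str, candidates: List[Suggestion]) -> List[str]:
    """Return display labels for candidates whose label starts with the prefix."""
    needle = prefix.casefold()
    return [
        display_label(c)
        for c in candidates
        if c.label.casefold().startswith(needle)
    ]


# Made-up sample data: two people who share a name, plus an unrelated entry.
candidates = [
    Suggestion("Q1", "John Smith", 1800, 1870),
    Suggestion("Q2", "John Smith", 1950, None),
    Suggestion("Q3", "Jane Doe"),
]

results = autosuggest("john", candidates)
print(results)
```

Without the dates, both "John Smith" entries would look identical in the suggestion list, and a cataloger would have to open each entity to tell them apart.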
The project achieved goals in three major areas.
- Collaboration: the team of OCLC staff and dozens of librarians from 16 institutions created use cases, created entities and made edits in the linked data ecosystem, used the OCLC Community Center to discuss workflows and ask questions, and participated in 28 monthly meetings and weekly “Office Hours” sessions.
- Reconciliation Services: experimented with cataloging workflows for entity reconciliation, using both a SPARQL endpoint and a user interface dubbed “The Explorer.”
- Editing: managed entities in the native Wikibase user interface, the Explorer, and another experimental application, “The Retriever.”
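Reconciliation against a SPARQL endpoint, as mentioned above, amounts to turning a text string into a query that returns matching entity identifiers. The Python sketch below only builds such a query and does not contact any service. The function names are hypothetical, and the use of `rdfs:label` with an exact, language-tagged match is an assumption about how labels might be exposed, not a description of the prototype's endpoint.

```python
def escape_sparql_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal.

    This sketch handles only those two characters; a production version
    would also need to escape newlines and other control characters.
    """
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_label_query(label: str, lang: str = "en", limit: int = 10) -> str:
    """Return a SPARQL query that finds entities whose label matches exactly."""
    literal = escape_sparql_literal(label)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?item WHERE {\n"
        f'  ?item rdfs:label "{literal}"@{lang} .\n'
        "}\n"
        f"LIMIT {limit}"
    )


# Build a query for a title string; sending it to an endpoint is left out.
query = build_label_query("Being and Time")
print(query)
```

Each `?item` in the result set would be a candidate identifier for the string. An exact label match is the simplest strategy; real reconciliation workflows usually add fuzzy matching and a review step for ambiguous results.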
The simple prototype described at the beginning of the project matured over time into a robust set of third-party tools and home-grown applications to manage over a million Wikidata entities. The evolution of the project to this more comprehensive set of tools and applications was driven by project participants’ new ideas, requested features, and feedback on applications and prototype use guidelines.
A recording of the final meeting with library partners is available online:
This presentation highlights key lessons from OCLC Research’s Linked Data Wikibase Prototype (“Project Passage”), a 10-month pilot done in 2018 in collaboration with metadata specialists in 16 US libraries.
PowerPoint Slides (11MB)
File: video, 25 minutes Topics: Linked Data
This presentation discusses the work of catalogers who participated in OCLC's Project Passage in 2018. It develops the theme of identification of "the entities that matter" and concludes with a brief update on OCLC's post-Passage activities involving resource description in Wikibase.
File: pptx, 8.8MB Topics: Linked Data, Wikimedia
Tampa, Florida, USA
IIIF is an emerging standard for sharing digital structural metadata. OCLC is an active member of the IIIF community and has been working to integrate the standard into its services and products. This talk discusses the experimental IIIF work being done by OCLC Research to help test evolving IIIF standards and help integrate them into production services.
File: pptx, 49MB Topics: IIIF, Linked Data
Indianapolis, IN, USA
The CONTENTdm Linked Data pilot explores how to convert CONTENTdm data into linked data, how to curate the data in the Wikibase infrastructure, and how to use the data to improve end-user experiences in CONTENTdm. This presentation covers the background research that led to the development of the pilot, the plans for the 3 phases of the pilot, and some early feedback from one of the pilot participants.
File: pptx, 22MB Topics: Linked Data, IIIF
OCLC Research is participating in the IIIF Discovery Working Group's ongoing effort to develop a "Change Discovery API". The Change Discovery API will provide the information needed to discover and subsequently make use of IIIF resources.
File: ppt, 68MB Topics: IIIF, Linked Data
Boston, Massachusetts, USA
Using the Wikibase Linked Data Prototype as an example, Pace will outline 5 simple steps for managing a complex project that will improve your chances for getting from an experiment to a production service.
File: pptx, 11MB Topics: Linked Data
Boston, Massachusetts, USA
OCLC’s Project Passage evaluated a federated instance of Wikibase as a platform for cataloging bibliographic entities. This presentation will focus on applications and workflows that were developed during the project to help speed and improve the cataloging user experience.
File: pptx, 2MB Topics: Linked Data, Data Science
Boston, MA (USA)
View highlights of some key lessons from the OCLC Research Linked Data Wikibase Prototype (“Project Passage”) regarding Wikidata’s multilingualism support.
File: pptx, 5MB Topics: Wikimedia, Linked Data