Europeana Innovation Pilots


OCLC Research and Europeana are conducting innovation pilots from May through December 2012. This collaborative initiative pilots the use of existing and newly developed OCLC methods and techniques for cleansing and enriching large aggregations of metadata. Our objective is to identify heterogeneous objects that are related to one another and to create semantic links between them. Examples include translated copies of the same publication, a painting and a photograph of that painting, different editions of one book, and a collection of letters that belong to the same archive.

Background

Europeana collects a steadily growing amount of metadata from European libraries, archives and museums. Aggregating metadata from these heterogeneous collections leads to quality issues such as duplication, uneven granularity of the object descriptions, and ambiguity between original and derivative versions of the same object.

OCLC Research has extensive experience and expertise in techniques and methods for improving metadata quality, such as duplicate detection and the clustering of similar metadata records around FRBR entity relationships, reproductions and originals, and different cataloguing languages. We are also experimenting with the automated enhancement of records with links to VIAF and other Linked Data elements. Our data quality improvement and enrichment efforts are part of our philosophy to “make the metadata work harder for libraries” and to enhance the end-user experience.
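To give a rough idea of what duplicate detection and clustering of similar records can look like in practice, here is a minimal Python sketch. It is an illustration only and not the method used in the pilots: it assumes that records can be compared by the word overlap (Jaccard similarity) of their titles, and the function names, the 0.6 threshold and the three sample records are all hypothetical.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_near_duplicates(records, threshold=0.6):
    """Group record ids whose titles are linked by similarity >= threshold.

    Clusters are the connected components of the similarity graph,
    tracked with a small union-find structure. The threshold is an
    illustrative assumption, not a value taken from the pilots.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent links to the cluster root, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    token_sets = {rid: tokens(title) for rid, title in records.items()}
    # Pairwise comparison is quadratic; fine for a demo, not for millions of records.
    for a, b in combinations(records, 2):
        if jaccard(token_sets[a], token_sets[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for rid in records:
        clusters.setdefault(find(rid), []).append(rid)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical sample records (invented for this sketch).
records = {
    "r1": "The Night Watch, painting by Rembrandt",
    "r2": "Night Watch (painting by Rembrandt)",
    "r3": "Letters of Vincent van Gogh",
}
print(cluster_near_duplicates(records))
```

In this toy example the two descriptions of the same painting end up in one cluster and the collection of letters stays on its own. A real aggregation would need more than titles (creators, dates, languages, identifiers) and a scalable way of finding candidate pairs.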

Impact

Our collaboration with Europeana will be mutually beneficial. The outcomes of the research project will feed into the implementation of the Europeana Data Model (EDM), which is designed to improve the browsing experience of visitors to the Europeana Portal. In addition, piloting OCLC's data clustering and enrichment methods and techniques will inform follow-up activities in more innovative directions and open up opportunities to develop new data services for third parties.

Outputs

The research findings will be published after completion of the pilots. They will include the results of investigations into the feasibility of linking the Europeana records to OCLC linked data (such as VIAF), detecting (near-)duplicates, and categorizing clusters of similar objects.

Interim results will be published on the OCLC Research and Europeana Pro web pages.

Presentations

  • Wang, Shenghui. 2013. "Hunting for Semantic Clusters: How Can We Find Interesting Stuff in Over 22 Million Europeana Objects?" Presented at EMEARC, 26 February 2013, Strasbourg, France.
    View on Prezi

Page content last updated: 2013-04-12

Lead

Titia van der Werf

Team Members

Shenghui Wang
Rob Koopman

This activity is a part of the Metadata Management theme.
