Europeana Innovation Pilots
OCLC Research and Europeana are conducting innovation pilots from May through December 2012. This collaborative initiative pilots existing and newly developed OCLC methods and techniques for cleansing and enriching large aggregations of metadata. Our objective is to identify and create semantic links between related heterogeneous objects. Examples include translated copies of the same publication, a painting and a photograph of that painting, different editions of one book, and a collection of letters that belong to the same archive.
Background
Europeana collects a steadily growing amount of metadata from European libraries, archives and museums. Aggregating metadata from these heterogeneous collections leads to quality issues such as duplication, uneven granularity of the object descriptions, ambiguity between original and derivative versions of the same object, etc.
OCLC Research has extensive experience and expertise in techniques and methods for improving metadata quality, such as detecting duplicates and clustering similar metadata records around FRBR entity relationships, reproductions and originals, and different cataloguing languages. We are also experimenting with the automated enhancement of records with links to VIAF and other Linked Data elements. Our data quality improvement and enrichment efforts are part of our philosophy to “make the metadata work harder for libraries” and to enhance the end-user experience.
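To make the idea of near-duplicate detection concrete, the following is a minimal illustrative sketch in Python. It is not the method used by OCLC Research or Europeana. It assumes each record is reduced to a hypothetical (identifier, title) pair, compares titles by Jaccard similarity over normalized word tokens, and groups records greedily. The function names and the 0.8 threshold are assumptions chosen for illustration only.

```python
import re
import unicodedata


def normalize(text):
    """Return the set of lowercase, accent-free word tokens in a title."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return set(re.findall(r"[a-z0-9]+", no_accents.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_near_duplicates(records, threshold=0.8):
    """Greedy clustering: a record joins the first cluster whose
    representative title is at least `threshold` similar; otherwise
    it starts a new cluster. `threshold` is an illustrative value."""
    clusters = []  # list of (representative_tokens, [record ids])
    for rec_id, title in records:
        tokens = normalize(title)
        for rep_tokens, members in clusters:
            if jaccard(tokens, rep_tokens) >= threshold:
                members.append(rec_id)
                break
        else:
            clusters.append((tokens, [rec_id]))
    return [members for _, members in clusters]


# Hypothetical sample records, invented for this sketch.
sample = [
    ("a", "Les Misérables, tome 1"),
    ("b", "Les miserables. Tome 1"),
    ("c", "Nachtwacht"),
]
print(cluster_near_duplicates(sample))
```

A sketch like this only catches differences in spelling and punctuation. Linking translations, reproductions, or editions, as described above, would require richer signals than title tokens.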
Impact
Our collaboration with Europeana will be mutually beneficial. The outcomes of the research project will feed into the implementation of the Europeana Data Model (EDM), which is designed to improve the browsing experience of visitors to the Europeana Portal. In addition, piloting OCLC's data clustering and enrichment methods and techniques will inform follow-up activities in more innovative directions and create opportunities to develop new data services for third parties.
Outputs
The research findings will be published after the pilots are completed. They will include the results of investigations into the feasibility of linking Europeana records to OCLC linked data (such as VIAF), detecting (near-)duplicates, and categorizing clusters of similar objects.
Interim results will be published on the OCLC Research and Europeana Pro web pages.
Presentations
- van der Werf, Titia. 2013. "Metadata Out of Control: Network-level Metadata Aggregations." Presented at EMEARC, 26 February 2013, Strasbourg, France.
Download the presentation (.pptx: 6.6MB/29 slides)
View on SlideShare
- Wang, Shenghui. 2013. "Hunting for Semantic Clusters: How Can We Find Interesting Stuff in Over 22 Million Europeana Objects?" Presented at EMEARC, 26 February 2013, Strasbourg, France.
View on Prezi
Most recent updates: Page content: 2013-04-12
This activity is a part of the Metadata Management theme.