OCLC Research and Europeana have been conducting innovation pilots since May and will continue them through December 2012. This collaborative initiative pilots the use of existing and newly developed OCLC methods and techniques for cleansing and enriching large aggregations of metadata. Our objective is to identify and create semantic links between heterogeneous objects that are related to one another. Examples include translated copies of the same publication, a painting and a photograph of that painting, different editions of one book, and a collection of letters that belong to the same archive.
Europeana collects a steadily growing amount of metadata from European libraries, archives and museums. Aggregating metadata from these heterogeneous collections leads to quality issues such as duplication, uneven granularity of object descriptions, and ambiguity between original and derivative versions of the same object.
OCLC Research has extensive experience and expertise in metadata quality improvement techniques and methods, such as duplicate detection and the clustering of similar metadata records: by FRBR entity relationships, by reproduction and original, and across different cataloguing languages. We are also experimenting with the automated enhancement of records with links to VIAF and other Linked Data elements. Our data quality improvement and enrichment efforts are part of our philosophy to “make the metadata work harder for libraries” and to enhance the end-user experience.
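To give a rough sense of what clustering similar records can look like, here is a minimal Python sketch. It is not OCLC's actual method: it simply groups records whose creator and title match after basic normalization. The record fields (`id`, `creator`, `title`) and the function names are assumptions made for illustration only.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^\w\s]", " ", no_accents.lower())
    return " ".join(cleaned.split())


def cluster_records(records):
    """Group record ids under a (creator, title) key built from normalized values.

    Hypothetical schema: each record is a dict with 'id', 'creator', 'title'.
    """
    clusters = defaultdict(list)
    for record in records:
        key = (normalize(record["creator"]), normalize(record["title"]))
        clusters[key].append(record["id"])
    return dict(clusters)


# Invented sample records, for illustration only.
records = [
    {"id": "a1", "creator": "Brontë, Emily", "title": "Wuthering Heights"},
    {"id": "b7", "creator": "Bronte, Emily", "title": "Wuthering heights."},
    {"id": "c3", "creator": "Brontë, Charlotte", "title": "Jane Eyre"},
]

clusters = cluster_records(records)
# Keep only clusters with more than one member: candidate duplicates.
duplicates = {key: ids for key, ids in clusters.items() if len(ids) > 1}
print(duplicates)
```

Exact matching on a normalized key only catches near-identical descriptions. Linking translations, editions, or reproductions to their originals would need richer evidence than this sketch uses.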
OCLC’s collaboration with Europeana will be mutually beneficial. The outcomes of the research project will feed into the implementation of the Europeana Data Model (EDM), which is designed to improve the browsing experience for visitors to the Europeana Portal.
The research findings will be published after completion of the pilots.