More than half of the 300 million bibliographic records in WorldCat, the world's largest network of library content and services, represent resources in languages other than English. These records are clustered together in worksets, which may include multiple bibliographic records for the same title with data elements represented in different languages of cataloging, that is, the language of the metadata used to describe the resource. This metadata, such as notes and subject headings, is supplied by catalogers rather than transcribed from the resource.
To leverage the multilingual content in WorldCat and make it easier for users to identify resources in their preferred language and script, OCLC Research has launched the multilingual bibliographic structure activity to mine the data from translated works. The goals are to improve work clustering, presentation and linked data representations, and to contribute generally to global knowledge. We’re also generating work-translation ("expression level") records, which include the translated title and translator with links to the original work and the author, and adding them to VIAF (Virtual International Authority File), flagged as "xR". At the same time, we’re marking up these generated VIAF records using a linked data schema so that the relationship of each work to its associated translations and translators can be shared in the Semantic Web.
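As a rough illustration of what such a work-translation record might carry, the sketch below builds a JSON-LD description of one translation in Python. It is not the markup used for the generated VIAF records: the choice of schema.org as the vocabulary, the `translation_record` helper and the `example.org` identifier are all assumptions made for this example. The properties `translator`, `translationOfWork`, `inLanguage` and `author` are existing schema.org terms.

```python
import json

# Assumption: schema.org is used as the linked data vocabulary.
SCHEMA_CONTEXT = "https://schema.org"


def translation_record(translated_title, language, translator,
                       original_id, original_title, author):
    """Build a JSON-LD description of one translation.

    Hypothetical helper: it links the translated title and translator
    to the original work and its author.
    """
    return {
        "@context": SCHEMA_CONTEXT,
        "@type": "Book",
        "name": translated_title,
        "inLanguage": language,
        "translator": {"@type": "Person", "name": translator},
        "translationOfWork": {
            "@type": "Book",
            "@id": original_id,  # placeholder identifier, not a real VIAF URI
            "name": original_title,
            "author": {"@type": "Person", "name": author},
        },
    }


record = translation_record(
    translated_title="The Little Prince",
    language="en",
    translator="Katherine Woods",
    original_id="https://example.org/work/le-petit-prince",
    original_title="Le Petit Prince",
    author="Antoine de Saint-Exupéry",
)

print(json.dumps(record, indent=2, ensure_ascii=False))
```

Nesting the original work under `translationOfWork` keeps the translator attached to the translation and the author attached to the original, which mirrors the expression-level and work-level split described above.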
Identifying the records that represent translations will make it possible to present a work in the user's preferred language, where available. This work will also give us a better understanding of the extent to which information is shared across cultures, e.g., the percentage of non-English works that are translations of English works, and vice versa.
For more details about this work, see the multilingual bibliographic structure activity page or the hangingtogether.org blog post "Challenges posed by translations" by OCLC Research Program Officer Karen Smith-Yoshimura.
For more information: