OCLC Developer Network

Linked Data at OCLC

OCLC has been experimenting with linked data for several years. For example, the top three levels of the Dewey Decimal Classification have been available as linked data for some time, with the full set released in June 2012. The Virtual International Authority File (VIAF) and Faceted Application of Subject Terminology (FAST) have also been published as linked data.

WorldCat Linked Data

In June 2012 OCLC dramatically increased its exposure of linked data resources by making WorldCat.org bibliographic metadata available in this form. To do this, Schema.org markup and library extensions have been added to WorldCat.org for the entire cataloging collection (primarily book and journal titles; journal articles added through third-party providers are not included). This provides improved functionality for our harvest partners and for those writing widgets that require structured data exposed within HTML pages (via RDFa 1.1).
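As a rough illustration of what "structured data exposed within HTML pages" means for a widget author, the following Python sketch collects RDFa `typeof` and `property` attributes from a page. The sample markup, the example.org URIs, and the class and function names are assumptions made for this illustration, not the actual WorldCat.org output, and the value rules only approximate RDFa 1.1 processing (nested elements inside a property are not handled).

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; not copied from WorldCat.org.
SAMPLE_HTML = """
<div vocab="http://schema.org/" typeof="Book" resource="http://example.org/book/1">
  <span property="name">An Example Title</span>
  <a property="author" href="http://example.org/person/1">Example Author</a>
  <meta property="bookFormat" content="http://schema.org/EBook">
</div>
"""


class RDFaPropertyParser(HTMLParser):
    """Collects typeof values and (property, value) pairs from RDFa markup."""

    def __init__(self):
        super().__init__()
        self.types = []
        self.properties = []
        self._pending = None  # property whose value is the element's text
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "typeof" in attributes:
            self.types.append(attributes["typeof"])
        prop = attributes.get("property")
        if prop is None:
            return
        # Prefer an explicit attribute value over the element's text content.
        for key in ("content", "resource", "href", "src"):
            if key in attributes:
                self.properties.append((prop, attributes[key]))
                return
        self._pending = prop
        self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first closing tag ends the pending property.
        if self._pending is not None:
            self.properties.append((self._pending, "".join(self._text).strip()))
            self._pending = None


def extract(html):
    """Return (types, properties) found in the given HTML string."""
    parser = RDFaPropertyParser()
    parser.feed(html)
    parser.close()
    return parser.types, parser.properties


types, properties = extract(SAMPLE_HTML)
for name, value in properties:
    print(name, "=", value)
```

A production consumer would more likely rely on a full RDFa 1.1 processor, which also resolves `vocab` prefixes and subject chaining; the sketch only shows where the data lives in the page.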

Using the Schema.org ontology, extended by a proposed Library Ontology, as a basis for modeling WorldCat bibliographic data will enable the fullest coverage and consumption of this data by search engines and other systems. More detailed information about the vocabulary used by WorldCat.org linked data is available.

The Experimental Linked Data Release is made available by OCLC under the Open Data Commons Attribution License, with reference to the community norms of the library members of the OCLC cooperative who built WorldCat.

Please note that this is not a static release, but an initial experimental release that will change over time as we receive feedback from the library and wider web communities. To facilitate discussion and feedback we've created a public discussion group. Community members can join by logging in to the Developer Network site, visiting the group page, and selecting the Join link. It is also possible to subscribe to the group's activity via RSS feed.

Known issues

We are aware that this initial experimental release is not necessarily where we hope to end up and we will be working to improve it incrementally. To begin with, here are some issues we have identified and for which we will be seeking solutions with community involvement where appropriate:

  • Using the library extension to Schema.org, for example, we can describe http://www.worldcat.org/oclc/122255454 as a "Book" with a "bookFormat" of "EBook", but since the latter is defined as an "enumeration", we can't yet provide the level of detail that the library profession would expect. The same is true for the other "bookFormat" values such as schema:Paperback and schema:Hardcover, as well as our extensions library:LargePrintBook, library:AudioBook, and library:BrailleBook.
  • The same problem occurs (and is possibly more acute) with the content vs. carrier extension. We can say that http://www.worldcat.org/oclc/68969384 is an AudioBook (content) that "hasCarrier" library:CD (carrier), but we can't describe the qualities of the latter.
  • These and similar areas deserve a richer model that has yet to be determined.
  • Due to internal transition issues we are initially publishing the data as a new collapsed section on the WorldCat.org full record page instead of decorating the HTML elements that are already there. Our expectation is that we will be able to do this differently in the near future.
  • In the British Library Data Model (BLDM), each MARC 260 field is treated as a separate PublicationEvent involving an agent, place, and time. In the Schema.org model these details end up flattened into one pile with no grouping per event, which isn't ideal.
  • Another difference between this model and the BLDM is that theirs assumes the range on dct:subject is invariably a Concept. That's fine for MARC 650s, but it's not ideal for the other 6XX fields.
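
To make the PublicationEvent issue above concrete, here is a minimal Python sketch of why a flattened model loses information. The class, the dictionary keys, and the publisher, place, and year values are all invented for illustration; they are not taken from the BLDM or from WorldCat data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicationEvent:
    """One MARC 260 field modeled as an event (hypothetical structure)."""
    agent: str
    place: str
    year: str


def flatten(events):
    """Collapse events into independent property lists, as a flat model would."""
    return {
        "publisher": sorted({e.agent for e in events}),
        "place": sorted({e.place for e in events}),
        "datePublished": sorted({e.year for e in events}),
    }


# Two different publication histories (invented values).
history_a = [
    PublicationEvent("Publisher A", "London", "1990"),
    PublicationEvent("Publisher B", "New York", "1992"),
]
history_b = [
    PublicationEvent("Publisher A", "New York", "1990"),
    PublicationEvent("Publisher B", "London", "1992"),
]

# The event lists differ, but their flattened forms are identical:
# the link between agent, place, and time cannot be recovered.
print(history_a != history_b)
print(flatten(history_a) == flatten(history_b))
```

Both comparisons evaluate to true, which is the sense in which the flattened properties are a "pile": the flat record cannot say which publisher was active in which place.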

Demonstration Code

  • Schema.org markup extractor - This bookmarklet, created by the Developer Network staff, extracts Schema.org markup from a web page and uses it to send information to either Goodreads or a Patron Driven Acquisitions system capable of receiving data via OpenURL.
  • DBPedia / Schema.org Linked Data Demo - This bookmarklet, created by the Developer Network staff, extracts Schema.org markup and author URIs from a web page and uses them to query VIAF for links to DBpedia. The app then retrieves data about the author from DBpedia and shows it to the user.
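
The hand-off from extracted markup to an OpenURL-capable system can be sketched as follows. This is not the bookmarklet's code: it is a Python approximation that assumes the metadata has already been extracted into a dictionary, that the target accepts OpenURL 1.0 key/encoded-value parameters in the book format, and that `https://resolver.example.org/openurl` stands in for a real resolver address. The function name and the property-to-key mapping are likewise assumptions.

```python
from urllib.parse import urlencode

# Assumed mapping from extracted Schema.org property names to OpenURL KEV keys.
PROPERTY_TO_KEV = {
    "name": "rft.btitle",
    "author": "rft.au",
    "isbn": "rft.isbn",
}


def build_openurl(resolver_base, book):
    """Build an OpenURL 1.0 query for a book from extracted properties."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
        ("rft.genre", "book"),
    ]
    for prop, kev_key in PROPERTY_TO_KEV.items():
        value = book.get(prop)
        if value:  # skip properties the page did not provide
            params.append((kev_key, value))
    return resolver_base + "?" + urlencode(params)


# Invented metadata, standing in for values extracted from a page.
book = {"name": "An Example Title", "author": "Example Author"}
print(build_openurl("https://resolver.example.org/openurl", book))
```

Properties missing from the page are simply omitted from the query, so the receiving system decides whether the remaining fields are enough to identify the item.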


The OCLC Developer Network supports the use of OCLC Web Services, a set of tools and APIs that expose data and services for WorldCat and our member libraries and partner institutions or companies.

© 2010 OCLC Domestic and international trademarks and/or service marks of OCLC Online Computer Library Center, Inc. and its affiliates

