OCLC production units and OCLC Research take part in Linked Data research and standards activities and explore Linked Data applications. This activity page describes a range of OCLC Linked Data activities, including those that OCLC Research leads or is closely involved with.


International Linked Data Survey

Linked Data Survey, Part 2 (2015) 

Linked Data Survey, Part 1 (2014)  


Linked Data describes an approach to exposing data in machine-readable form in which the data is "dereferenceable": URIs are an integral part of the exposed data, and external applications can use those URIs to retrieve data or to connect identical, similar, or related data across multiple Linked Data stores.

This approach to exposing, sharing, and connecting data has become increasingly popular in recent years, and more and more agencies are publishing data which adheres to Linked Data principles as articulated by Sir Tim Berners-Lee:

Linked Data: Design Issues (Sir Tim Berners-Lee)

  1. Use URIs to identify things.
  2. Use HTTP URIs so that these things can be referred to and looked up ("dereferenced") by people and user agents.
  3. Provide useful information about the thing when its URI is dereferenced, using standard formats such as RDF/XML.
  4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
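
As a rough illustration of principles 2 and 3, the sketch below builds an HTTP request that asks for an RDF representation of a URI through the Accept header. It only constructs the request and does not send it. The function name, the constant, and the particular list of media types are assumptions made for this example, and whether a given server honors the header is not something the sketch can guarantee.

```python
from urllib.request import Request

# Media types a Linked Data client might ask for, in order of preference.
# This particular list is an assumption chosen for illustration.
RDF_ACCEPT = "application/rdf+xml, text/turtle;q=0.9, application/ld+json;q=0.8"


def build_dereference_request(uri: str, accept: str = RDF_ACCEPT) -> Request:
    """Build (but do not send) an HTTP GET request that asks for RDF."""
    if not uri.startswith(("http://", "https://")):
        # Principle 2: only HTTP URIs can be looked up this way.
        raise ValueError(f"not an HTTP URI: {uri}")
    return Request(uri, headers={"Accept": accept})


# The VIAF URI quoted elsewhere on this page, used here only as a sample.
request = build_dereference_request("http://viaf.org/viaf/89612684")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; the response would then need an RDF parser, which is outside the scope of this sketch.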

Linked Data is about communities agreeing on the semantics of their common data, adopting the naming patterns of other communities where their semantics agree, and mapping or extending those vocabularies when necessary. For example, the library community has a dozen semantic distinctions for the word "title" (Uniform Title, Spine Title, Running Title, etc.), but they can probably all map to Dublin Core Title. This allows other communities to take a piece of data marked as a bib:SpineTitle and know that it is strongly equivalent to the dc:title they have already been using. The community work that makes this all happen is one form of networking. The semantic mapping and data sharing across communities is another.
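
The title mapping described above can be sketched in a few lines. In this minimal example the `bib` namespace, the property names under it, and the sample record are hypothetical; only the Dublin Core namespace URI is real. Each specialized title property is declared a refinement of Dublin Core title, and a small function adds the broader statement so that a consumer who only understands Dublin Core can still use the data.

```python
# Hypothetical library vocabulary; only the Dublin Core namespace is real.
BIB = "http://example.org/bib/"
DC = "http://purl.org/dc/elements/1.1/"

# Each specialized title property is declared a refinement of dc:title.
SUB_PROPERTY_OF = {
    BIB + "UniformTitle": DC + "title",
    BIB + "SpineTitle": DC + "title",
    BIB + "RunningTitle": DC + "title",
}


def generalize(triples):
    """Return the triples plus one broader triple for each mapped property."""
    result = list(triples)
    for subject, predicate, value in triples:
        broader = SUB_PROPERTY_OF.get(predicate)
        if broader is not None:
            result.append((subject, broader, value))
    return result


# Illustrative record, not real catalog data.
record = [("http://example.org/book/1", BIB + "SpineTitle", "War & Peace")]
for triple in generalize(record):
    print(triple)
```

The original, more specific statement is kept alongside the broader one, so nothing is lost for consumers that do understand the library vocabulary.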

Networking also happens within the data itself: when we use URIs to name things, and use those URIs where we formerly used strings of content, we create a web of relations. Instead of using the composer name "Dmitri Shostakovich", which is subject to many misspellings, we can now use a VIAF URI (http://viaf.org/viaf/89612684) to identify him and have a much greater chance of spotting other references to Shostakovich when the same URI is used. Even when a different URI is used, there are ways to indicate that both URIs identify the same person. This weaving together of our data through URIs is a third form of networking.
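
A minimal sketch of why URIs match more reliably than name strings follows. The records, the variant spellings, and the second URI are made up for illustration; the VIAF URI is the one given in the text. Records are grouped by composer URI rather than by name, and a small alias table stands in for "same as" statements that link two URIs for one person.

```python
from collections import defaultdict

VIAF_SHOSTAKOVICH = "http://viaf.org/viaf/89612684"  # URI given in the text
OTHER_URI = "http://example.org/people/shostakovich"  # hypothetical second URI

# Illustrative records: the name strings differ, the identifiers do the work.
records = [
    {"work": "Symphony No. 5", "name": "Dmitri Shostakovich", "composer": VIAF_SHOSTAKOVICH},
    {"work": "String Quartet No. 8", "name": "Shostakovitch, D.", "composer": VIAF_SHOSTAKOVICH},
    {"work": "Symphony No. 7", "name": "Dmitrij Sostakovic", "composer": OTHER_URI},
]

# "Same as" statements declare that two URIs identify the same thing.
SAME_AS = {OTHER_URI: VIAF_SHOSTAKOVICH}


def canonical(uri):
    """Follow same-as links until a URI with no further alias is reached."""
    seen = set()
    while uri in SAME_AS and uri not in seen:
        seen.add(uri)
        uri = SAME_AS[uri]
    return uri


def works_by_composer(recs):
    """Group work titles under the canonical URI of their composer."""
    grouped = defaultdict(list)
    for rec in recs:
        grouped[canonical(rec["composer"])].append(rec["work"])
    return dict(grouped)


print(works_by_composer(records))
```

All three works end up under a single key even though no two name strings agree, which is the practical benefit of naming things with URIs.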

In our opinion, Schema.org is currently the best and simplest vocabulary to use as a starting point for marking up Linked Data. We are using it in all our new work, including the Virtual International Authority File (VIAF), WorldCat Identities and WorldCat.org.
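
To give a feel for Schema.org markup, the sketch below assembles a small JSON-LD description of a musical work whose composer is identified by a URI. It is an illustration only and is not the markup that VIAF, WorldCat Identities, or WorldCat.org actually publish; the helper name and the sample work are assumptions. The `MusicComposition` and `Person` types and the `composer` and `name` properties are Schema.org terms.

```python
import json


def composition_as_jsonld(title, composer_uri, composer_name):
    """Describe a musical work with Schema.org terms, as a JSON-LD dict."""
    return {
        "@context": "https://schema.org",
        "@type": "MusicComposition",
        "name": title,
        "composer": {
            "@id": composer_uri,  # the URI, not the name string, is the identity
            "@type": "Person",
            "name": composer_name,
        },
    }


# Sample values; the URI is the VIAF URI quoted on this page.
doc = composition_as_jsonld(
    "Symphony No. 5",
    "http://viaf.org/viaf/89612684",
    "Dmitri Shostakovich",
)
print(json.dumps(doc, indent=2))
```

Embedding output like this in a web page inside a `script` element of type `application/ld+json` is one common way to publish Schema.org data.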


Linked Data offers the potential for agencies and communities to publish information in a manner that permits far greater utility "in the flow" of the network. In particular, unexpected connections, uses and value may be realized by many parties, including parties with which the hosting/publishing agency might not normally have had contact.

OCLC Research is exploring Linked Data from a variety of angles: as a publisher, consumer, applications builder, and project partner, and through our Linked Data work with standards bodies such as the W3C. This work is shaping and informing OCLC Research's and OCLC's thinking and direction with respect to our prototypes, experimental datasets, products and services.

Related Projects

OCLC Research Projects

OCLC Production Projects


More Information

Most recent updates: Page content: 2016-04-04