The availability of Linked Data in WorldCat.org has everyone here very excited. We had been anxiously awaiting the chance to make use of this new feature in our own applications, and are now beginning to try it out.
You may have already seen the bookmarklets developed by OCLC Developer Network staff that show how to extract schema.org markup and send information to Goodreads or to a Patron-Driven Acquisitions system that accepts data via OpenURL, or the bookmarklet that extracts author URIs and uses them to query VIAF for links to DBPedia. If not, you should definitely check those out.
In the same spirit as the author URI bookmarklet, we recently added an experimental new display to the WorldCat Facebook app. If you select the “related people and topics” link for a title in a search result, you’ll see descriptive metadata from the schema.org markup, along with links to people who have Wikipedia pages and to related topics from Wikipedia. We’re using two other OCLC Linked Data services, VIAF for people and FAST for topics, as well as DBPedia and the WikiMedia API, to make those connections.
The WorldCat Facebook app is written in PHP, so we made use of the very handy ARC2 PHP semantic web library for this implementation. We start by formulating a URL for a worldcat.org page using the OCLC number, e.g., http://worldcat.org/oclc/57352812. We then use these ARC2 and other PHP methods to parse the RDFa semantic web markup from the retrieved HTML, serialize the results as RDF/JSON, and decode them into a PHP array:
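$url = 'http://worldcat.org/oclc/57352812'; // example OCLC number from above
$config = array('auto_extract' => 0); // we trigger the extraction ourselves
$parser = ARC2::getSemHTMLParser($config);
$parser->parse($url); // fetch and parse the HTML (step not shown in the original excerpt)
$parser->extractRDF('rdfa'); // extract only the RDFa markup (also not shown originally)
$triples = $parser->getTriples();
$json = $parser->toRDFJSON($triples); // serialize the triples as RDF/JSON
$jsonobj = json_decode($json, true); // true = decode into an associative array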
Once we have the semantic markup in an array, it’s pretty easy to obtain particular values of interest. For example, we can look in the JSON array for “schema:description” elements to get bibliographic notes. We look at other elements in the JSON array for any VIAF or FAST links that were returned.
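As a rough illustration of that lookup, here is a minimal sketch that collects every value for one property from the decoded array. It assumes the usual RDF/JSON layout (subject URI, then predicate URI, then a list of objects with "value" and "type" keys) and that predicates appear as full URIs such as http://schema.org/description rather than in the prefixed form. The helper name getPropertyValues and the sample data are hypothetical, not taken from the app.

```php
<?php
// Hypothetical helper: gather all object values for one predicate
// across every subject in a decoded RDF/JSON array.
function getPropertyValues(array $rdf, $predicate) {
    $values = array();
    foreach ($rdf as $subject => $predicates) {
        if (!isset($predicates[$predicate])) {
            continue; // this subject has no such property
        }
        foreach ($predicates[$predicate] as $object) {
            $values[] = $object['value'];
        }
    }
    return $values;
}

// Made-up sample standing in for $jsonobj; the note text and the
// example.org URI are placeholders, not real WorldCat data.
$sample = array(
    'http://www.worldcat.org/oclc/57352812' => array(
        'http://schema.org/description' => array(
            array('value' => 'Includes index.', 'type' => 'literal'),
        ),
        'http://schema.org/about' => array(
            array('value' => 'http://example.org/fast/placeholder', 'type' => 'uri'),
        ),
    ),
);

$notes = getPropertyValues($sample, 'http://schema.org/description');
$topics = getPropertyValues($sample, 'http://schema.org/about');
```

The same helper could be pointed at whichever properties carry the VIAF and FAST links, then filtered by URI prefix.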
To get Wikipedia headings for VIAF URIs, we first call VIAF to get its Linked Data RDF/XML response. We check the “owl:sameAs” statements in that response for links to DBPedia, a Linked Data representation of Wikipedia. The DBPedia links can then be called to obtain Wikipedia abstracts, thumbnail images and related links.
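The filtering step might look something like the sketch below. It assumes the VIAF response has already been parsed into ARC2-style triples, i.e. arrays with "s", "p" and "o" keys, and it matches DBPedia links by a simple substring test on the object URI. The function name findDbpediaLinks and the sample triples are hypothetical.

```php
<?php
// Full URI of the owl:sameAs property.
define('OWL_SAME_AS', 'http://www.w3.org/2002/07/owl#sameAs');

// Hypothetical helper: return the distinct DBPedia URIs that appear
// as objects of owl:sameAs statements.
function findDbpediaLinks(array $triples) {
    $links = array();
    foreach ($triples as $t) {
        if ($t['p'] === OWL_SAME_AS && strpos($t['o'], 'dbpedia.org') !== false) {
            $links[] = $t['o'];
        }
    }
    return array_values(array_unique($links));
}

// Made-up triples standing in for a parsed VIAF response.
$viafTriples = array(
    array('s' => 'http://example.org/viaf/person',
          'p' => OWL_SAME_AS,
          'o' => 'http://dbpedia.org/resource/Example_Person'),
    array('s' => 'http://example.org/viaf/person',
          'p' => OWL_SAME_AS,
          'o' => 'http://example.org/other-authority/123'),
    array('s' => 'http://example.org/viaf/person',
          'p' => 'http://xmlns.com/foaf/0.1/name',
          'o' => 'Example Person'),
);

$dbpediaLinks = findDbpediaLinks($viafTriples);
```

Each URI in $dbpediaLinks could then be fetched and parsed in the same way to pull out the abstract, thumbnail and related links.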
We try to get related Wikipedia headings for FAST URIs as well, though this approximation is still very experimental and can lead to some inaccurate results. We use the preferred label from the FAST response as a keyword search in the WikiMedia API to find related articles. We take the first match and use its DBPedia linked data to retrieve its preferred heading, thumbnail, abstract and links. Sometimes it’s a relevant match for the FAST heading, sometimes not. That’s something we’ll continue to experiment with.
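To make the FAST lookup a little more concrete, here is a sketch of the pieces that don't need a network call. It assumes the search goes to the English Wikipedia api.php endpoint with action=query and list=search, and that a DBPedia resource URI can be formed by replacing spaces in the article title with underscores (titles with unusual characters may need extra encoding). The function names and the sample response are hypothetical, and the actual app may well do this differently.

```php
<?php
// Build a keyword search URL from a FAST preferred label.
// The endpoint and parameters are assumptions, not taken from the app.
function buildWikiSearchUrl($label) {
    $params = array(
        'action'   => 'query',
        'list'     => 'search',
        'srsearch' => $label,
        'srlimit'  => 1,      // we only use the first match
        'format'   => 'json',
    );
    return 'https://en.wikipedia.org/w/api.php?' . http_build_query($params, '', '&');
}

// Pull the title of the first hit out of a decoded search response,
// or return null when there were no matches.
function firstSearchTitle(array $response) {
    if (empty($response['query']['search'][0]['title'])) {
        return null;
    }
    return $response['query']['search'][0]['title'];
}

// Map an article title to a DBPedia resource URI (simple cases only).
function titleToDbpediaUri($title) {
    return 'http://dbpedia.org/resource/' . str_replace(' ', '_', $title);
}

// Made-up response standing in for the decoded JSON from the search call.
$sampleResponse = array(
    'query' => array(
        'search' => array(
            array('title' => 'Semantic Web'),
        ),
    ),
);

$searchUrl = buildWikiSearchUrl('Semantic Web');
$title = firstSearchTitle($sampleResponse);
$dbpediaUri = ($title !== null) ? titleToDbpediaUri($title) : null;
```

Taking only the first hit keeps the lookup simple, which is also why the match is sometimes off: nothing here checks that the article is actually about the FAST heading.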
The source PHP code is available in the DevNet SVN repository. Take a look at its README.txt file for installation details.
Meanwhile, there are plenty of other interesting ideas to try out, and we’re looking forward to hearing about experiments and apps built by you as members of the OCLC Developer Network. Keep us posted!