Consuming Linked Data Using JavaScript
In our series on Learning Linked Data, we've covered several topics related to querying linked data using some SPARQL features and techniques, producing linked data using JSON-LD and, in our last post, using JavaScript to consume linked data.
This post will focus on consuming linked data on the client side using JavaScript libraries. There are a couple of reasons why interacting with linked data on the client side is useful. The primary reason is that when a client-side script manipulates the linked data, the user interface can expand the scope of the graph without reloading the whole display. JavaScript is also very useful for querying graphs with SPARQL and displaying the results.
How to do it
Given these advantages, what is necessary to consume linked data using client-side scripts?
The most basic steps to consuming linked data are:
- Load the data.
- Parse the data returned into a graph.
- Traverse the graph to display the data.
For readers interested in a greater level of detail beyond the snippets in this post, view the full code for the simple demonstration application that allows a graph associated with a given bibliographic record or a given work to be displayed.
Loading the Data
The first step in consuming linked data is to load the data. This step has two aspects. The first is retrieving data from a different server than the one that serves the client-side script. Normally, the browser's same-origin policy prevents this type of cross-origin request. However, CORS (Cross-Origin Resource Sharing) makes it possible. A linked data set that supports CORS can be accessed easily via JavaScript: its responses include an Access-Control-Allow-Origin header. If a linked data set doesn't properly support CORS, it can't be consumed using browser JavaScript alone; a proxy of some sort is necessary to avoid same-origin policy errors. For the code techniques I'm demonstrating here, the server with the data has to support CORS.
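To make the header check concrete, here is a minimal sketch of the rule a browser applies in the simplest case. The function name allowsOrigin and the plain-object shape of the headers are assumptions made for illustration, and the sketch ignores preflight requests and credentials.

```javascript
// Hypothetical helper: decide whether a response's CORS header permits a page
// served from `pageOrigin` to read it. `headers` is assumed to be a plain
// object mapping response header names to values.
function allowsOrigin(headers, pageOrigin) {
  var allowed = null;
  // Header names are case-insensitive, so compare them in lower case.
  Object.keys(headers).forEach(function (name) {
    if (name.toLowerCase() === 'access-control-allow-origin') {
      allowed = String(headers[name]).trim();
    }
  });
  if (allowed === null) {
    return false; // no CORS header: the browser blocks access to the response
  }
  // Either any origin is allowed, or the header names this page's origin.
  return allowed === '*' || allowed === pageOrigin;
}

console.log(allowsOrigin({ 'Access-Control-Allow-Origin': '*' }, 'http://localhost:8000')); // true
console.log(allowsOrigin({ 'Content-Type': 'application/rdf+xml' }, 'http://localhost:8000')); // false
```

In practice the browser performs this check itself; a script only sees the resulting error, which is why a missing header looks like a failed request.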
The second aspect of loading the data is getting the data in a format that is parseable. Linked data comes in several different serializations, and not all libraries can parse all of them. Most linked data sets support several serializations, but some do not. This can present an issue if the data returned isn't in a serialization the parsing library understands. Often, to get a particular serialization, a client needs to perform "content negotiation." This is done by sending an HTTP request with a specific Accept header. In the example below, I'm using the HTTP client within AngularJS to make the request to WorldCat for a graph for a specific OCLC number in the RDF/XML serialization.
Example
var request_url = 'http://experiment.worldcat.org/oclc/7977212.rdf';

$http({
  method: 'GET',
  url: request_url,
  // Ask the server for the RDF/XML serialization
  headers: { 'Accept': 'application/rdf+xml' }
}).then(function successCallback(response) {
  // create the graph and load the data
}, function errorCallback(response) {
  alert('Failed');
  console.log(response);
});
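The request above asks for a single serialization. Where a parser can handle several, the Accept header can list them in order of preference using q-values, and the Content-Type of the response then says which one the server chose. A minimal sketch follows; buildAcceptHeader and mediaTypeOf are hypothetical helper names, not part of AngularJS or rdflib.js.

```javascript
// Hypothetical helper: build an Accept header from serializations listed in
// order of preference. Later entries get lower q-values (weights).
function buildAcceptHeader(mediaTypes) {
  return mediaTypes.map(function (type, index) {
    // Never let the weight drop below 0.1, however long the list is.
    var q = Math.max(0.1, 1 - index * 0.1);
    return index === 0 ? type : type + ';q=' + q.toFixed(1);
  }).join(', ');
}

// Hypothetical helper: strip parameters such as "; charset=utf-8" from a
// Content-Type header so the bare media type can be passed to a parser.
function mediaTypeOf(contentType) {
  return String(contentType || '').split(';')[0].trim().toLowerCase();
}

var accept = buildAcceptHeader(['application/rdf+xml', 'text/turtle', 'application/n-triples']);
console.log(accept);
// application/rdf+xml, text/turtle;q=0.9, application/n-triples;q=0.8

console.log(mediaTypeOf('application/rdf+xml; charset=utf-8'));
// application/rdf+xml
```

Reading the format from the response rather than assuming it avoids handing a parser data in a serialization it was not told to expect.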
Parsing the Data
Once the data is loaded, it has to be parsed, and this requires a linked data library. Otherwise, JavaScript will treat the data returned as text, XML or JSON, depending on the serialization being retrieved. There are a few JavaScript libraries useful for parsing linked data. The library I've found the most useful is rdflib.js. This library enables clients to load linked data serialized as RDF/XML, Turtle, N-Triples, or RDFa. Unfortunately, the library isn't capable of parsing JSON-LD. Currently, there aren't any good options for parsing JSON-LD into a graph that can be queried or traversed. However, because many linked data sources support multiple serializations, this is rarely an issue.
In the example below, a graph object is created and the data is then parsed into it. Note that I have to tell the library the format of the data I've retrieved, in this case "application/rdf+xml".
Example
var uri = 'http://www.worldcat.org/oclc/7977212';
var kb = $rdf.graph();
$rdf.parse(response.data, kb, uri, 'application/rdf+xml');
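To show what "parsing into a graph" produces, independent of any library, here is a deliberately simplified sketch that reads a small subset of N-Triples into an array of triples. The function name parseSimpleNTriples and the example.org data are invented for illustration; the sketch handles only IRIs and plain string literals (no blank nodes, language tags, datatypes or escape sequences), so it is no substitute for a real parser.

```javascript
// Hypothetical, simplified N-Triples reader: one triple per line.
function parseSimpleNTriples(text) {
  // <subject> <predicate> then either <object IRI> or "plain literal", then "."
  var pattern = /^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)")\s*\.$/;
  var triples = [];
  text.split('\n').forEach(function (line) {
    var match = pattern.exec(line.trim());
    if (!match) {
      return; // skip blank lines, comments, and anything outside this subset
    }
    triples.push({
      subject: match[1],
      predicate: match[2],
      // Record whether the object is another resource or a literal value.
      object: match[3] !== undefined
        ? { type: 'iri', value: match[3] }
        : { type: 'literal', value: match[4] }
    });
  });
  return triples;
}

var sample = [
  '<http://example.org/book/1> <http://schema.org/name> "An Example Title" .',
  '<http://example.org/book/1> <http://schema.org/creator> <http://example.org/person/1> .'
].join('\n');

var triples = parseSimpleNTriples(sample);
console.log(triples.length);          // 2
console.log(triples[0].object.value); // An Example Title
```

The point is the shape of the result: once the text has become subject, predicate and object records, it can be queried rather than searched as a string.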
Traversing the Graph
Once the data is parsed, then rdflib.js can be used to access specific properties within the graph. The rdflib.js library uses the linked data concepts of subject, object and predicate to allow particular properties to be selected using the URI for the subject and the URI for the predicate.
The example below starts with a graph and extracts several values from it, including
- name
- author, and
- subjects.
Example
var SCHEMA_NAME = $rdf.sym('http://schema.org/name');

// Title of the work
$scope.name = kb.the($rdf.sym(uri), SCHEMA_NAME).value;

// Follow the creator link, then read the creator's name
$scope.author = kb.the(
  kb.the($rdf.sym(uri), $rdf.sym('http://schema.org/creator')),
  SCHEMA_NAME
).value;

// Collect the name node of every subject that has one
var subjectNodes = kb.each($rdf.sym(uri), $rdf.sym('http://schema.org/about'));
var subjects = [];
for (var i = 0; i < subjectNodes.length; i++) {
  var subjectName = kb.the(subjectNodes[i], SCHEMA_NAME);
  if (subjectName) {
    subjects.push(subjectName); // the node itself; its text is in .value
  }
}
$scope.subjects = subjects;
I'm doing some extra parsing and manipulation here to the subjects because I want the subject names in an array.
Once I have all these values in the $scope variable, I can easily print them to the screen.
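The traversal pattern used here (look up by subject and predicate, then follow the result as a new subject) can be shown without any library. The sketch below uses a plain array as a stand-in for the graph; eachObject and firstObject are hypothetical names that only imitate the idea of the library's lookups, and the example.org data is invented.

```javascript
// Hypothetical mini-graph: an array of {subject, predicate, object} triples.
var NAME = 'http://schema.org/name';
var ABOUT = 'http://schema.org/about';
var book = 'http://example.org/book/1';

var graph = [
  { subject: book, predicate: NAME, object: 'An Example Title' },
  { subject: book, predicate: ABOUT, object: 'http://example.org/topic/1' },
  { subject: book, predicate: ABOUT, object: 'http://example.org/topic/2' },
  { subject: 'http://example.org/topic/1', predicate: NAME, object: 'Linked data' }
  // topic/2 deliberately has no name
];

// All objects for a subject/predicate pair.
function eachObject(graph, subject, predicate) {
  return graph.filter(function (t) {
    return t.subject === subject && t.predicate === predicate;
  }).map(function (t) { return t.object; });
}

// First object for a subject/predicate pair, or null when there is none.
function firstObject(graph, subject, predicate) {
  var found = eachObject(graph, subject, predicate);
  return found.length > 0 ? found[0] : null;
}

// Follow book -> about -> name, skipping topics that have no name.
var subjectNames = eachObject(graph, book, ABOUT)
  .map(function (topic) { return firstObject(graph, topic, NAME); })
  .filter(function (name) { return name !== null; });

console.log(firstObject(graph, book, NAME)); // An Example Title
console.log(subjectNames);                   // [ 'Linked data' ]
```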
Loading Data on the Fly
The example I've chosen thus far retrieves a specific graph and displays aspects of it to the user. But client-side linked data manipulation becomes more powerful when it is used to expand the graph or to load multiple graphs on the fly. Let's take a closer look at a specific example of this. I want my application to show me more detailed information about an author from VIAF on the fly. To do this, I'm going to use ngDialog to make a nice popup box. An ng-click directive triggers the "openAuthor" function, which makes a request to VIAF for the additional information and loads it into a dialog box.
Example
$scope.openAuthor = function () {
  var uri = $scope.author_id;
  // The request goes to localhost:3000 with the author's URI appended
  var request_url = 'http://localhost:3000/' + uri + '/rdf.xml';
  var SCHEMA = $rdf.Namespace('http://schema.org/');
  var kb = $rdf.graph();

  $http({
    method: 'GET',
    url: request_url,
    headers: { 'Accept': 'application/rdf+xml' }
  }).then(function successCallback(response) {
    // Parse the author's graph, then pull out names and dates
    $rdf.parse(response.data, kb, uri, 'application/rdf+xml');
    $scope.names = kb.each($rdf.sym(uri), SCHEMA('name'));
    $scope.birthDate = kb.the($rdf.sym(uri), SCHEMA('birthDate')).value;
    $scope.deathDate = kb.the($rdf.sym(uri), SCHEMA('deathDate')).value;
  }, function errorCallback(response) {
    alert('Failed');
    console.log(response);
  });

  // Open the dialog, sharing this scope with its template
  ngDialog.open({
    template: 'author-template.html',
    className: 'ngdialog-theme-default',
    scope: $scope
  });
};
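In the function above, .value is read directly from each lookup result. If an author record had no death date, and assuming the lookup then returns null or undefined (an assumption about the library's behavior that is worth verifying), that line would throw a TypeError. A small guard avoids this; valueOr is a hypothetical helper name, and the objects below are stand-ins for lookup results rather than real rdflib.js nodes.

```javascript
// Hypothetical helper: read `.value` from a lookup result that may be missing.
function valueOr(node, fallback) {
  return node !== null && node !== undefined ? node.value : fallback;
}

// Stand-ins for lookup results: a literal-like object and a missing match.
var birthNode = { value: '1900' };
var deathNode = undefined; // e.g. a record with no death date

console.log(valueOr(birthNode, 'unknown')); // 1900
console.log(valueOr(deathNode, 'unknown')); // unknown
```

With such a helper, the assignment would read something like `$scope.deathDate = valueOr(kb.the(...), '')`, leaving the rest of the function unchanged.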
General Observations
When I started working with linked data in JavaScript, I didn't think it would be remarkably different from working with linked data in other programming languages. However, after creating my first application, I realized there were some significant differences and hurdles to consuming linked data using client-side JavaScript. The two largest hurdles turned out to be things that were beyond my control as a developer.
CORS Support
The first hurdle was CORS support in linked data endpoints. Most of the sets of open library linked data do not support CORS. This effectively renders them useless for JavaScript consumption without a proxy. Fortunately, there are some simple and effective proxying options. CORS Anywhere is a module for node.js that can be used to run your own proxy server very simply and effectively. It took me less than five minutes to set up my own proxy.
Alternatively, there are a few free hosted proxies that one might consider using for testing purposes.
For a production service, you'll probably want to host your own proxy to ensure it can handle the load your application will be creating.
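Switching between a local proxy, a hosted proxy and direct access is easier if the proxy address lives in one place. The sketch below assumes a proxy that expects the target URL appended to its own base URL, which is the pattern the openAuthor request earlier in this post uses; viaProxy, the PROXY value and the example.org URL are hypothetical.

```javascript
// Hypothetical helper: route a request through a proxy that expects the
// target URL appended to its base URL. Pass an empty proxyBase to call the
// target directly (for data sources that already support CORS).
function viaProxy(targetUrl, proxyBase) {
  if (!proxyBase) {
    return targetUrl;
  }
  // Ensure exactly one slash between the proxy base and the target URL.
  return proxyBase.replace(/\/+$/, '') + '/' + targetUrl;
}

var PROXY = 'http://localhost:3000'; // assumed address of a locally run proxy

console.log(viaProxy('http://example.org/data/1/rdf.xml', PROXY));
// http://localhost:3000/http://example.org/data/1/rdf.xml
console.log(viaProxy('http://example.org/data/1/rdf.xml', ''));
// http://example.org/data/1/rdf.xml
```

Moving from a test proxy to a self-hosted one then means changing a single configuration value.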
Linked Data Libraries in JavaScript
The second hurdle is that the libraries for processing linked data using JavaScript are not as robust and well supported as those in other programming languages. The best list I could find of JavaScript libraries was from the RDFJS W3C community group.
After reviewing several, I narrowed it down to RDF-Ext and rdflib.js. While development on RDF-Ext is more active, the library is undergoing significant changes at this point in time. Also, the limited documentation and the latest version of the code don't appear to be in sync. As a result, I decided to use rdflib.js. This library seems extremely powerful, but it lacks thorough documentation. My development efforts would have been greatly sped up by an API reference that listed the classes and methods along with their parameters. Instead, I spent a lot of time on trial and error based on old or limited examples for the library.
Working with rdflib.js, I also discovered that it lacks JSON-LD support. At least I'm fairly sure it lacks support, based on my reading and testing. So if the data I wanted to consume was ONLY available in this format, rdflib.js would not parse it. I looked for another library to deal with JSON-LD and was able to find jsonld.js. This library has a reasonable amount of documentation but seems to be only able to read in and render JSON-LD in its different output formats: framed, compact, expanded, etc. It didn't meet my use case of pulling JSON-LD into a graph or a store object that I could query by subject, object or predicate.
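To illustrate the gap described here, the sketch below shows roughly what "pulling JSON-LD into something queryable" involves for the simplest case. It is not what the demonstration application does. It assumes the input is already in expanded, flattened JSON-LD form (every node at the top level, full IRIs as keys, values as arrays of @id or @value objects) and ignores blank nodes, lists, language tags and datatypes. The function name and the example.org data are hypothetical.

```javascript
// Hypothetical helper: turn a simplified expanded, flattened JSON-LD document
// into {subject, predicate, object} triples.
function expandedJsonLdToTriples(nodes) {
  var RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
  var triples = [];
  nodes.forEach(function (node) {
    var subject = node['@id'];
    if (!subject) {
      return; // nodes without an identifier are out of scope for this sketch
    }
    // "@type" entries become rdf:type triples.
    (node['@type'] || []).forEach(function (type) {
      triples.push({ subject: subject, predicate: RDF_TYPE, object: type });
    });
    Object.keys(node).forEach(function (key) {
      if (key.charAt(0) === '@' || !Array.isArray(node[key])) {
        return; // keywords were handled above; skip unexpected shapes
      }
      node[key].forEach(function (value) {
        if (value && typeof value === 'object' && '@id' in value) {
          triples.push({ subject: subject, predicate: key, object: value['@id'] });
        } else if (value && typeof value === 'object' && '@value' in value) {
          triples.push({ subject: subject, predicate: key, object: value['@value'] });
        }
      });
    });
  });
  return triples;
}

var doc = [{
  '@id': 'http://example.org/book/1',
  '@type': ['http://schema.org/Book'],
  'http://schema.org/name': [{ '@value': 'An Example Title' }],
  'http://schema.org/creator': [{ '@id': 'http://example.org/person/1' }]
}];

var jsonLdTriples = expandedJsonLdToTriples(doc);
console.log(jsonLdTriples.length);    // 3
console.log(jsonLdTriples[1].object); // An Example Title
```

Real-world JSON-LD usually arrives compacted with a context, so it would first need to be expanded and flattened by a JSON-LD processor before a conversion like this could apply.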
Dependency Management
The last hurdle for me was effectively using the code libraries I wanted to build my application with. I decided to create an application using AngularJS, which loaded the data and enabled me to expand the graph at will. This meant I needed to load several code libraries: AngularJS, ngDialog and rdflib.js. Learning how to perform dependency management of JavaScript code libraries effectively and efficiently turned out to be its own challenge.
Final Thoughts
There are significant advantages to working with linked data using a client-side library. However, the lack of CORS support and the lack of good, robust, well-documented libraries make it a fairly challenging endeavor at this point in time. Clearly, I'm not the only individual interested in this topic, because the W3C has a community group specifically devoted to RDF and JavaScript (https://www.w3.org/community/rdfjs/). The latest version of RDF-Ext (0.3.0) also seems promising, though readable documentation with clear examples is needed. Hopefully, the community can help make the environment for linked data production and consumption with JavaScript friendlier by developing or extending tools and documentation in this area.
Karen Coombs
Senior Product Analyst