Working with Graphs Without a SPARQL Endpoint

In my last blog post, I talked about using the FROM clause to query data from specific URIs rather than an entire dataset. In this post, I'll go into more detail about techniques for running SPARQL queries when a dataset doesn't offer a SPARQL endpoint.

Using FROM

The first technique for interacting with data that doesn't have a SPARQL endpoint is to use FROM clauses within the query. To do this, you need a SPARQL processor. For demonstration purposes, I use Fuseki version 1.3, which comes with an HTML interface for submitting SPARQL queries. A version of it is also deployed at http://www.sparql.org/sparql.html.
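A SPARQL processor can usually be called from code as well as through an HTML form. Here is a minimal Ruby sketch, assuming the processor accepts SPARQL protocol GET requests with the query text in a `query` parameter. The endpoint URL and the helper names `sparql_request_uri` and `run_sparql` are placeholders of mine, not part of Fuseki, so adjust them to your own setup.

```ruby
require 'uri'
require 'net/http'

# Assumed endpoint URL; substitute the address of your own SPARQL processor.
ENDPOINT = 'http://www.sparql.org/sparql'

# Build a SPARQL protocol GET URL: the query text travels in the "query" parameter.
def sparql_request_uri(endpoint, query)
  uri = URI(endpoint)
  uri.query = URI.encode_www_form(query: query)
  uri
end

# Send the query and return the raw response body.
# Asks for SPARQL JSON results; the processor may answer in another format.
def run_sparql(endpoint, query)
  uri = sparql_request_uri(endpoint, query)
  request = Net::HTTP::Get.new(uri)
  request['Accept'] = 'application/sparql-results+json'
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request).body
  end
end

# Only builds and prints the URL; call run_sparql to actually send it.
puts sparql_request_uri(ENDPOINT, 'SELECT * WHERE { ?s ?p ?o } LIMIT 1')
```

Keeping the URL building separate from the HTTP call makes it easy to inspect exactly what will be sent before any request goes out.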

Using FROM, you can specify within the query which URIs you want to include data from. Below is an example query that interacts with a WorldCat work entity and all the related bibliographic data graphs.

PREFIX schema: <http://schema.org/>
SELECT ?isbn
FROM <http://experiment.worldcat.org/entity/work/data/67201841>
FROM <http://www.worldcat.org/oclc/660967222>
FROM <http://www.worldcat.org/oclc/326799313>
FROM <http://www.worldcat.org/oclc/886700555>
FROM <http://www.worldcat.org/oclc/431591368>
FROM <http://www.worldcat.org/oclc/188235414>
FROM <http://www.worldcat.org/oclc/234290815>
FROM <http://www.worldcat.org/oclc/718524646>
FROM <http://www.worldcat.org/oclc/154684429>
FROM <http://www.worldcat.org/oclc/756523287>
FROM <http://www.worldcat.org/oclc/421705147>
FROM <http://www.worldcat.org/oclc/481822502>
FROM <http://www.worldcat.org/oclc/886581845>
FROM <http://www.worldcat.org/oclc/804438865>
FROM <http://www.worldcat.org/oclc/493390155>
FROM <http://www.worldcat.org/oclc/876558200>
FROM <http://www.worldcat.org/oclc/884006311>
FROM <http://www.worldcat.org/oclc/768470693>
FROM <http://www.worldcat.org/oclc/354466211>
FROM <http://www.worldcat.org/oclc/255474401>
FROM <http://www.worldcat.org/oclc/82671871>
FROM <http://www.worldcat.org/oclc/754004270>
FROM <http://www.worldcat.org/oclc/850798262>
FROM <http://www.worldcat.org/oclc/717007464>
FROM <http://www.worldcat.org/oclc/439080694>
FROM <http://www.worldcat.org/oclc/768120530>
FROM <http://www.worldcat.org/oclc/840398604>
WHERE
{
 ?s schema:isbn ?isbn.
}

The query returns all the ISBNs related to the WorldCat work by gathering them from the related bibliographic records. This technique is only effective if you know ahead of time which URIs you need to interact with. Also, while you could run Fuseki purely as a proxy SPARQL endpoint for your code to interact with, performing the SPARQL within your own code may be a more efficient approach.
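When the list of URIs is only known at run time, the FROM clauses can be generated rather than typed by hand. Here is a small Ruby sketch of that idea. The helper name `isbn_query` is made up for illustration, and the two URIs in the usage lines are taken from the query above.

```ruby
# Build a SELECT query whose default graph is the merge of the given graph URIs.
def isbn_query(graph_uris)
  lines = ['PREFIX schema: <http://schema.org/>', 'SELECT ?isbn']
  # One FROM clause per graph; each URI must be wrapped in angle brackets.
  lines += graph_uris.map { |uri| "FROM <#{uri}>" }
  lines += ['WHERE', '{', ' ?s schema:isbn ?isbn.', '}']
  lines.join("\n")
end

puts isbn_query([
  'http://experiment.worldcat.org/entity/work/data/67201841',
  'http://www.worldcat.org/oclc/660967222'
])
```

The resulting string can be pasted into a SPARQL processor's query form or sent to it from code.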

Loading Data in Memory

One way to do this is to load the desired data into a local repository in memory and perform SPARQL on it. This is really what Fuseki is doing behind the scenes. Unfortunately, SPARQLing a repository in memory isn't possible in every programming language. The EasyRDF PHP library for working with linked data doesn't support this feature. However, the Ruby libraries for interacting with linked data do support this. Below is an example that exactly replicates the SPARQL FROM query done above.

require 'rest-client'
require 'rdf'
require 'rdf/rdfxml'
require 'sparql'

# URI that identifies the work inside its own graph (the subject of schema:workExample).
work_uri = RDF::URI('http://worldcat.org/entity/work/id/67201841')
url = 'http://experiment.worldcat.org/entity/work/data/67201841'
resource = RestClient::Resource.new url
response = resource.get(:accept => 'application/rdf+xml')

# Load the work graph into an in-memory repository.
work_group_store = RDF::Repository.new.from_rdfxml(response.body)

# The related bibliographic record URIs are the objects of schema:workExample.
work_examples = work_group_store.query([work_uri,
    RDF::URI('http://schema.org/workExample'), nil])
work_examples = work_examples.map{|work_example| work_example.object.to_s}

query = "PREFIX schema: <http://schema.org/>
SELECT ?isbn
FROM <#{url}>
"

# Add one FROM clause per bibliographic record.
work_examples.each{|work_example|
    query += "FROM <#{work_example}>\n"
}

query += 'WHERE
{
?s schema:isbn ?isbn.
}'

# The FROM clauses name the graphs to fetch and merge, so the repository starts out empty.
# This assumes the sparql gem dereferences FROM URIs itself; check the version you have installed.
results = SPARQL.execute(query, RDF::Repository.new)
isbns = results.map{|result| result.isbn.value}

 

A More Realistic Ruby Example

It is important to note that the Ruby example above starts with a WorldCat work URI and uses it to obtain the URIs of all the related bibliographic data. It then builds a SPARQL query on the fly that contains a FROM clause for each of those bibliographic data graphs. This isn't the most efficient way to perform this task in Ruby. The more efficient approach loads the WorldCat work graph into an in-memory triple store, gets the URIs for the corresponding bibliographic data graphs, iterates through those graphs and loads them into the same store, and then queries that store for the ISBNs. That code looks like this.

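require 'rest-client'
require 'rdf'
require 'rdf/rdfxml'
require 'sparql'

# URI that identifies the work inside its own graph (the subject of schema:workExample).
work_uri = RDF::URI('http://worldcat.org/entity/work/id/67201841')
url = 'http://experiment.worldcat.org/entity/work/data/67201841'
resource = RestClient::Resource.new url
response = resource.get(:accept => 'application/rdf+xml')

# Load the work graph into an in-memory repository.
work_group_store = RDF::Repository.new.from_rdfxml(response.body)

# The related bibliographic record URIs are the objects of schema:workExample.
work_examples = work_group_store.query([work_uri,
    RDF::URI('http://schema.org/workExample'), nil])
work_examples = work_examples.map{|work_example| work_example.object.to_s}

# Fetch each bibliographic graph and add its statements to the same repository.
work_examples.each{|work_example|
    work_example_resource = RestClient::Resource.new work_example
    work_example_response = work_example_resource.get(:accept => 'application/rdf+xml')
    RDF::Reader.for(:rdfxml).new(work_example_response.body) do |reader|
        reader.each_statement do |statement|
            work_group_store.insert(statement)
        end
    end
}

# Query the merged data for every ISBN.
results = SPARQL.execute("SELECT ?isbn WHERE {?s <http://schema.org/isbn> ?isbn.}",
    work_group_store)
isbns = results.map{|result| result.isbn.value}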

While these approaches are different from one another, the important takeaway is that just because a SPARQL endpoint doesn't exist doesn't mean the data isn't SPARQLable. Loading the data into memory on the fly and SPARQLing it can be a viable alternative if the dataset you are working with is small enough.
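To make the load-then-query idea concrete without any network access or gems, here is a toy sketch in plain Ruby. `TinyStore` is a stand-in I made up for an in-memory triple store; it is not the RDF.rb API, and the example.org URIs and ISBN strings are invented sample data. A pattern with nil in a position plays the role of a SPARQL variable.

```ruby
# A toy stand-in for an in-memory triple store (not the RDF.rb API).
class TinyStore
  def initialize
    @triples = []
  end

  # Merge a graph (an array of [subject, predicate, object] triples) into the store.
  def load(graph)
    @triples |= graph
    self
  end

  # Return the triples matching the pattern; nil acts as a wildcard.
  def match(subject, predicate, object)
    @triples.select do |s, p, o|
      (subject.nil? || subject == s) &&
        (predicate.nil? || predicate == p) &&
        (object.nil? || object == o)
    end
  end
end

WORK_EXAMPLE = 'http://schema.org/workExample'
ISBN = 'http://schema.org/isbn'

# Invented sample data standing in for a work graph and two bibliographic graphs.
work_graph = [
  ['http://example.org/work/1', WORK_EXAMPLE, 'http://example.org/bib/1'],
  ['http://example.org/work/1', WORK_EXAMPLE, 'http://example.org/bib/2']
]
bib_graphs = {
  'http://example.org/bib/1' => [['http://example.org/bib/1', ISBN, '0000000001']],
  'http://example.org/bib/2' => [['http://example.org/bib/2', ISBN, '0000000002']]
}

store = TinyStore.new.load(work_graph)

# Follow the workExample links and load each linked graph into the same store.
# A real program would fetch these graphs over HTTP instead of reading a hash.
store.match('http://example.org/work/1', WORK_EXAMPLE, nil).each do |_, _, bib_uri|
  store.load(bib_graphs.fetch(bib_uri))
end

# With everything merged, one pattern finds every ISBN.
isbns = store.match(nil, ISBN, nil).map { |_, _, isbn| isbn }
puts isbns
```

Because the whole store lives in a single array, this only scales to small datasets, which is the same trade-off noted above.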

In my next post, I will talk about using FILTER, OPTIONAL and UNION within SPARQL.

    Karen Coombs

    Senior Product Analyst