Executing a SPARQL Query from QueryPath
Submitted by matt on Thu, 2009-05-28 18:07
The Semantic Web. It is a concept that has sparked heated debate for years. While the debate may continue to rage for some time, there are already a host of technologies that can be used to build advanced applications based on XML technology. In this article, we will see how the SPARQL query language can be used to retrieve XML information from remote semantic databases (usually called SPARQL endpoints).
QueryPath already contains all of the tools necessary for running a SPARQL query and handling the results. This is not because QueryPath has been specially fitted to the task, but because SPARQL uses technologies that are widely supported: XML and HTTP. Since QueryPath can be used to make HTTP requests and then digest the XML results, we can use it to execute SPARQL queries and handle the results.
In this article, we will look at a basic SPARQL query, and see how we can use QueryPath to execute it and parse the returned results.
While SPARQL will be introduced here, it is far too robust a language to be explained in a short article. One starting point is the SPARQL Working Group home page.
The queries presented in this article will be run against DBPedia, a semantic version of Wikipedia. It makes structured information extracted from Wikipedia available as semantic content.
The SPARQL Query: A Brief Anatomy
Let's begin by looking at the SPARQL query that we will be running:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?uri ?name ?label
WHERE {
  ?uri foaf:name ?name .
  ?uri rdfs:label ?label
  FILTER (?name = "The Beatles")
  FILTER (lang(?label) = "en")
}
The query begins with two prefix declarations:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
These lines define two prefixes: one for the FOAF namespace (foaf:) and one for the RDF Schema namespace (rdfs:). Now, whenever we need to represent entities from those two schemata, we can just use the short prefix instead of the full URL.
The next part of the code above is the actual query:
SELECT ?uri ?name ?label
WHERE {
  ?uri foaf:name ?name .
  ?uri rdfs:label ?label
  FILTER (?name = "The Beatles")
  FILTER (lang(?label) = "en")
}
If you have written SQL before, this should look vaguely familiar. It functions similarly to a SQL SELECT operation. Here's what the code above does, phrased in plain English:
- Select the uri, name, and label
- where...
- the uri has the name ?name (or, where the uri's name is stored in ?name)
- the uri has a label ?label
- the name is "The Beatles"
- the language of the label is English
First, remember that the URI (?uri) is just a unique identifier. It functions somewhat like a primary key for each object we query.
Second, the items that begin with question marks (?) are variables. Their value is assigned when the query is being executed.
Third, the items in the WHERE clause are not simply restrictive, as they are in SQL. In fact, the purpose of lines 3 and 4 of the query above (the two lines that begin with ?uri) isn't so much to limit the items returned as to express a relationship between items. The general pattern of lines 3 and 4 is:
?subject ?relationship ?object
So ?uri foaf:name ?name can be understood to mean "some object ID (subject) is named (relationship) some name (object)". As you may have guessed, foaf:name expresses the relationship "is named". Likewise, rdfs:label expresses the relationship "is labeled".
Assuming that we did not have the two FILTER functions, the query would simply return all objects (together with their names and labels) that had a name and a label.
The FILTER function is used to limit what content is returned. Above, we used two filters:
FILTER (?name = "The Beatles")
FILTER (lang(?label) = "en")
The first filter requires that ?name match (exactly) the string "The Beatles". Keep in mind that a given item may have multiple foaf:name items. The filter need only match one of the items.
The second filter requires that the label's language be English. RDFS labels in the DBPedia database tend to have attributes indicating the language of the label. We are only interested in the English language content. In the query above, if we omit this filter, we will see results in Chinese, German, and Spanish, as well as other languages.
Putting this all together, then, our query will return the URI, the name, and the label for any URIs in the database that...
- Have a name
- Have a label
- Have a name that is "The Beatles"
- Have a label that is in English.
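The four conditions above can be made concrete with a small in-memory sketch. This is not how DBPedia or any SPARQL engine works internally; it only mimics the matching logic in plain PHP over a hand-written array of triples. The sample data, the example.org URIs, the helper name match_query(), and the array layout used for literals are all assumptions made for illustration.

```php
<?php
// Full URIs that the foaf:name and rdfs:label prefixes expand to.
const FOAF_NAME  = 'http://xmlns.com/foaf/0.1/name';
const RDFS_LABEL = 'http://www.w3.org/2000/01/rdf-schema#label';

// Invented sample data: each triple is [subject, relationship, object].
// Literal objects carry a 'value' and a (possibly empty) 'lang' tag.
$triples = [
  ['http://example.org/band/1', FOAF_NAME,  ['value' => 'The Beatles', 'lang' => '']],
  ['http://example.org/band/1', FOAF_NAME,  ['value' => 'Beatles', 'lang' => '']],
  ['http://example.org/band/1', RDFS_LABEL, ['value' => 'The Beatles', 'lang' => 'en']],
  ['http://example.org/band/1', RDFS_LABEL, ['value' => 'The Beatles', 'lang' => 'de']],
  ['http://example.org/band/2', FOAF_NAME,  ['value' => 'The Rolling Stones', 'lang' => '']],
  ['http://example.org/band/2', RDFS_LABEL, ['value' => 'The Rolling Stones', 'lang' => 'en']],
];

// Hypothetical helper that mimics the query's two patterns and two filters.
function match_query(array $triples, string $wantedName, string $wantedLang): array {
  $rows = [];
  foreach ($triples as [$uri, $rel1, $name]) {
    // Pattern: ?uri foaf:name ?name
    if ($rel1 !== FOAF_NAME) {
      continue;
    }
    // FILTER (?name = "...") -- simplified here to a comparison of the text only.
    if ($name['value'] !== $wantedName) {
      continue;
    }
    foreach ($triples as [$uri2, $rel2, $label]) {
      // Pattern: ?uri rdfs:label ?label, for the same ?uri as above.
      if ($rel2 !== RDFS_LABEL || $uri2 !== $uri) {
        continue;
      }
      // FILTER (lang(?label) = "...")
      if ($label['lang'] !== $wantedLang) {
        continue;
      }
      $rows[] = ['uri' => $uri, 'name' => $name['value'], 'label' => $label['value']];
    }
  }
  return $rows;
}

$rows = match_query($triples, 'The Beatles', 'en');
print_r($rows);
```

Note that the item with two names still matches: only one of its foaf:name values has to equal the wanted name. Dropping the language check in this sketch would return one row per label language, which mirrors the multi-language results described above.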
Running the Query
The query is, by far, the most complex aspect of our sample code. Here's what the entire code looks like:
<?php
require '../src/QueryPath/QueryPath.php';

// We are using the dbpedia database to execute a SPARQL query.

// URL to DB Pedia's SPARQL endpoint.
$url = 'http://dbpedia.org/sparql';

// The SPARQL query to run.
$sparql = '
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  SELECT ?uri ?name ?label
  WHERE {
    ?uri foaf:name ?name .
    ?uri rdfs:label ?label
    FILTER (?name = "The Beatles")
    FILTER (lang(?label) = "en")
  }
';

// We first set up the parameters that will be sent.
$params = array(
  'query' => $sparql,
  'format' => 'application/sparql-results+xml',
);

// DB Pedia wants a GET query, so we create one.
$data = http_build_query($params);
$url .= '?' . $data;

// Next, we simply retrieve, parse, and output the contents.
$qp = qp($url, 'head');

// Get the headers from the resulting XML.
$headers = array();
foreach ($qp->children('variable') as $col) {
  $headers[] = $col->attr('name');
}

// Get rows of data from result.
$rows = array();
$col_count = count($headers);
foreach ($qp->top()->find('results>result') as $row) {
  $cols = array();
  $row->children();
  for ($i = 0; $i < $col_count; ++$i) {
    $cols[$i] = $row->branch()->eq($i)->text();
  }
  $rows[] = $cols;
}

// Turn data into table.
$table = '<table><tr><th>' . implode('</th><th>', $headers) . '</th></tr>';
foreach ($rows as $row) {
  $table .= '<tr><td>';
  $table .= implode('</td><td>', $row);
  $table .= '</td></tr>';
}
$table .= '</table>';

// Add table to HTML document.
qp(QueryPath::HTML_STUB, 'body')->append($table)->writeHTML();
?>
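To see what the request and the response look like without QueryPath in the picture, the same two steps can be sketched with only PHP's built-in http_build_query() and DOM extension. This is an illustration under assumptions, not a replacement for the code above: the helper names build_sparql_url() and parse_sparql_results() are hypothetical, and the XML is a hand-written sample in the SPARQL Query Results XML Format rather than real DBPedia output. No network request is made.

```php
<?php
// Hypothetical helper: append the query as GET parameters to the endpoint URL.
function build_sparql_url(string $endpoint, string $sparql): string {
  $params = [
    'query'  => $sparql,
    'format' => 'application/sparql-results+xml',
  ];
  return $endpoint . '?' . http_build_query($params);
}

// Hypothetical helper: turn a results document into [headers, rows].
function parse_sparql_results(string $xml): array {
  $doc = new DOMDocument();
  $doc->loadXML($xml);

  // <head> lists the selected variables in order.
  $headers = [];
  foreach ($doc->getElementsByTagName('variable') as $var) {
    $headers[] = $var->getAttribute('name');
  }

  // Each <result> holds one <binding> per bound variable.
  $rows = [];
  foreach ($doc->getElementsByTagName('result') as $result) {
    $row = array_fill_keys($headers, '');
    foreach ($result->getElementsByTagName('binding') as $binding) {
      $row[$binding->getAttribute('name')] = trim($binding->textContent);
    }
    $rows[] = array_values($row);
  }
  return [$headers, $rows];
}

// Hand-written sample response (invented data).
$sample = <<<XML
<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="uri"/>
    <variable name="name"/>
    <variable name="label"/>
  </head>
  <results>
    <result>
      <binding name="uri"><uri>http://example.org/band/1</uri></binding>
      <binding name="name"><literal>The Beatles</literal></binding>
      <binding name="label"><literal xml:lang="en">The Beatles</literal></binding>
    </result>
  </results>
</sparql>
XML;

$url = build_sparql_url('http://dbpedia.org/sparql', 'SELECT ?s WHERE { ?s ?p ?o } LIMIT 1');
[$headers, $rows] = parse_sparql_results($sample);

// Build the table, escaping values before they are placed in HTML.
$table = '<table><tr><th>' . implode('</th><th>', array_map('htmlspecialchars', $headers)) . '</th></tr>';
foreach ($rows as $row) {
  $table .= '<tr><td>' . implode('</td><td>', array_map('htmlspecialchars', $row)) . '</td></tr>';
}
$table .= '</table>';

echo $url, "\n", $table, "\n";
```

Two choices in this sketch differ from the QueryPath version and are worth weighing rather than taking as given: bindings are looked up by variable name instead of by position, which keeps columns aligned if a result leaves a variable unbound, and cell values are passed through htmlspecialchars() before being written into the table.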