Generating SPARQL queries using rules

© Jean-Marc Vanel - $Date: 2012-02-13$ - under Creative Commons License by-nc-nd 3.0

1. Introduction: Rules manipulating rules

Rules are not only business rules; they can also manipulate other rules in a useful way.

In fact business rules are raw material for the building of business applications, just like ontologies and structural models. And, because we build business applications using rules (applicative and infrastructure rules), this means having rules manipulating rules.

We already have such rules manipulating rules to :

We need to have more rules manipulating rules to:

  1. transform N3 rules into Drools rules (currently implemented by ad-hoc Java code)
  2. generate SPARQL queries from an N3 rule base and a simplified N3 query (an N3 expression with triples including variables but no braces)

Point 1 will stay like this in the near future, as the current implementation is well tested and extensible.

Point 2 is necessary for automating access to Linked Open Data (see Roadmap for EulerGUI 2.1, and Sketch of a roadmap / The Déductions stuff in my blog).

2. Generating SPARQL queries using rules

We need to design and implement a backward-chaining engine for querying SPARQL databases; it will generate one big SPARQL query by recursively accumulating criteria, substituting bound variables, and renaming variables if necessary.

As an example, think of a family relationship rule base (see rules examples - Family relationships). The intermediate steps are written below in N3, but keep in mind that the actual inferencing will manipulate flattened N3 rule syntax, as explained below.

Starting from a simplified N3 query :

:John :hasUncle ?U 

By step 3 of the roadmap (section 4), this rule is picked:

{ ?X hasParent ?P .
  ?P hasBrother ?B .
} => {
  ?X hasUncle ?B }.

Still by step 3, substitute variable ?X with its value :John :

{ :John hasParent ?P .
  ?P hasBrother ?B .
} => {
  :John hasUncle ?B }.

By step 4 of the roadmap, the rule for hasBrother is picked, and its antecedent is substituted into the preceding rule:

{ :John hasParent ?P .
  ?P hasSibling ?B .
  ?B a Male.
} => {
  :John hasUncle ?B }.
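
The two substitution steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the planned implementation: triples are plain tuples of strings, the function names (match_head, substitute, expand) are hypothetical, the hasBrother rule is reconstructed from the result of the last step, and the uncle rule is encoded with ?B in both antecedent and consequent. Variable renaming is left out, so the sketch only works when rule variables do not clash with query variables. Because the whole rule head is matched against the query, the rule's ?B is mapped onto the query's ?U.

```python
def is_var(term):
    """Variables are written with a leading question mark, as in N3."""
    return term.startswith("?")

def match_head(head, goal):
    """Bind the variables of a rule head to the terms of a goal triple.

    Returns None when a constant of the head differs from the goal term.
    """
    binding = {}
    for h, g in zip(head, goal):
        if is_var(h):
            if binding.setdefault(h, g) != g:
                return None
        elif h != g:
            return None
    return binding

def substitute(triples, binding):
    """Replace every bound variable in the triples by its value."""
    return [tuple(binding.get(t, t) for t in triple) for triple in triples]

def expand(goals, rules):
    """Replace the first goal matching a rule head by that rule's antecedent."""
    for i, goal in enumerate(goals):
        for body, head in rules:
            binding = match_head(head, goal)
            if binding is not None:
                return goals[:i] + substitute(body, binding) + goals[i + 1:]
    return goals  # nothing left to expand

# Each rule is (antecedent triples, consequent triple).
UNCLE = ([("?X", "hasParent", "?P"), ("?P", "hasBrother", "?B")],
         ("?X", "hasUncle", "?B"))
# Hypothetical form of the hasBrother rule, inferred from the expansion result.
BROTHER = ([("?S", "hasSibling", "?M"), ("?M", "a", "Male")],
           ("?S", "hasBrother", "?M"))
RULES = [UNCLE, BROTHER]

query = [(":John", "hasUncle", "?U")]
step1 = expand(query, RULES)   # hasUncle replaced by hasParent + hasBrother
step2 = expand(step1, RULES)   # hasBrother replaced by hasSibling + Male
for triple in step2:
    print(" ".join(triple), ".")
```

Calling expand once more on step2 returns it unchanged, because no rule in this small rule base has hasParent, hasSibling or Male in its consequent.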

For property hasParent, a background declaration states that the corresponding triples are hosted in some SPARQL endpoint:

hasParent eg:inSPARQL_endpoint <http://mydatasource.org/> .

So, no need to recurse on property hasParent.

Similarly, for class Male, a background declaration states that the corresponding triples are hosted in some SPARQL endpoint:

Male eg:inSPARQL_endpoint <http://mydatasource.org/> .

Process goes on with expanding hasSibling .

When nothing is left to be expanded, it's time to use point 2 above to generate the final SPARQL query.
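
As a rough illustration of this last step, the sketch below turns a fully expanded list of triple patterns into the text of a SPARQL SELECT query. Several things here are assumptions made for the example: the function names (render, to_sparql), the namespace http://example.org/family#, and the fact that hasSibling is treated as hosted in the endpoint so that the example stops early. The uncle variable is written ?U, as in the initial query. The hosted table plays the role of the eg:inSPARQL_endpoint declarations: generation is refused while a property or class still needs expansion.

```python
def is_var(term):
    """Variables are written with a leading question mark."""
    return term.startswith("?")

def render(term):
    """Write a term in SPARQL syntax, using the default prefix for names."""
    if is_var(term) or term == "a":
        return term
    return ":" + term.lstrip(":")

def to_sparql(select, patterns, hosted, namespace):
    """Build a SELECT query from triple patterns that need no more expansion."""
    for _, predicate, obj in patterns:
        # For "x a Class" the class must be hosted, otherwise the property.
        name = obj if predicate == "a" else predicate
        if name not in hosted:
            raise ValueError(name + " still needs to be expanded")
    lines = ["PREFIX : <" + namespace + ">",
             "SELECT " + " ".join(select),
             "WHERE {"]
    lines += ["  " + " ".join(render(t) for t in triple) + " ."
              for triple in patterns]
    lines.append("}")
    return "\n".join(lines)

ENDPOINT = "http://mydatasource.org/"
# hasSibling is listed here only to keep the example short (assumption).
HOSTED = {"hasParent": ENDPOINT, "Male": ENDPOINT, "hasSibling": ENDPOINT}
PATTERNS = [(":John", "hasParent", "?P"),
            ("?P", "hasSibling", "?U"),
            ("?U", "a", "Male")]

QUERY = to_sparql(["?U"], PATTERNS, HOSTED, "http://example.org/family#")
print(QUERY)
```

The printed query has one triple pattern per accumulated criterion; sending it to the endpoint is outside the scope of this sketch.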

3. Implementation discussion

I used to think that, for point 2, the capabilities of Euler or CHR to handle unbound variables could be leveraged. But I changed my mind. Since I already have experience with SWRL, it is probably quite easy to first convert the N3 to SWRL. The good thing with SWRL is that it is plain RDF, so there is no possible confusion between input rules and transformation rules (this is actually a problem when processing N3 with N3). It is also easier to match rule parts when the rule is expressed as a pure RDF graph (SWRL or other).

So SWRL is just a quite simple RDF rule format, reasonably well documented (but I didn't have to read the documentation to write the SWRL ==> N3 converter rules). N3 is suitable for writing rules by hand, for proofreading by developers, and for direct interpretation by any of the EulerGUI engines.

Then the third idea, which looks even better, is to use the reified format generated by CWM.

Following Zeno's suggestion, I think reified N3 is better than SWRL as an intermediate format. First, the conversion is already done by CWM. Second, reified N3 is a direct representation of the original N3 rules.

As an example, here is a simple rule :

% cat  examples/BloodPressure.n3       

@prefix log: <http://www.w3.org/2000/10/swap/log#> .
@prefix math: <http://www.w3.org/2000/10/swap/math#>.
@prefix : <http://eulergui.sourceforge.net/examples#> .

{ :BloodPressure :val ?x.
  ?x math:greaterThan 70
} => { :Service112 :alert "true" } .

:BloodPressure :val 72 .

This is how CWM reduces it to plain triples (no more quoted graphs {}):

cwm examples/BloodPressure.n3 --reify
 ...
     @forSome Blo:_g0 .
      [      a log:Truth;
             :existentials  [
                 owl:oneOf () ];
             :statements  [
                 owl:oneOf  (
                 [
                         :object  [
                             :existentials  [
                                 owl:oneOf () ];
                             :statements  [
                                 owl:oneOf  (
                                 [
                                         :object  [
                                             :value "true" ];
                                         :predicate  [
                                             :uri "http://eulergui.sourceforge.net/examples#alert" ];
                                         :subject  [
                                             :uri "http://eulergui.sourceforge.net/examples#Service112" ] ] ) ];
                             :universals  [
                                 owl:oneOf () ] ];
                         :predicate  [
                             :uri "http://www.w3.org/2000/10/swap/log#implies" ];
                         :subject  [
                             :existentials  [
                                 owl:oneOf () ];
                             :statements  [
                                 owl:oneOf  (
                                 [
                                         :object Blo:_g0;
                                         :predicate  [
                                             :uri "http://eulergui.sourceforge.net/examples#val" ];
                                         :subject  [
                                             :uri "http://eulergui.sourceforge.net/examples#BloodPressure" ] ]
                                 [
                                         :object  [
                                             :value 70 ];
                                         :predicate  [
                                             :uri "http://www.w3.org/2000/10/swap/math#greaterThan" ];
                                         :subject Blo:_g0 ] ) ];
                             :universals  [
                                 owl:oneOf () ] ] ]
                 [
                         :object  [
                             :value 72 ];
                         :predicate  [
                             :uri "http://eulergui.sourceforge.net/examples#val" ];
                         :subject  [
                             :uri "http://eulergui.sourceforge.net/examples#BloodPressure" ] ] ) ];
             :universals  [
                 owl:oneOf  (
                Blo:_g0 ) ] ].
    
    Blo:_g0     a :BlankNode .
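
To show why this flattened form is easy to match against, here is a small Python sketch that walks a reified formula and recovers each rule as a pair of triple lists. It does not parse CWM output: the nested dictionaries are a hand-made, hypothetical transcription of the listing above (owl:oneOf lists become Python lists, the existentials and universals are left out), and the function names (term, triples, find_rules) are invented for the example.

```python
LOG_IMPLIES = "http://www.w3.org/2000/10/swap/log#implies"
EX = "http://eulergui.sourceforge.net/examples#"
MATH = "http://www.w3.org/2000/10/swap/math#"

def term(node):
    """Turn a reified term node back into a plain value."""
    if isinstance(node, str):       # a variable such as Blo:_g0
        return node
    if "uri" in node:
        return "<" + node["uri"] + ">"
    return node["value"]            # a literal

def triples(formula):
    """List the statements of a formula as (subject, predicate, object)."""
    return [(term(s["subject"]), term(s["predicate"]), term(s["object"]))
            for s in formula["statements"]]

def find_rules(formula):
    """Yield (antecedent, consequent) for every log:implies statement."""
    for s in formula["statements"]:
        predicate = s["predicate"]
        if isinstance(predicate, dict) and predicate.get("uri") == LOG_IMPLIES:
            yield triples(s["subject"]), triples(s["object"])

# Hand transcription of the reified BloodPressure example.
rule_statement = {
    "subject": {"statements": [
        {"subject": {"uri": EX + "BloodPressure"},
         "predicate": {"uri": EX + "val"},
         "object": "Blo:_g0"},
        {"subject": "Blo:_g0",
         "predicate": {"uri": MATH + "greaterThan"},
         "object": {"value": 70}},
    ]},
    "predicate": {"uri": LOG_IMPLIES},
    "object": {"statements": [
        {"subject": {"uri": EX + "Service112"},
         "predicate": {"uri": EX + "alert"},
         "object": {"value": "true"}},
    ]},
}
fact_statement = {
    "subject": {"uri": EX + "BloodPressure"},
    "predicate": {"uri": EX + "val"},
    "object": {"value": 72},
}
document = {"statements": [rule_statement, fact_statement]}

rules = list(find_rules(document))
antecedent, consequent = rules[0]
print(antecedent)
print(consequent)
```

The rule is found by a plain comparison on the predicate URI, and its antecedent and consequent come out as ordinary triple lists, which is the form needed for the matching and substitution steps.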

4. Roadmap

So the roadmap for this SPARQL generation feature could be:

  1. transform N3 rules into flattened rules;
    this is already done by CWM, but it needs an EulerGUI pipeline with CWM as first step, and probably Euler as second step (not yet implemented)
  2. transform one flattened rule into SPARQL;
    this needs an RDF vocabulary for SPARQL queries
  3. add recursion by finding (flattened) rules matching a simplified N3 query;
    substitute suitable variables with their constant value
  4. add recursion by substituting matching triples in the antecedent with their definition from a rule