Generating SPARQL queries using rules
© Jean-Marc Vanel - $Date: 2012-02-13$ - under Creative Commons License by-nc-nd 3.0
Rules are not only for business rules, they can also manipulate other rules in a useful way.
In fact, business rules are raw material for building business applications, just like ontologies and structural models. And because we build business applications using rules (applicative and infrastructure rules), this means having rules that manipulate rules.
We already have such rules manipulating rules to :
We need to have more rules manipulating rules to :
Point 1 will stay like this in the near future, as the current implementation is well tested and extensible.
Point 2 is necessary for automating access to Linked Open Data (see Roadmap for EulerGUI 2.1, and Sketch of a roadmap / The Déductions stuff in my blog).
We need to design and implement a backward engine for querying SPARQL databases; it will generate one big SPARQL query by recursively accumulating criteria terms, substituting bound variables, and renaming variables where necessary.
As an example, think of a family relationship rule base (see rules examples - Family relationships). The intermediate steps are written below in N3, but keep in mind that the actual inferencing will manipulate a flattened N3 rule syntax, as explained below.
Starting from a simplified N3 query :
:John :hasUncle ?U
By 3, this rule is picked:
{ ?X hasParent ?P . ?P hasBrother ?B . } => { ?X hasUncle ?B }.
By 3, substitute variable ?X with its value :John :
{ :John hasParent ?P . ?P hasBrother ?B . } => { :John hasUncle ?B }.
By 4, the rule for hasBrother is picked, and its antecedent is substituted into the preceding rule :
{ :John hasParent ?P . ?P hasSibling ?B . ?B a Male . } => { :John hasUncle ?B }.
For property hasParent, a background declaration states that the corresponding triples are hosted in some SPARQL endpoint:
hasParent eg:inSPARQL_endpoint <http://mydatasource.org/> .
So, no need to recurse on property hasParent.
Similarly, for class Male, a background declaration states that the corresponding triples are hosted in some SPARQL endpoint:
Male eg:inSPARQL_endpoint <http://mydatasource.org/> .
The process goes on by expanding hasSibling.
When nothing is left to be expanded, it's time to use point 2 above
to generate the final SPARQL query.
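The expansion walked through above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EulerGUI implementation: rules are held as plain tuples rather than flattened N3, only the first matching rule is tried (no alternatives, no UNION), hasSibling is assumed to be stored in the endpoint so that the recursion stops there, and the `http://example.org/family#` prefix and the function names are hypothetical.

```python
from itertools import count

# Assumption: rules are (head, body) pairs of triple patterns; "?" marks a variable.
# The head variable ?B is shared with the body so that the answer gets bound.
RULES = [
    (("?X", "hasUncle", "?B"),
     [("?X", "hasParent", "?P"), ("?P", "hasBrother", "?B")]),
    (("?X", "hasBrother", "?B"),
     [("?X", "hasSibling", "?B"), ("?B", "a", "Male")]),
]

# Properties and classes declared with eg:inSPARQL_endpoint.
# hasSibling is listed here only to keep the sketch short (assumption).
ENDPOINTS = {
    "hasParent": "http://mydatasource.org/",
    "Male": "http://mydatasource.org/",
    "hasSibling": "http://mydatasource.org/",
}


def is_var(term):
    return term.startswith("?")


def rename(rule, n):
    """Give the rule fresh variable names so they cannot clash with the goal's."""
    head, body = rule
    fresh = lambda t: f"{t}_{n}" if is_var(t) else t
    return tuple(map(fresh, head)), [tuple(map(fresh, tr)) for tr in body]


def unify(head, goal):
    """Bind the variables of a rule head to the terms of the goal, or return None."""
    subst = {}
    for h, g in zip(head, goal):
        if is_var(h):
            if subst.setdefault(h, g) != g:
                return None
        elif h != g:  # constants in the head must match the goal exactly
            return None
    return subst


def expand(goal, counter):
    """Rewrite a goal into triple patterns that are all stored in an endpoint."""
    _, p, o = goal
    if p in ENDPOINTS or (p == "a" and o in ENDPOINTS):
        return [goal]  # no need to recurse
    for rule in RULES:
        head, body = rename(rule, next(counter))
        subst = unify(head, goal)
        if subst is not None:
            patterns = []
            for triple in body:
                bound = tuple(subst.get(t, t) for t in triple)
                patterns.extend(expand(bound, counter))
            return patterns
    raise ValueError(f"no rule and no endpoint for {goal}")


def to_sparql(patterns, select):
    """Serialize the accumulated patterns as one SELECT query."""
    def term(t):
        return t if is_var(t) or t == "a" else ":" + t
    lines = [f"  {term(s)} {term(p)} {term(o)} ." for s, p, o in patterns]
    return ("PREFIX : <http://example.org/family#>\n"
            f"SELECT {' '.join(select)} WHERE {{\n" + "\n".join(lines) + "\n}")


patterns = expand(("John", "hasUncle", "?U"), count(1))
query = to_sparql(patterns, ["?U"])
print(query)
```

Renaming every rule before unification is what keeps the intermediate variable (?P in the example) from colliding with a variable of the same name in the query or in another rule.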
I used to think that, for point 2, the capabilities of Euler or CHR to handle unbound variables could be leveraged. But I changed my mind. Since I already have experience with SWRL, it is probably quite easy to first convert the N3 to SWRL. The good thing about SWRL is that it is plain RDF, and as a consequence there is no possible confusion between input rules and transformation rules (this is actually a problem when processing N3 with N3). It is also easier to match rule parts when the rule is expressed as a pure RDF graph (SWRL or other). So SWRL is just a fairly simple RDF rule format, reasonably well documented (but I didn't have to read the documentation to write the SWRL ==> N3 converter rules). N3 is suitable for writing rules by hand, for proofreading by developers, and for direct interpretation by EulerGUI.
Then the third idea, which looks even better, is to use the reified format generated by CWM.
Following Zeno's suggestion, I think reified N3 is better than SWRL as an intermediary format. First, the conversion is already done by CWM. Second, reified N3 is a direct representation of the original N3 rules.
As an example, here is a simple rule :
% cat examples/BloodPressure.n3
@prefix log: <http://www.w3.org/2000/10/swap/log#> .
@prefix math: <http://www.w3.org/2000/10/swap/math#>.
@prefix : <http://eulergui.sourceforge.net/examples#> .
{ :BloodPressure :val ?x. ?x math:greaterThan 70 } => { :Service112 :alert "true" } .
:BloodPressure :val 72 .
This is how CWM reduces it to plain triples (no more quoted graphs {}) :
cwm examples/BloodPressure.n3 --reify
...
@forSome Blo:_g0 .
[ a log:Truth;
  :existentials [ owl:oneOf () ];
  :statements [ owl:oneOf (
    [ :object [
        :existentials [ owl:oneOf () ];
        :statements [ owl:oneOf (
          [ :object [ :value "true" ];
            :predicate [ :uri "http://eulergui.sourceforge.net/examples#alert" ];
            :subject [ :uri "http://eulergui.sourceforge.net/examples#Service112" ] ] ) ];
        :universals [ owl:oneOf () ] ];
      :predicate [ :uri "http://www.w3.org/2000/10/swap/log#implies" ];
      :subject [
        :existentials [ owl:oneOf () ];
        :statements [ owl:oneOf (
          [ :object Blo:_g0;
            :predicate [ :uri "http://eulergui.sourceforge.net/examples#val" ];
            :subject [ :uri "http://eulergui.sourceforge.net/examples#BloodPressure" ] ]
          [ :object [ :value 70 ];
            :predicate [ :uri "http://www.w3.org/2000/10/swap/math#greaterThan" ];
            :subject Blo:_g0 ] ) ];
        :universals [ owl:oneOf () ] ] ]
    [ :object [ :value 72 ];
      :predicate [ :uri "http://eulergui.sourceforge.net/examples#val" ];
      :subject [ :uri "http://eulergui.sourceforge.net/examples#BloodPressure" ] ] ) ];
  :universals [ owl:oneOf ( Blo:_g0 ) ] ].
Blo:_g0 a :BlankNode .
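To give an idea of why the reified form is convenient to process, here is a small Python sketch that turns the two reified statements of the rule antecedent into SPARQL graph pattern lines. Several things in it are assumptions made for illustration only: the statements are transcribed by hand into dictionaries instead of being parsed from the CWM output, the function names are hypothetical, and the mapping of math:greaterThan to a SPARQL FILTER is one possible choice, not something the existing tools are known to do.

```python
import json

MATH_GT = "http://www.w3.org/2000/10/swap/math#greaterThan"

# Hand transcription of the antecedent statements from the reified output.
# A plain string stands for a quantified variable such as Blo:_g0.
ANTECEDENT = [
    {"subject": {"uri": "http://eulergui.sourceforge.net/examples#BloodPressure"},
     "predicate": {"uri": "http://eulergui.sourceforge.net/examples#val"},
     "object": "Blo:_g0"},
    {"subject": "Blo:_g0",
     "predicate": {"uri": MATH_GT},
     "object": {"value": 70}},
]
UNIVERSALS = ["Blo:_g0"]


def term(node, var_names):
    """Serialize one reified node as a SPARQL term."""
    if isinstance(node, str):        # quantified variable
        return var_names[node]
    if "uri" in node:                # [ :uri "..." ]
        return f"<{node['uri']}>"
    return json.dumps(node["value"])  # [ :value ... ]: 70 stays 70, text gets quotes


def to_patterns(statements, universals):
    """Map reified statements to triple patterns, and the comparison built-in to a FILTER."""
    var_names = {v: f"?v{i}" for i, v in enumerate(universals)}
    lines = []
    for st in statements:
        s = term(st["subject"], var_names)
        o = term(st["object"], var_names)
        if st["predicate"].get("uri") == MATH_GT:
            lines.append(f"FILTER ({s} > {o})")  # assumption: built-in becomes a filter
        else:
            lines.append(f"{s} {term(st['predicate'], var_names)} {o} .")
    return lines


for line in to_patterns(ANTECEDENT, UNIVERSALS):
    print(line)
```

Because every part of the rule is an ordinary node with a :uri, a :value or a variable identifier, the matching code only needs dictionary lookups, with no special handling of quoted graphs.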
So the roadmap for this SPARQL generation feature could be :
already done by CWM, but we need an EulerGUI pipeline with CWM as the first step, and probably Euler as the second step (not yet implemented)