DeXIN (Distributed extended XQuery for data INtegration)

DeXIN is a loosely coupled data integration web service. It integrates multiple heterogeneous, highly distributed, and rapidly changing web data sources in different formats (e.g. XML, RDF, and relational data), including data services (DaaS, data as a service). At the heart of DeXIN is an XQuery extension that allows users and applications to execute a single query against distributed, heterogeneous web data sources or data services.

DeXIN Architecture

 

XQuery extension to SPARQL

DeXIN is an extensible framework based on a multi-lingual, multi-database architecture that handles various data formats and query languages. It uses one distinguished data format as the "aggregation model", together with an appropriate query language for data in this format. So far, we use XML as the aggregation model and XQuery as the corresponding query language. This aggregation model can then be extended to other data formats (such as RDF/OWL) with their own query languages (such as SPARQL). To execute SPARQL queries inside XQuery, it suffices to introduce a new function called SPARQLQuery(). This function can be used anywhere in XQuery where a reference to an XML document may occur. The approach is very similar to the way SQL was extended with the XMLQuery function in order to execute XQuery inside SQL. The new function SPARQLQuery() is defined as follows:

XMLDOC SPARQLQuery(String sparqlQuery, URI sourceURI)

The value returned by a call to this function is of type XMLDOC. The function SPARQLQuery() has two parameters. The first parameter is of type String and contains the SPARQL query to be executed. The second parameter is of type URI and contains either the URI or just the name of the data source on which the SPARQL query is to be executed. The name of the data source refers to an entry in the database of known data sources maintained by the Metadata Manager. If the indicated data source is reachable and the SPARQL query executes successfully, the result is wrapped into XML according to the W3C Proposed Recommendation. An example query in extended XQuery format is shown below:

  

for $a in doc("Peer1/License.xml")/agreement,
    $b in SPARQLQuery("SELECT ?title ?ExecutionTime
                       WHERE { ?x <http://www.w3c.org/2001/sub#title> ?title .
                               ?x <http://www.w3c.org/2001/sub#time> ?ExecutionTime }",
                      "Peer2/Qos.rdf")/result
where $a/Servicetitle = $b/title and $a/peruse > 1
return
    <Result>
      ...
    </Result>

An example extended XQuery for DeXIN
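
The XML wrapping of SPARQL results mentioned above can be illustrated with a minimal Python sketch. It assumes that the recommendation referred to is the W3C SPARQL Query Results XML Format, and it handles only plain literal bindings. The function name wrap_bindings and the sample values are hypothetical; this is not DeXIN's actual code, and the element paths that DeXIN exposes to XQuery (such as /result in the example) may differ from the layout produced here.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
SPARQL_RESULTS_NS = "http://www.w3.org/2005/sparql-results#"


def wrap_bindings(variables, rows):
    """Wrap SELECT results into a SPARQL results XML document (as a string).

    variables: list of variable names, without the leading '?'.
    rows: list of dicts mapping variable name -> literal value (str).
    Assumption: every bound value is a plain literal (no URIs, no blank nodes).
    """
    ET.register_namespace("", SPARQL_RESULTS_NS)

    def q(tag):
        # Build a namespace-qualified tag name for ElementTree.
        return f"{{{SPARQL_RESULTS_NS}}}{tag}"

    root = ET.Element(q("sparql"))
    head = ET.SubElement(root, q("head"))
    for name in variables:
        ET.SubElement(head, q("variable"), {"name": name})

    results = ET.SubElement(root, q("results"))
    for row in rows:
        result = ET.SubElement(results, q("result"))
        for name in variables:
            if name in row:  # unbound variables are simply omitted
                binding = ET.SubElement(result, q("binding"), {"name": name})
                ET.SubElement(binding, q("literal")).text = row[name]
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data shaped like the result of the example query.
xml_text = wrap_bindings(
    ["title", "ExecutionTime"],
    [{"title": "WeatherService", "ExecutionTime": "12"}],
)
```

Each solution becomes one result element with one binding per bound variable, so the document can be consumed by an ordinary XQuery engine like any other XML source.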

The query tree returned by the Parser has to be traversed in order to find all calls of the SPARQLQuery() function. Suppose that there are n such calls. For each call, the Query Decomposer retrieves the SPARQL query q_i and the data source d_i on which q_i is to be executed. The result of this process is a list {(q_1, d_1), ..., (q_n, d_n)} of pairs, each consisting of a query and a source. The Executor then poses each query q_i against the data source d_i. The order of execution and any parallelization have to take the dependencies between these queries into account. If the execution of a query q_i succeeds, its result is transferred to the site where DeXIN is located and converted into XML format. The resulting XML document r_i is then stored temporarily. Moreover, in the query tree received from the Parser, the call of the SPARQLQuery() function with query q_i and data source d_i is replaced by a reference to the XML document r_i. The resulting query tree is a tree of pure XQuery without any extensions. It can thus be executed locally by the XQuery engine used by DeXIN.
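
The decomposition and rewriting steps described above can be sketched in Python under simplifying assumptions. The sketch works on the query text with a regular expression instead of traversing a parser-produced query tree, it assumes both arguments are written as straight-quoted strings that contain no double quotes, and it ignores dependencies between queries. The function names decompose and rewrite and the temporary paths tmp/r1.xml, ... are hypothetical and not part of DeXIN.

```python
import re

# Matches SPARQLQuery("<query>", "<source>").
# Simplification: a regex stands in for walking the parser's query tree.
CALL = re.compile(r'SPARQLQuery\(\s*"([^"]*)"\s*,\s*"([^"]*)"\s*\)')


def decompose(xquery):
    """Return the (query, source) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in CALL.finditer(xquery)]


def rewrite(xquery, result_paths):
    """Replace the i-th SPARQLQuery() call with doc("<i-th result path>")."""
    paths = iter(result_paths)
    return CALL.sub(lambda m: f'doc("{next(paths)}")', xquery)


# A shortened variant of the example query (single SPARQL variable).
query = '''for $a in doc("Peer1/License.xml")/agreement,
    $b in SPARQLQuery("SELECT ?title WHERE { ?x <http://www.w3c.org/2001/sub#title> ?title }", "Peer2/Qos.rdf")/result
where $a/Servicetitle = $b/title
return <Result/>'''

pairs = decompose(query)
# An executor would run each query against its source and store the
# XML result at a temporary path; here only the paths are made up.
temp_paths = [f"tmp/r{i}.xml" for i, _ in enumerate(pairs, start=1)]
pure_xquery = rewrite(query, temp_paths)
```

After rewriting, the query contains only doc() references, which is what allows a standard XQuery engine to evaluate it without knowing anything about SPARQL.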

Implementation: 

DeXIN Demo

Source Code:

 

Publications: