Extending a Datalog engine with Dynamic Filtering

Type:

Bachelor

Student name:

none

Datalog is a simple yet universal knowledge representation formalism. Its main characteristic is that it is a rule-based formalism, where one can express simple if-then rules to describe derivations of knowledge. Further, datalog underlies many industrial data management systems, i.e. it is the core (and extends) the well-known relational datamodel and SQL as the respective industrial query language.
Models in datalog are expressed as so-called programs, i.e. a set of simple rules. These rules capture schematic knowledge about the models. An example is the following rules describing reachability in a graph.
(1) reachable(?x,?y) <= edge(?x, ?y)
(2) reachable(?x,?y) <= reachable (?x, ?z) and edge(?z, ?y)

Further, a datalog program contains so-called facts, which is assertional knowledge or a database respectively.
In our graph example, this could be the edges of a specific graph:
edge(bregenz, innsbruck)
edge(innsbruck, salzburg)
edge(innsbruck, brenner)
edge(innsbruck, muenchen)
edge(salzburg, linz)
edge(linz, steyr)
edge(linz, wien)
edge(muenchen, wien)
edge(wien, muenchen)
edge(muenchen, augsburg)
edge(augsburg, ulm)
located-in(muenchen, germany)
located-in(augsburg, germany)
located-in(ulm, germany)
located-in(innsbruck, austria)
...
The fundamental task to be solved by a datalog engine is query answering, which is database-style queries. An example is to ask what cities in Germany can be reached from Innsbruck in our specific model (i.e. the graph described above):
?- reachable(innsbruck, ?destination) and located-in(?destination, germany)
which would result in a range of answers (bindings for the variable ?destiantion in the query): muenchen, augsburg, ulm ...
There are various algorithms how to compute the result for a given query. All these algorithms have their own strengths and weaknesses. One of these algorithms is called Dynamic Filtering [1,2] which is elegant and provides an a basis for a highly-concurrent or even distributed implementation.
We are currently developing a datalog engine called IRIS [3] which implements a couple of standard evaluation strategies on top of common datastructures. The aimof this project is to implement dynamic filtering within the IRIS system and to evaluate the method wrt. to other algorithms.

Implementation will be done in Java.

The project provides some insight in some interesting topic in the intersection of knowledge representation and databases, allows you to learn about a specific elegant evaluation strategy for queries over datalog programs and to learn how to integrate it in an existing system. The focus of the project is design and implementation of a component providing a specific functionality, it is not theory work. Finally, one might want to think about optimizations of the implementation.
References
[1] A Framework for an Efficient Implementation of Deductive Databases
[2] SYGRAF: Implementing Logic Programs in a Database Style
[3] IRIS Reasoner:
Website