Optimizing the Performance of Concurrent RDF Stream Processing Queries
Refereed Conference Meeting Proceeding
With the growing popularity of the Internet of Things (IoT) and sensing technologies, a large number of data streams are being generated at a very rapid pace. To explore the potential of integrating IoT and semantic technologies, a few RDF Stream Processing (RSP) query engines have emerged, which are capable of processing, analyzing and reasoning over semantic data streams in real time. RSP mitigates data interoperability issues and promotes knowledge discovery and smart decision making for time-sensitive applications. However, a major hurdle to the wide adoption of RSP systems is their query performance. In particular, the ability of RSP engines to handle a large number of concurrent queries is very limited, which prevents large-scale stream processing applications (e.g. smart city applications) from adopting RSP. In this paper, we propose a shared-join-based approach to improve the performance of an RSP engine for concurrent queries. We also leverage query federation mechanisms to allow distributed query processing over multiple RSP engine instances. We apply load balancing strategies to distribute queries and further optimize the concurrent query performance. We provide a proof-of-concept implementation by extending the CQELS RSP engine and evaluate our approach using existing benchmark datasets for RSP. We also compare the performance of our proposed approach with the state-of-the-art implementation of the CQELS RSP engine.
Extended Semantic Web Conference 2017
Digital Object Identifier (DOI):
National University of Ireland, Galway (NUIG)
Open access repository: