
How Representative is a SPARQL Benchmark? An Analysis of RDF Triplestore Benchmarks


Muhammad Saleem, Gábor Szárnyas, Felix Conrads, Syed Ahmad Chan Bukhari, Qaiser Mehmood, Axel-Cyrille Ngonga Ngomo

Publication Type: 
Refereed Conference Meeting Proceeding
Triplestores are data management systems for storing and querying RDF data. Over recent years, various benchmarks have been proposed to assess the performance of triplestores across different performance measures. However, choosing the most suitable benchmark for evaluating triplestores in practical settings is not a trivial task. This is because triplestores experience varying workloads when deployed in real applications. We address the problem of determining an appropriate benchmark for a given real-life workload by providing a fine-grained comparative analysis of existing triplestore benchmarks. In particular, we analyze the data and queries provided with the existing triplestore benchmarks in addition to several real-world datasets. Furthermore, we measure the correlation between the query execution time and various SPARQL query features and rank those features based on their significance levels. Our experiments reveal several interesting insights about the design of such benchmarks. With this fine-grained evaluation, we aim to support the design and implementation of more diverse benchmarks. Application developers can use our results to analyze their data and queries and choose a data management system.
Conference Name: 
The Web Conference (WWW)
Proceedings of ACM Conference (TheWebConf19)
Digital Object Identifier (DOI):
Publication Date: 
Conference Location: 
United States of America
Research Group: 
National University of Ireland, Galway (NUIG)
Open access repository: 
Publication document: