Automating Data Mart Construction from Semi-structured Data Sources

Authors: 

Michael Scriney, Suzanne McCarthy, Andrew McCarren, Paolo Cappellari, Mark Roantree

Publication Type: 
Refereed Original Article
Abstract: 
The global food and agricultural industry had a total market value of USD 8 trillion in 2016, and decision makers in the Agri sector require appropriate tools and up-to-date information to make predictions across a range of products and areas. Traditionally, these requirements are met with information processed into a data warehouse and data marts constructed for analyses. Increasingly, however, data are coming from outside the enterprise and often in unprocessed forms. As these sources are outside the control of companies, they are prone to change, and new sources may appear. In these cases, the process of accommodating these sources can be costly and very time consuming. Automating this process requires a sufficiently robust extract–transform–load process, a mapping of external sources to some form of ontology, and an integration process to merge the specific data sources. In this paper, we present an approach to automating the integration of data sources in an Agri environment, where new sources are examined before an attempt is made to merge them with existing data marts. Our validation uses a case study of real-world Agri data to demonstrate the robustness of our approach and the efficiency of materializing data marts.
Digital Object Identifier (DOI): 
10.1093/comjnl/bxy064
Publication Status: 
Published
Publication Date: 
15/06/2018
Journal: 
The Computer Journal
Research Group: 
Institution: 
Dublin City University (DCU)
Open access repository: 
No