Dimensionality reduction in data mining: A Copula approach

Authors: 

R Houari, A Bounceur, Tahar Kechadi, K Tari, Euler

Publication Type: 
Refereed Original Article
Abstract: 
The recent trend of collecting huge and diverse datasets has created a great challenge in data analysis. One characteristic of these gigantic datasets is that they often contain significant amounts of redundancy. Using very large multi-dimensional data results in more noise, redundant data, and the possibility of unconnected data entities. To efficiently manipulate data represented in a high-dimensional space and to address the impact of redundant dimensions on the final results, we propose a new technique for dimensionality reduction using Copulas and the LU-decomposition (Forward Substitution) method. The proposed method compares favorably with existing approaches on real-world datasets (Diabetes, Waveform, two versions of Human Activity Recognition based on Smartphone, and Thyroid, taken from a machine learning repository) in terms of dimensionality reduction and efficiency, as evaluated with statistical and classification measures.
Digital Object Identifier (DOI): 
10.1016/j.eswa.2016.07.041
Publication Status: 
Published
Publication Date: 
01/01/2016
Journal: 
Expert Systems with Applications
Institution: 
National University of Ireland, Dublin (UCD)
Open access repository: 
No