Exploratory data analysis in the context of data mining and resampling

Authors:
Ho Yu, Chong
Resource type:
Publication date:
2010
Institution:
Universidad de San Buenaventura
Repository:
Repositorio USB
Language:
spa
OAI Identifier:
oai:bibliotecadigital.usb.edu.co:10819/6448
Online access:
http://hdl.handle.net/10819/6448
Keywords:
Exploratory data analysis
Data mining
Resampling
Cross-validation
Data visualization
Clustering
Classification trees
Neural networks
Análisis exploratorio de datos
Minería de datos
Remuestreo
Validación cruzada
Visualización de datos
Agrupación
Árboles de clasificación
Redes neuronales
Analysis of data
Statistics
Análisis de datos
Estadística
Rights
License
Attribution-NonCommercial-NoDerivs 2.5 Colombia
Description
Summary: Several misconceptions about exploratory data analysis (EDA) are widespread today. One is that EDA is opposed to statistical modeling. In fact, the essence of EDA is not to set aside all modeling and preconceptions; rather, researchers are urged not to begin the analysis with only a strong preconception, so modeling remains legitimate in EDA. In addition, the nature of EDA has been changing with the emergence of new methods and its convergence with other methodologies, such as data mining and resampling. Conventional conceptual frameworks of EDA may therefore no longer be able to accommodate this trend. This article introduces EDA in the context of data mining and resampling, with an emphasis on three goals: cluster detection, variable selection, and pattern recognition. TwoStep clustering, classification trees, and neural networks, which are powerful techniques for accomplishing these three goals, respectively, are illustrated with concrete examples.