Estela Narvaez

Advisor: Prof. Fabrizio Angiulli

Topic: Data mining techniques for large and complex data

Abstract: The huge amount of data being collected and stored in repositories and sites across the globe calls for efficient and effective data analysis techniques. Data mining consists of a set of techniques that can be used to extract relevant and interesting knowledge from data. The most relevant data mining tasks are association rule mining, clustering, classification and regression, and outlier detection. Classification techniques are supervised learning techniques that assign data items to one of a set of predefined class labels. Classification is one of the most useful approaches for building prediction models from an input data set. One direction of research in this field concerns enhancing classifier accuracy and/or reducing the complexity of the model. Within this scenario, I am investigating techniques to improve generalization and to prevent the induction of overly complex models in the context of nearest neighbor condensing classification techniques. To deal with more complex forms of data, data mining techniques that work on data represented as graphs are extremely useful. My interests are therefore also directed towards the design of analysis tools for network-structured data, for example for the discovery of anomalies within a network.