“Data Analysis on Parallel, Distributed and Cloud Computing Systems” – Prof. Domenico Talia


The analysis of the massive, distributed data repositories available today requires the combined use of smart data analysis techniques and scalable architectures to find and extract useful information from them. Parallel systems, Grids and Cloud computing platforms offer effective support for addressing both the computational and data storage needs of Big Data mining and parallel analytics applications. In fact, complex data mining tasks involve data- and compute-intensive algorithms that require large storage facilities together with high-performance processors to produce results in a reasonable time. In this short course we introduce the most relevant topics and the main research issues in high-performance data mining, including parallel data mining strategies, distributed analysis techniques, knowledge Grids and Cloud data mining. We also present some data mining frameworks designed for developing distributed data analytics applications as workflows of services on Grids and Clouds. In these environments, data sets, analysis tools, data mining algorithms and knowledge models are implemented as single services that are combined through a visual programming interface into distributed workflows. Some applications will also be discussed.

Syllabus. Parallel data mining techniques, distributed data mining, Grid-based knowledge discovery, Cloud-based data analytics workflows.


D. Talia, P. Trunfio, F. Marozzo, Data analysis in the cloud, Elsevier, 2015.
D. Talia, P. Trunfio, Service-oriented distributed knowledge discovery, Chapman and Hall/CRC, 2012.
D. Talia, P. Trunfio, How Distributed Data Mining Tasks can Thrive as Knowledge Services. Communications of the ACM, vol. 53, n. 7, pp. 132-137, July 2010.
F. Marozzo, D. Talia, P. Trunfio, Scalable Script-based Data Analysis Workflows on Clouds. Proc. of the 8th Workshop on Workflows in Support of Large-Scale Science (WORKS 2013), Denver, CO, USA, pp. 124-133, ACM Press, November 2013.
F. Marozzo, D. Talia, P. Trunfio, Using Clouds for Scalable Knowledge Discovery Applications. Euro-Par Workshops, Rhodes Island, Greece, Lecture Notes in Computer Science, vol. 7640, pp. 220-227, August 2012.
D. Talia, Parallelism in Knowledge Discovery Techniques. Sixth Int. Conference on Applied Parallel Computing, Helsinki, LNCS, vol. 2367, pp. 127-136, June 2002.

Course Dates:

Monday, December 14, 10:00-13:00 – DIMES Seminar Room
(Cubo 42C, 5th floor)
Tuesday, December 15, 10:00-13:00 – DIMES Seminar Room
(Cubo 42C, 5th floor)
Wednesday, December 16, 10:00-13:00 – DIMES Seminar Room
(Cubo 42C, 5th floor)
Thursday, December 17, 10:00-13:00 – DIMES Seminar Room
(Cubo 42C, 5th floor)


Domenico Talia’s Bio:
Domenico Talia is a full professor of computer engineering and Chair of the ICT Center at the University of Calabria. He is a partner of two startups: Exeura and DtoK Lab. His research interests include scalable data analysis, cloud computing, parallel and distributed data mining algorithms, service computing, distributed knowledge discovery, peer-to-peer systems, mobile computing, and social data analytics.

Talia has published eight books and more than 300 papers in conference proceedings and archival journals such as CACM, Computer, IEEE TKDE, IEEE TSE, IEEE TSMC-B, IEEE Micro, ACM Computing Surveys, FGCS, Parallel Computing and IEEE Internet Computing. He is a member of the editorial boards of IEEE Transactions on Cloud Computing, the Future Generation Computer Systems journal, the Scalable Computing: Practice and Experience journal, MultiAgent and Grid Systems: An International Journal, International Journal of Web and Grid Services, the International Journal of Biomedical Data Mining and the Web Intelligence and Agent Systems International journal. He has also been actively involved in a number of EC co-funded projects and has served as program chair or program committee member of several conferences.