PhD Course “Scalable Techniques for Big Data Analysis”, Profs. Domenico Talia and Fabrizio Marozzo, starting 8 January 2018

Instructors

Domenico Talia, Fabrizio Marozzo

Course outline (summary)

The analysis of the massive and distributed data repositories available today requires the combined use of smart data analysis techniques and scalable architectures to find and extract useful information from them. HPC and Cloud computing platforms offer effective support for addressing both the computational and the data storage needs of Big Data mining and parallel analytics applications. In fact, complex data mining tasks involve data- and compute-intensive algorithms that require large storage facilities together with high-performance processors to produce results in a reasonable time. In this short course we introduce the most relevant topics and the main research issues in high-performance data mining, including parallel data mining strategies, distributed analysis techniques, knowledge Grids, and Cloud data mining. We also present some data mining frameworks designed for developing distributed data analytics applications as workflows of services on HPC systems and Clouds. In these environments, data sets, analysis tools, data mining algorithms, and knowledge models are implemented as single services that are combined into distributed workflows through a visual programming interface. Some applications will also be discussed.

Syllabus. Parallel data mining techniques, Big Data analysis, distributed data mining, Cloud-based data analytics workflows.

References

  • D. Talia, P. Trunfio, F. Marozzo, Data analysis in the cloud, Elsevier, 2015.
  • D. Talia, P. Trunfio, Service-oriented distributed knowledge discovery, Chapman and Hall/CRC, 2012.
  • D. Talia, P. Trunfio, How Distributed Data Mining Tasks can Thrive as Knowledge Services. Communications of the ACM, vol. 53, n. 7, pp. 132-137, July 2010.
  • F. Marozzo, D. Talia, P. Trunfio, Scalable Script-based Data Analysis Workflows on Clouds. Proc. of the 8th Workshop on Workflows in Support of Large-Scale Science (WORKS 2013), Denver, CO, USA, pp. 124-133, ACM Press, November 2013.
  • F. Marozzo, D. Talia, P. Trunfio, Using Clouds for Scalable Knowledge Discovery Applications. Euro-Par Workshops, Rhodes Island, Greece, Lecture Notes in Computer Science, vol. 7640, pp. 220-227, August 2012.
  • D. Talia, Parallelism in Knowledge Discovery Techniques. Sixth Int. Conference on Applied Parallel Computing, Helsinki, LNCS, vol. 2367, pp. 127-136, June 2002.

Schedule

– Monday 8 Jan, 10:00-13:00 – Aula Seminari DIMES
– Tuesday 9 Jan, 10:00-13:00 – Aula Seminari DIMES
– Wednesday 10 Jan, 10:00-13:00 – Aula Seminari DIMES
– Thursday 11 Jan, 10:00-13:00 – Aula Seminari DIMES