Course overview
The Web and Internet commerce generate extremely large datasets from which important information can be extracted by data mining. This course covers practical algorithms for solving key problems in mining massive datasets. It focuses on parallel algorithmic techniques used for large datasets in cloud computing. Stream processing algorithms for data that arrives continuously, page ranking algorithms for web search, and online advertisement systems are also studied in detail.
Course learning outcomes
- To develop knowledge of algorithms and methodologies for massive datasets in the context of data mining.
- To gain experience in matching algorithms to particular classes of problems.
- To gain experience in applying and developing algorithms as part of software development for mining big data.
- To read and understand scientific research papers in the area of big data, evaluate them critically, and present them in a seminar talk.