Data mining algorithms and computational paradigms allow computers to find patterns and regularities in databases, perform prediction and forecasting, and generally improve their performance through interaction with data. Data mining is currently regarded as the key element of a more general process called knowledge discovery, which deals with extracting useful knowledge from raw data. The knowledge discovery process includes data selection, cleaning, coding, the application of different statistical and machine learning techniques, and visualization of the generated structures. Special emphasis will be given to machine learning methods, as they provide the real knowledge discovery tools. Important related technologies such as data warehousing and on-line analytical processing (OLAP) will also be discussed.

This course also gives an overview of Big Data Analytics, including its applications, market trends, fundamental platforms such as Hadoop and Spark, and other tools such as Linked Big Data. It introduces several data storage methods and shows how to upload, distribute, and process the stored data. These methods include HDFS, HBase, key-value (KV) stores, document databases, and graph databases.
This course intends to enable students to
On completion of this course, students should be able to
Textbooks
References
Evaluation | Marks | Percentage |
---|---|---|
Class Participation | 10 Marks | 10% |
Tutorial/Project/Assignments/Discussion/Presentation | 30 Marks | 30% |
Final Examination | 60 Marks | 60% |