Course overview
This course gives students the skills to manage real-world data and turn it into actionable insights. Students will learn to clean and preprocess diverse datasets, use exploratory data analysis to validate data manipulations, and apply statistical machine learning techniques to develop predictive models. The emphasis is on practical, hands-on experience with Python coding, data manipulation, and model evaluation. The course also develops students' ability to communicate technical findings through visualisations, written reports, and presentations tailored to both technical and non-technical audiences. By the end of the course, students will be prepared to tackle complex data challenges and contribute to data-driven decision-making in a professional setting.
Course learning outcomes
- Apply Python programming skills to manage, manipulate, and preprocess data from various sources, including tasks such as data cleaning, transformation, and integration
- Utilise exploratory data analysis (EDA) techniques and visualisation tools to uncover patterns, outliers, and relationships within datasets, in order to guide further model development
- Develop and evaluate regression and classification models such as linear regression, logistic regression, decision trees, and k-Nearest Neighbours (kNN), applying cross-validation and other evaluation techniques to measure model performance
- Apply multivariate analysis techniques, including Principal Component Analysis (PCA) and clustering methods, to reduce dimensionality, identify patterns, and group similar data points
- Design and implement a comprehensive data analysis workflow, including EDA, model building, and validation, to solve real-world problems and effectively communicate results through Python Notebooks and visualisations