Real Data: Modern Methods for Finding Hidden Patterns

Postgraduate | 2026

Area/Catalogue
INFO 5046
Course ID
203978
Level of study
Postgraduate
Unit value
6
Course level
5
Inbound study abroad and exchange
No
University-wide elective course
No
Single course enrolment
No
Note:
Course data is interim and subject to change

Course overview

This course builds on DATA 7201OL Data Taming and introduces advanced modern techniques for extracting meaningful information from messy, real-world datasets. It covers generalised linear models, classification, advanced regression techniques, and unsupervised statistical learning. A particular focus is data wrangling for non-standard, large, messy data, such as text for natural language processing, networks, and longitudinal data. The course also teaches advanced R programming techniques for data science.

Course learning outcomes

  • Create a predictive model for classification (predicting classes) from real data using the tidymodels package in R
  • Create a predictive model for regression (predicting numbers) from real data using the tidymodels package in R
  • Identify when a predictive model gives inaccurate predictions because of overfitting
  • Apply cross-validation to detect and guard against overfitting
  • Compare the performance of predictive models to assess their viability
  • Analyse unlabelled data with unsupervised methods to find patterns and represent them visually
  • Communicate the results and interpretation of predictive modelling analyses

Prerequisite(s)

N/A

Corequisite(s)

N/A

Antirequisite(s)

N/A