Course overview
A key skill for an applied statistician is the ability to formulate appropriate statistical models and apply them to data to answer the questions of interest. Most often, such models relate a response variable to one or more explanatory variables. For example, in a medical experiment we may seek to evaluate a new treatment by relating patient outcome to treatment received while allowing for background variables such as age, sex and disease severity. This course gives a rigorous treatment of the linear model and develops various extensions. There is a strong practical emphasis, and the statistical package R is used extensively.
Topics covered are:
- the linear model, least squares estimation, generalised least squares estimation, properties of estimators, the Gauss-Markov theorem;
- geometry of least squares, subspace formulation of linear models, orthogonal projections;
- regression models, factorial experiments, analysis of covariance and model formulae;
- regression diagnostics, residuals, influence diagnostics, transformations, Box-Cox models, model selection and model building strategies;
- logistic regression models;
- Poisson regression models.
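As a rough illustration of the first topics, the medical example above could be written as a linear model and fitted by least squares. The notation (y_i, the beta coefficients, epsilon_i and the design matrix X) is assumed here for illustration and is not taken from the course materials.

```latex
% Linear model for patient i = 1, ..., n (illustrative notation):
% y_i is the patient outcome, epsilon_i is a random error term.
y_i = \beta_0 + \beta_1\,\mathrm{treatment}_i + \beta_2\,\mathrm{age}_i
      + \beta_3\,\mathrm{sex}_i + \beta_4\,\mathrm{severity}_i + \varepsilon_i

% Matrix form and the ordinary least squares estimator,
% valid when X has full column rank:
\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\hat{\boldsymbol{\beta}} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}
```

In this sketch, the coefficient on treatment describes the treatment effect after allowing for age, sex and disease severity.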
Course learning outcomes
- Explain the mathematical basis of the general linear model and its extensions to multilevel models and logistic regression.
- Use the open-source programming language R to analyse data arising from both observational studies and designed experiments.
- Explain the role of statistical modelling in discovering information, making predictions and supporting decisions in a range of applications including medicine, engineering, science and social science.