Course Introduction
Advanced Statistics and Data Analysis
Class organization
- 2 hrs of lecture in the morning (9:30 or 11:30)
- 4 hrs of labs in the afternoon (14:30)
Office hours to be agreed over email.
My research interests
- Statistical methodology: modeling of high-dimensional data, parametric and non-parametric regression, clustering, factor analysis and dimensionality reduction, multiple hypothesis testing, combining data from multiple sources (meta-analysis).
- Omics data analysis: design and analysis of high-throughput gene expression experiments, analysis of single-cell and spatial transcriptomics.
- Statistical software development: author of >10 Bioconductor packages, member of the Bioconductor Technical Advisory Board.
Program outline
Statistical topics
- (Everything is a) linear model
- Experimental design
- The generalized linear model
- Likelihood estimation and inference
- Nonparametric methods: permutations and the bootstrap
- Multivariate analysis: PCA and more
- Advanced statistical models: random effects, Bayesian models, discrete states, …
R topics
- How to make engaging and informative plots
- How to effectively “wrangle” data
- How to make your analysis reproducible
- Statistical modelling in practice
Suggested materials
There is no textbook.
Lecture slides and the companion website are the main resources.
In specific lectures I will give you additional materials, including suggested readings from books, journal articles, and more.
If you are looking for a comprehensive book that covers most of the topics taught in this course, check out Modern Statistics for Modern Biology.
Exam
- The exam will be in the computer lab, with R
- It will include both practical work (data exploration and analysis) and theoretical questions
- The structure of the exam will be similar to the lab sessions