course schedule

schedule, with links to slides and homework
Modified: March 8, 2026

The source code for these lectures is available at the GitHub repository. Please also see the technical_notes for software and other troubleshooting tips.

Winter 2026

Week 1: Exploratory Data Analysis

Overview of the goals of the course: description, visualization, exploration, pattern discovery, and summarization. Introduction to different frameworks and goals, and their relationship to preregistration and hypothesis testing. Types of data: tidy data, images, geospatial, words, time series.

Week 2: Visualization

Grammar of graphics. Overview of plot types for univariate and multivariate summarization, color palettes, and transformations. Output formats: bitmap, vector, and web-based interactive.

Week 3: Summarizing, smoothing, and outliers

Split-apply-combine options. Types and goals of smoothers. Methods for outlier identification.

Weeks 4-5: Dimension reduction

What low-dimensional representations do and what they don’t. Overview of similarity- and distance-based methods, with examples including principal component analysis and t-SNE.

Week 6: Working with words

Bag of words, preprocessing, embeddings, latent Dirichlet allocation, other applications of dimension reduction. Finding n-grams, sentiment analysis.

Week 7: Case study

Groundwater monitoring.

Week 8: Spatial data

Spatial smoothing and prediction.

Week 9: Working with images

Formats; layers; types of image data. Normalization and pre-processing. Applications of dimension reduction.

Week 10: More with images

More with images; case study recap; student presentations.