Graphics and visualization

Author

Peter Ralph

Published

January 12, 2026

Visualization

Goals

  • pattern discovery

  • efficient summary of information

  • visual/spatial analogy for quantitative patterns

Aim to maximize information and minimize ink.

paraphrased from Edward Tufte

Considerations

  • Is the visual analogy appropriate for the type of data?

counts? quantities? multivariate? relationships?

  • Are important comparisons clear?

between groups? differences? time trend?

  • Are units easily interpretable?

meters? dollars? percent? relative change? is it isometric?

Principles of effective display

  • Show the data

  • Encourage the eye to compare differences

  • Represent magnitudes honestly and accurately

  • Draw graphical elements clearly, minimizing clutter

  • Make displays easy to interpret

Above all else show the data.

Tufte 1983

honeybees

Think about what you want to communicate

cumulative COVID test numbers

Broman’s bad graphs 1

from Roeder K (1994), Statistical Science 9:222-278, Figure 4 via Karl Broman

Deconstructing the graphics

How is information conveyed in this chart?

A plot with three pairs of points, showing among “all public”, “white public”, and “black public” the “percent who say that the deaths of blacks during encounters with police in recent years are signs of broader problem”

From a 2016 survey by the Pew Research Center, via flowingdata.

  • percentage values are mapped to vertical coordinate
  • columns are groups of people
  • color of points shows “public” versus “police”
  • a line connects the “public” and “police” points within each group
  • length of lines connecting points shows difference between public and police percentages
  • labeled ticks on the y-axis show what the values are
  • y-axis scale is chosen so vertical distance is proportional to percentage difference
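
The last point, that vertical distance is proportional to percentage difference, amounts to using a linear scale. Here is a minimal sketch in plain Python. The helper `linear_scale` and the 300-pixel plot height are hypothetical choices for illustration, and the inputs are round values, not the survey results.

```python
def linear_scale(value, domain=(0.0, 100.0), height_px=300.0):
    """Map a data value to a vertical offset (in pixels) from the plot bottom.

    Hypothetical helper for illustration: `domain` is the range of the
    y-axis (here 0 to 100 percent) and `height_px` is an assumed plot height.
    """
    lo, hi = domain
    return (value - lo) * height_px / (hi - lo)


# Round illustrative inputs (NOT the survey values).
gap_low = linear_scale(30) - linear_scale(10)   # a 20-point gap near the bottom
gap_high = linear_scale(90) - linear_scale(70)  # a 20-point gap near the top

# On a linear scale, equal percentage differences give equal line lengths.
print(gap_low, gap_high)
```

On a nonlinear axis (a log scale, say) the two gaps would be drawn with different lengths, so the connecting lines could no longer be compared directly across groups.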

Question: What are the pros and cons of this plot versus a table with six numbers?

How about this chart?

A map with counties colored to reflect the median household income, which ranges from about $20K to about $125K.

From the 5-year American Community Survey 2013, via flowingdata

Mosaicplots of gainful occupations,

from flowingdata again

A set of 50 mosaicplots showing what people are employed in, in each state

A plot of mean global temperature from 1880 to 2018

from NYT

A set of 50 histograms(?) showing age distribution by sex in each state.

From: flowingdata

Florence Nightingale’s coxcomb plot of mortality causes

Output formats

Options:

  • bitmap (e.g., png)
  • vector (e.g., svg, pdf)
  • html (includes the others)
  • interactive? (e.g., bokeh, plotly, shiny, d3)
  • dashboard?
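
To make the bitmap/vector distinction concrete, here is a minimal sketch using only the Python standard library. It describes the same diagonal line twice: once as an SVG element (a vector format, which stores the geometry) and once as a grid of pixels (which is what a bitmap format such as png stores, at a fixed resolution). The function names and the 8-pixel size are hypothetical choices for illustration.

```python
def line_as_svg(size):
    """Vector: store the geometry; the viewer redraws it at any zoom level."""
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {size} {size}">'
        f'<line x1="0" y1="{size}" x2="{size}" y2="0" stroke="black"/>'
        "</svg>"
    )


def line_as_bitmap(size):
    """Bitmap: store one value per pixel (1 = ink, 0 = background)."""
    return [
        [1 if col == size - 1 - row else 0 for col in range(size)]
        for row in range(size)
    ]


svg_text = line_as_svg(8)
pixels = line_as_bitmap(8)

# The SVG description does not grow when the image is displayed larger;
# the bitmap needs size * size values and looks blocky when enlarged.
print(len(svg_text), "characters of SVG")
print(sum(len(row) for row in pixels), "pixel values")
for row in pixels:
    print("".join("#" if v else "." for v in row))
```

This is why vector output tends to suit line plots destined for print or zooming, while plots with a very large number of points can be smaller and faster to render as bitmaps.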

How does interactivity work?

tl;dr: usually in the web browser, with JavaScript

Considerations

  • How much time do you want to spend making the plot?
  • Who will see the plot, and what is their background?
  • How much time will they spend looking at the plot?
  • How will the plot be distributed?
  • What do you want the plot to communicate?

For instance:

  • Quick viz plot for me: what’s this look like?
  • In-depth viz plots for mostly me: show me the data.
  • Punchy plot for a report: here is the main point.
  • Beautiful multi-layered plot for data nerds: the data are telling their own story.
  • Dashboard: I am the commander of the Starship Enterprise.

In this class

Generally, exploratory data analysis is quick: we try out many things, and narrow in on useful visualizations.

So: simple, easy plots. We’ll be using plotnine, an implementation of the Grammar of Graphics that encourages us to abstract the idea of visually representing data and makes it easy to develop plots incrementally.