DSBA 5122: Visual Analytics

Class 5: Distributions and Uncertainty

Ryan Wesslen

September 23, 2019

1 / 35

Why view Distributions? Cairo Ch. 7 & Wilke Ch. 7 - 9

2 / 35

ggplot(df, aes(x, y)) + geom_boxplot()
ggplot(df, aes(x, y)) + geom_violin()
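
The two calls above use placeholder names (`df`, `x`, `y`). A minimal runnable sketch, assuming the built-in `mtcars` data stands in for `df` (that choice is an assumption, not from the slides). The base-R step computes the quartiles that define each box; the plots are drawn only if ggplot2 is installed.

```r
# Assumption: the built-in mtcars data stands in for the slide's `df`.
cars <- transform(mtcars, cyl = factor(cyl))

# Quartiles per group: the lower edge, middle line, and upper edge of each box.
quartiles <- tapply(cars$mpg, cars$cyl, quantile, probs = c(0.25, 0.5, 0.75))
print(quartiles)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Same data, two encodings: summary statistics vs. estimated density shape.
  p_box    <- ggplot(cars, aes(cyl, mpg)) + geom_boxplot()
  p_violin <- ggplot(cars, aes(cyl, mpg)) + geom_violin()
  print(p_box)
  print(p_violin)
}
```

The boxplot shows only the summary statistics, while the violin shows the estimated density, so the two can look quite different for skewed or multimodal groups.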

13 / 35

Uncertainty: Cairo Ch. 10 & Wilke Ch. 16

xkcd

19 / 35

Bootstrapping: Within ggplot2

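# Assumes the tidyverse is loaded and organdata is available (e.g., from the socviz package);
# mean_cl_boot requires the Hmisc package to be installed.
organdata %>%
  ggplot(aes(x = country, y = donors)) +
  stat_summary(fun = mean, geom = "point", size = 3) +  # fun.y was renamed to fun in ggplot2 3.3.0
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.5) +  # bootstrapped limits
  coord_flip() +
  cowplot::theme_cowplot() +
  labs(x = " ",
       y = "Organ Donations in 000's",
       title = "Avg Organ Donations (000s) by Country")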

https://rstudio.cloud/spaces/22733/project/527500
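
`mean_cl_boot` wraps `Hmisc::smean.cl.boot`, which resamples the data with replacement, recomputes the mean for each resample, and reports quantiles of those means as confidence limits. A minimal base-R sketch of that idea follows. The helper `boot_mean_ci` and the simulated `donors` vector are hypothetical stand-ins, and this illustrates the percentile bootstrap in general rather than reproducing Hmisc's exact code.

```r
# Hypothetical helper: percentile bootstrap confidence interval for the mean.
boot_mean_ci <- function(x, B = 1000, conf = 0.95) {
  x <- x[!is.na(x)]
  # Resample with replacement B times and record each resample's mean.
  boot_means <- replicate(B, mean(sample(x, replace = TRUE)))
  alpha <- (1 - conf) / 2
  limits <- quantile(boot_means, c(alpha, 1 - alpha), names = FALSE)
  c(Mean = mean(x), Lower = limits[1], Upper = limits[2])
}

set.seed(5122)
donors <- rnorm(30, mean = 16, sd = 4)  # simulated stand-in, not organdata
ci <- boot_mean_ci(donors)
print(ci)
```

The returned names (`Mean`, `Lower`, `Upper`) mirror the columns that `smean.cl.boot` produces, which is why the same names appear as aesthetics when the bootstrap is done outside ggplot2.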

24 / 35

Bootstrapping: Outside ggplot2

organdata %>%
  group_by(country) %>%
  do(as_tibble(bind_rows(Hmisc::smean.cl.boot(.$donors)))) %>%  # bootstrap the mean by country
  ggplot(aes(x = reorder(country, Mean), y = Mean)) +
  geom_point(size = 3) +  # plot means as points
  geom_errorbar(aes(ymin = Lower, ymax = Upper), width = 0.5) +  # bootstrap limits as error bars
  coord_flip() +
  cowplot::theme_cowplot() +
  labs(x = " ", y = "Organ Donations in 000's", title = "Avg Organ Donations (000s) by Country")

https://rstudio.cloud/spaces/22733/project/527500

25 / 35

Bootstrapping with HOPs + gganimate

ungeviz package by Claus Wilke
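
Hypothetical outcome plots (HOPs) animate individual plausible outcomes one frame at a time instead of summarizing them in a static interval. A rough sketch using bootstrap resamples and plain gganimate, rather than ungeviz's own helpers, is below. The data, group names, and column names are made up for illustration, and the animation object is built only if ggplot2 and gganimate are installed.

```r
set.seed(5122)
# Made-up data: one measurement vector per group.
groups <- list(A = rnorm(25, 10, 3), B = rnorm(25, 12, 3))

n_draws <- 20
# One row per (draw, group): the mean of one bootstrap resample.
hops <- do.call(rbind, lapply(seq_len(n_draws), function(d) {
  data.frame(
    draw  = d,
    group = names(groups),
    mean  = vapply(groups, function(x) mean(sample(x, replace = TRUE)),
                   numeric(1), USE.NAMES = FALSE)
  )
}))

if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("gganimate", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(hops, aes(x = group, y = mean)) +
    geom_point(size = 3) +
    # Each animation state shows one bootstrap draw.
    gganimate::transition_states(draw, transition_length = 1, state_length = 1)
  # gganimate::animate(p)  # uncomment to render when a renderer is available
}
```

Because each frame shows a single draw, viewers experience the variability directly: groups whose means swap order often across frames are not clearly different.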

26 / 35

Unemployment Rate

df %>%
  ggplot(aes(x = date, y = unemployment)) +
  geom_line() +
  coord_cartesian(ylim = c(0, .11), expand = FALSE) +  # was a trailing comma, which breaks the chain
  scale_y_continuous(labels = scales::percent) +
  labs(x = NULL, y = NULL, subtitle = "US unemployment over time")

27 / 35

Unemployment Rate

Kay and Hullman Multiple Views Blog 1

28 / 35

Types of Uncertainty: Reducible and Irreducible

Kay and Hullman Multiple Views Blog 1

29 / 35

Unemployment Rate

Source: Matthew Kay

30 / 35

We could use a "predictive bar" to show the most likely path (draw) and the uncertainty around it, but this fixes the reader to whatever arbitrary interval (e.g., 95%) the visualization designer chose to display.

Unemployment Rate

Source: Matthew Kay

31 / 35

By showing several nested intervals, we let the reader distinguish between different levels of uncertainty rather than committing to a single cutoff.
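
One way to sketch a multiple-interval display: simulate many plausible paths, then summarize each time point with several nested quantile intervals. The random-walk model, the interval levels (50%, 80%, 95%), and all names below are illustrative assumptions, not Matthew Kay's model or code.

```r
set.seed(5122)
n_paths <- 500
horizon <- 12
# Illustrative assumption: future rates follow a random walk starting at 4%.
# Result is a horizon x n_paths matrix (one column per simulated path).
paths <- replicate(n_paths, 0.04 + cumsum(rnorm(horizon, 0, 0.002)))

interval_levels <- c(0.50, 0.80, 0.95)
# For each level, take matching lower/upper quantiles at every time point.
bands <- do.call(rbind, lapply(interval_levels, function(lv) {
  a <- (1 - lv) / 2
  data.frame(
    month = seq_len(horizon),
    level = lv,
    lower = apply(paths, 1, quantile, probs = a, names = FALSE),
    upper = apply(paths, 1, quantile, probs = 1 - a, names = FALSE)
  )
}))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(bands, aes(x = month, ymin = lower, ymax = upper, group = level)) +
    geom_ribbon(alpha = 0.25)  # overlapping ribbons darken toward the center
  print(p)
}
```

Stacking semi-transparent ribbons means the narrow 50% band appears darkest, so the eye is drawn to the most plausible region without hiding the wider tails.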

Unemployment Rate

Source: Matthew Kay

32 / 35

Unemployment Rate

Source: Matthew Kay

33 / 35

Why is visualizing uncertainty hard?

  • Efficient encodings for uncertainty can be hard to find.

  • Make sure people understand encodings (what does the plot mean?).

  • Perceptual models of probability (e.g., quantile dot plot, HOP).

  • Decisions under uncertainty (e.g., Gigerenzer et al. or the Monty Hall problem).

  • Findings may not apply in all contexts.

  • Plus, you still have to actually build it!

Matthew Kay
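
As an illustration of the quantile dot plot mentioned in the list above, a minimal sketch: represent a predictive distribution by a small number of equally likely quantiles and draw each as a dot, so readers can count dots to estimate probabilities. The log-normal distribution, the 20-dot count, and the 8-minute cutoff are illustrative assumptions.

```r
n_dots <- 20
# Midpoint probabilities: each dot stands for a 1-in-n_dots chance.
probs <- (seq_len(n_dots) - 0.5) / n_dots

# Illustrative assumption: a log-normal predictive distribution of waiting time in minutes.
dots <- data.frame(minutes = qlnorm(probs, meanlog = log(10), sdlog = 0.4))

# Counting dots approximates a probability, here P(minutes <= 8).
p_est <- mean(dots$minutes <= 8)
print(p_est)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dots, aes(x = minutes)) +
    geom_dotplot(binwidth = 1) +          # one dot per quantile
    scale_y_continuous(breaks = NULL) +   # the dot-plot y axis carries no meaning
    labs(x = "Minutes", y = NULL)
  print(p)
}
```

Fewer dots are easier to count but give coarser probabilities, so the dot count is a design choice that trades precision for readability.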

35 / 35
