Friday, Mar. 31, 2023

Using Theory to Design Better Interactions with Visualized Data

by Jessica Hullman (Senior Law and Society Fellow (Fall 2022), Simons Institute)

Theories of sequential decision-making have been around for decades but continue to flourish. The Simons Institute’s Fall 2022 program on Data-Driven Decision Processes provided an excellent overview of recent results in online learning and sequential decision-making. Visiting the Simons Institute as a researcher whose background is in interactive data visualization, I spent some time thinking about how learning theory might advance more applied research related to human-data interaction.

First, it’s worth noting that theories of inference and decision-making remain relatively unintegrated in fields that research data interfaces, including human-computer interaction and visualization. While we might sometimes visualize data simply to generate a historical record, such as to compare points scored across NBA players, most of the time visualization is used to support inference tasks like extrapolating beyond the particular sample to the larger population. Yet beyond a smattering of experimental papers that make use of decision theory, only a handful of works have advocated for theorizing the function of visualizations in the context of frameworks that could provide prescriptive guidance (for example, within the traditions of model checking,¹ Bayesian cognition,² and hypothesis testing³).

A natural question is why. I suspect it may have something to do with the status of visualization as a general-purpose tool that can be put to work to summarize data, persuade viewers to draw certain conclusions from data, or support inferential and decision tasks, sometimes simultaneously. More formal theory requires pinning down the problem more precisely, which might seem reductive. Visualization has also long been associated with the tradition of exploratory data analysis in statistics, in which John Tukey pioneered exposure of the unexpected through graphical displays as an overlooked part of statistical practice.⁴ Maybe the understanding that visualization is valuable for producing unpredictable insights keeps researchers away from attempting to theorize.

The power of visualization is often touted as providing an external representation that enables the amplification of cognition by allowing natural perceptual processes to aid in identifying patterns in data and freeing up valuable working memory. A key part of this is that a good visualization enables its human user to bring their prior domain knowledge to bear. Similar to how some statistical modelers shy away from the idea of formalizing prior knowledge in applied Bayesian statistics, the role of prior knowledge in visualization-aided analysis may contribute to a seeming bias in the literature toward leaving the human part of the equation untheorized. Instead, we’re left to trust that in supporting exploratory analysis in its various forms, visualization interactions don’t need modeling because the analyst will “know when they see it” and also know what to do about it, whether that means transforming data, collecting more data, making a decision, etc.

All this means that there are many opportunities where statistical theory, data economics, and online learning theory could be helpful for providing a more rigorous theoretical framework in which to answer questions that get at the heart of what visualization is about.

Implications of Aggregation Choices
Designing defaults for presenting distributional information is one interfaces problem where theory could be brought to bear. One example is the choice of how to aggregate data in an exploratory visual analysis tool. I recall visualization researcher Tamara Munzner once remarking that if we knew the best way to summarize data, we would have “solved” visualization. As evidence that we have not, commercial visual analysis tools like Tableau Software and visualization recommender systems produced by researchers default to different aggregation strategies. Some tools aggregate data by default, as this scales better to very large data sets and can enhance identification of patterns, while others default to plotting raw data, such that variation is easier to incorporate into judgments. Intermediate representations are sometimes used in other contexts, like error bars representing inferential or predictive uncertainty for experimental data.

The standard way to resolve such debates about the better representational choice in modern visualization or HCI research would be to conduct a human-subjects experiment. But because of limits on how many scenarios a single experiment can study, any empirical comparison of different visualizations inevitably isolates a few tasks and a few data sets, and gathers a convenience sample of participants. We should generally expect the experimenter to choose a set of conditions that they think will help them demonstrate the effects (e.g., visualization performance differences) that align with whatever preconceived notions motivate their hypotheses. This is not necessarily wrong: after all, why test a new drug on people on their deathbed, or those who are perfectly healthy? But the implication is that estimated effects from empirical experiments in visualization, like estimated effects from controlled experiments in many other fields, tend to be overfit to the particular conditions tested.

A theoretical approach could make it possible to gain insight across a larger range of conditions we think are plausible scenarios for visual analysis. Information economics formalizes the notion of signal generation. Different visual aggregation approaches represent different information structures in the economic sense of “containers” for incomplete information relevant to some decision problem. Information design considers the extent to which the right information structure can influence the behavior of agents tasked with reasoning under uncertainty.⁵ Taking inspiration from Blackwell ordering,⁶ which defines conditions under which information structures can be ranked based on their expected performance across classes of decision tasks, we can analyze the expected performance of different visual aggregation strategies under different conditions. For example, we can define spaces of plausible data-generating processes, and target judgments or decisions under uncertainty (e.g., allocating a budget across units, deciding whether to pay for a risky treatment, estimating effect size). We can also define a set of scoring rules, specifying the reward (or stakes) associated with providing a better versus a worse response. By comparing expected performance across different situations, we can make more specific predictions about where questions of aggregation strategy are likely to matter more. This is not to say that human-subjects experiments are not useful or needed. Well-designed empirical experiments can, for example, help us identify the kinds of heuristics that behavioral agents may rely on when faced with a decision under uncertainty. Given a space of visualization decision problems, we can simulate the expected performance of visual aggregation strategies under these heuristics and compare them with our expectations of rational agents, using human-subjects experiments to check and refine our expectations.
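To make the comparison of information structures concrete, here is a minimal sketch in Python. Everything in it is an assumption chosen for illustration rather than taken from any study: a two-state world ("narrow" versus "wide" spread), a data set of two points, a mean-only display treated as a deterministic summary of the raw display, and a quadratic (Brier-style) scoring rule. It computes the expected score of a rational agent who reports the Bayesian posterior under each display.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy world: the data are either narrowly or widely spread.
STATES = ("narrow", "wide")
PRIOR = {"narrow": 0.5, "wide": 0.5}

def raw_likelihood(state):
    """P(raw data set | state): two i.i.d. points at +/-1 (narrow) or +/-2 (wide)."""
    scale = 1 if state == "narrow" else 2
    return {(a * scale, b * scale): 0.25 for a, b in product((-1, 1), repeat=2)}

def garble(likelihood, summarize):
    """Push a likelihood through a deterministic summary, e.g. showing only the mean."""
    out = defaultdict(float)
    for signal, p in likelihood.items():
        out[summarize(signal)] += p
    return dict(out)

def brier_score(report, state):
    """Quadratic score in [0, 1] for a reported probability that the state is 'wide'."""
    truth = 1.0 if state == "wide" else 0.0
    return 1.0 - (report - truth) ** 2

def expected_rational_score(likelihoods):
    """Expected score of an agent who reports the posterior P(wide | signal)."""
    signals = set().union(*(likelihoods[s].keys() for s in STATES))
    total = 0.0
    for sig in signals:
        joint = {s: PRIOR[s] * likelihoods[s].get(sig, 0.0) for s in STATES}
        p_sig = sum(joint.values())
        if p_sig == 0.0:
            continue
        posterior_wide = joint["wide"] / p_sig
        total += sum(joint[s] * brier_score(posterior_wide, s) for s in STATES)
    return total

raw = {s: raw_likelihood(s) for s in STATES}
mean_only = {s: garble(raw[s], lambda d: sum(d) / len(d)) for s in STATES}
no_view = {s: {"nothing": 1.0} for s in STATES}  # agent sees only the prior

scores = {
    "raw": expected_rational_score(raw),
    "mean_only": expected_rational_score(mean_only),
    "prior_only": expected_rational_score(no_view),
}
print(scores)
```

In this toy setup the raw display scores 1.0, the mean-only display 0.875, and the prior alone 0.75. The mean-only display cannot beat the raw one because it is a function of it, which is the intuition behind Blackwell-style rankings. Swapping in a task that depends only on the mean would erase the gap, which is the kind of condition-dependence such an analysis is meant to expose.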

Modeling Sequential Decisions in Visual Analysis
We could also go further toward using theoretical approaches to inform system design by identifying suitable frameworks for representing what a human analyst does when they react to visualized data during some larger process of inference. The nature of data-dependent decision-making that characterizes a visual analysis session may not always be captured by a single decision problem. How do we model the exploratory data analysis (EDA) process that visualization systems aim to support?

Online learning theory provides a variety of formulations of sequential decision-making, like multi-armed bandits, reinforcement learning, partially observable Markov decision processes (POMDPs), etc., along with algorithmic performance characterizations. The optimal balance between exploration and exploitation that algorithms for sequential decisions aim to achieve is a good match for what we know about EDA in practice, where analysts often blend breadth-first and depth-first patterns⁷ (e.g., alternately clicking through plots of different independent variables to gauge their impact on an outcome versus progressively refining a plot of a single set of independent variables against the outcome). But what kind of problem formulation best matches the central tasks in EDA?

Theories of exploratory analysis suggest that analysts go through different phases when graphically exploring data,⁸ from considering abstractions (defining the types of variables to identify relevant plot types, aggregations, and models), formulating expectations, and identifying problems with data, to identifying and informally checking provisional statistical models to better understand relationships. Some phases are probabilistic, such as when the analyst intuitively estimates parameter values and their uncertainty, but others may not be, particularly early discovery-oriented phases,⁹ where an analyst simply judges which theories the data at hand might support. Hence, it seems unlikely that a single problem formulation for sequential decision-making will be rich enough to capture the larger process of visual analysis, but parts of the process are amenable to formal representation.

For example, it makes sense to think that most visual analyses begin with some high-level query about some true state of things, such as, what drives hotel room bookings? Or, what’s the relationship between lending patterns and demographics in some region? The analyst first engages in pattern detection, followed by parameter estimation. These assumptions might be captured by a partially observable Markov decision process. In this formulation, we can think about the “reward” that an analyst gets from any set of observables (i.e., a visualization they generate) as the amount of information they learn from that visualization, defined relative to some prior beliefs. At each step, the reward gleaned from viewing a visualization affects the probability with which the analyst moves to alternative views. Their goal is to find a sequence of actions that maximizes their rewards. Given a model of the visualization state-space and plausible next states, which visualization research can inform, we might consider the performance of an optimal algorithm for different forms of questions and states of a priori knowledge, and use this as a benchmark against which to compare human performance. This approach naturally lends itself to asking questions about how to design the visual representation and interactive mechanisms of a visual analysis tool to better encourage optimal behavior.
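As a rough illustration of the reward idea, the sketch below scores candidate views by the information an analyst expects to gain from them, measured as the Kullback-Leibler divergence from prior to posterior beliefs. It is a one-step (myopic) simplification, not a full POMDP solver, and the hypotheses, view names, and probabilities are all made up for the hotel-bookings example.

```python
import math

def kl_bits(p, q):
    """KL divergence D(p || q) in bits between two beliefs over the same hypotheses."""
    return sum(p[h] * math.log2(p[h] / q[h]) for h in p if p[h] > 0)

def update(belief, view, obs):
    """Bayesian update of the analyst's belief after seeing `obs` in `view`."""
    joint = {h: belief[h] * view[h].get(obs, 0.0) for h in belief}
    p_obs = sum(joint.values())
    return {h: joint[h] / p_obs for h in joint}

def expected_information_gain(belief, view):
    """Expected reward of opening `view`: average KL from prior to posterior."""
    observations = set().union(*(view[h].keys() for h in belief))
    gain = 0.0
    for obs in observations:
        p_obs = sum(belief[h] * view[h].get(obs, 0.0) for h in belief)
        if p_obs > 0:
            gain += p_obs * kl_bits(update(belief, view, obs), belief)
    return gain

# Hypothetical numbers: P(pattern seen in the view | hypothesis).
VIEWS = {
    "bookings_by_price": {
        "price_drives": {"trend": 0.9, "flat": 0.1},
        "season_drives": {"trend": 0.2, "flat": 0.8},
    },
    "bookings_by_room_color": {
        "price_drives": {"trend": 0.5, "flat": 0.5},
        "season_drives": {"trend": 0.5, "flat": 0.5},
    },
}

belief = {"price_drives": 0.5, "season_drives": 0.5}
gains = {name: expected_information_gain(belief, view) for name, view in VIEWS.items()}
best_view = max(gains, key=gains.get)  # greedy choice of the next view
belief_after = update(belief, VIEWS[best_view], "trend")
print(gains, best_view, belief_after)
```

A view whose appearance is equally likely under every hypothesis earns zero expected reward, so the greedy benchmark never opens it. A fuller treatment would plan over sequences of views and account for the cost of interaction, which is where the POMDP machinery would come in.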

Formalizing the Value of a More Informative Visualization
Another way we can supplement empirical experiments with more formal analyses is in understanding how much has been learned from a visualization study. Assume an experimenter presents us with an estimated performance difference between visual representations from a human-subjects experiment they ran. Such effects are often small, but judging how small is too small to matter can be hard, since the observed difference is specific to the artificial world induced by the experiment. For example, suppose one visualization gives a 4 or 5 percentage-point accuracy advantage over another in someone’s ability to estimate effect size, expressed as the probability that a random draw from one distribution has a higher value than a random draw from another. How much does that advantage matter? Because experimenters are likely to choose conditions to study that emphasize an a priori hypothesis they have, it makes sense to ask: How important is this observed difference?

One question we might care about is, how big a difference might we expect between visualizations under ideal circumstances for the decision problem at hand? Here, an informational perspective can be helpful for reflecting on how much the question of how to visualize data matters over the bigger question of what information to show. To provide an upper bound on the performance that we could expect from a behavioral agent using a particular visualization in an experiment, we can calculate the expected score (under whatever scoring rule the experiment uses) of a rational agent who is knowledgeable about the process that produces the experimental stimuli and who processes the visualization perfectly. When visualizations represent different information structures, comparing the expected scores of a rational agent using each type gives us information about how much difference we would expect purely from the informational differences. When the scoring rule is bounded, we can ask whether the study was “dead in the water” to begin with, by looking at how large the distance is, relatively speaking, in score space, between having only the prior versus also having access to the visualization. If having access to the visualization offers the rational agent relatively little advantage compared with having access only to the prior, we probably shouldn’t expect to see very large differences between visualizations in our behavioral experiment unless one of the visualizations is severely misinterpreted. We are exploring these sorts of analyses in ongoing work¹⁰ as a way to help contextualize experimental designs and results that would otherwise remain opaque.
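The prior-versus-visualization comparison can be sketched for a simple hypothetical experiment. The assumptions here are mine and not drawn from the ongoing work: a treatment either has a known positive effect or none, the visualization shows the mean of n noisy observations, and the score is plain accuracy (1 for a correct call, 0 otherwise). The code computes the best expected score with only the prior, the best expected score after seeing the visualization, and the gap between them.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prior_only_score(p_effect):
    """Best expected accuracy without any data: guess the more likely state."""
    return max(p_effect, 1.0 - p_effect)

def rational_score(p_effect, effect, n, sd=1.0):
    """Expected accuracy of a Bayes-optimal agent who sees the sample mean.

    Assumes 0 < p_effect < 1, effect > 0, and a sample mean distributed as
    Normal(true effect, sd / sqrt(n)).
    """
    se = sd / math.sqrt(n)
    # Say "effect" when the posterior odds exceed 1, i.e. the mean is above this cutoff.
    cutoff = effect / 2.0 + (se ** 2 / effect) * math.log((1.0 - p_effect) / p_effect)
    hit = 1.0 - normal_cdf((cutoff - effect) / se)   # correct when the effect is real
    correct_reject = normal_cdf(cutoff / se)         # correct when there is no effect
    return p_effect * hit + (1.0 - p_effect) * correct_reject

def headroom(p_effect, effect, n, sd=1.0):
    """Score gained by a rational agent from seeing the visualization at all."""
    return rational_score(p_effect, effect, n, sd) - prior_only_score(p_effect)

# Two hypothetical study designs that differ only in sample size per stimulus.
small_design = headroom(p_effect=0.5, effect=0.2, n=10)
large_design = headroom(p_effect=0.5, effect=0.2, n=200)
print(small_design, large_design)
```

If the headroom for a design is only a few points of accuracy, then differences between visualizations that carry the same information have little room to show up, however carefully the experiment is run. Comparing the headroom of candidate designs before collecting data is one way to avoid a study that was unlikely to be informative.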

These are just a couple of directions among many that could lead to deeper insight into interface trade-offs as well as new real-world relevant problems for theorists interested in the value of information. I hope the coming years will see more collaborations between theorists and interfaces researchers!

1. Hullman and Gelman, Designing for Interactive Exploratory Data Analysis Requires Theories of Graphical Inference (2021).
2. Kim, Walls, Krafft, Hullman, A Bayesian Cognition Approach to Improve Data Visualization (2019).
3. Wickham, Cook, Hofmann, Buja, Graphical Inference for Infovis (2010).
4. Tukey, Exploratory Data Analysis (1977).
5. Bergemann and Morris, Information Design: A Unified Perspective (2019).
6. Blackwell, Equivalent Comparisons of Experiments (1953).
7. Battle and Heer, Characterizing Exploratory Visual Analysis: A Literature Review and Evaluation of Analytic Provenance in Tableau (2019).
8. Cook, Reid, Tanaka, The Foundation Is Available for Thinking About Data Visualization Inferentially (2021).
9. Oberauer and Lewandowsky, Addressing the Theory Crisis in Psychology (2019).
10. Wu, Guo, Mamakos, Hartline, Hullman, The Rational Agent Benchmark for Data Visualization (in progress).