Models are for transparency

May 10, 2021

Part 1 of 6, Modeling for metascientists (and other interesting people).

http://dx.doi.org/10.13140/RG.2.2.11024.53767

I often hear the argument that theories in psychology are immature because they don’t make “risky” predictions (that is, predictions that are likely, given the theory, but unlikely, absent the theory). So, the argument goes, we need formal models to help generate precise predictions, such as a range of values or a smallest effect size of interest.¹

Fair enough.

But formal models have many uses besides prediction.² Off the top of his head, Joshua Epstein lists 16 reasons (other than prediction) to build models, including developing causal explanations, suggesting analogies, demonstrating tradeoffs, and revealing the apparently simple (complex) to be complex (simple).³

I’d like to focus on one reason that permeates all of these: making arguments more transparent.

Here’s how Epstein explains this:

The first question that arises frequently — sometimes innocently and sometimes not — is simply, “Why model?” Imagining a rhetorical (non-innocent) inquisitor, my favorite retort is, “You are a modeler.” Anyone who ventures a projection, or imagines how a social dynamic — an epidemic, war, or migration — would unfold is running some model.
But typically, it is an implicit model in which the assumptions are hidden, their internal consistency is untested, their logical consequences are unknown, and their relation to data is unknown. But, when you close your eyes and imagine an epidemic spreading, or any other social dynamic, you are running some model or other. It is just an implicit model that you haven’t written down…The choice, then, is not whether to build models; it’s whether to build explicit ones [emphasis added].

All of that stuff in our heads, all the arguments that we make using natural language — these are models. It’s just that they are implicit models, where the logic isn’t bulletproof, the assumptions aren’t clear, and everything is generally fuzzy.

In my experience, 10 times out of 11, the first thing that someone says when formalizing a verbal hypothesis is, “holy crap, I didn’t realize I needed to make so many assumptions!” [the 11th time is A) crying because they realize that the hypothesis is so vague, or B) having a panic attack].

Yes, the stuff in our heads is a jumbled mess, and the process of formalizing it using code or mathematics makes us realize the vast number of assumptions we need to make to actually produce a logically coherent argument.

The thing is, many of us have had similar experiences in a different domain: trying to preregister our empirical studies.

We walk in with swagger, sit down with a fresh cup of coffee, open up the Open Science Framework tab and…What analyses are confirmatory vs exploratory? What type of randomization? How many stimuli? Why that sample size? How will you measure and analyze the DV? Which statistical models? What about missing data?

[Insert head-exploding emoji].

Similarly, imagine that you walk into my office one day, a fresh hypothesis in one hand and a day-old donut in the other.

Hey Leo, I’ve been thinking that rewarding novelty is bad for science because it incentivizes lower-quality research. Wanna try to model this together!?
Ah yeah, cool, alright. Wait, is that a donut? I never see you eat donuts.
Yup. It’s stale.
Why are you eating a stale donut?
Pandemic.
Makes sense…Ok, so, the model. Let’s say we have a population of scientists. How big is the population? And how long are scientists’ careers? Are there different types of scientists or do we just assume they’re all the same? Ok. Ok. So are the scientists interacting at random or will some scientists preferentially interact? What do we assume about who is more likely to interact with whom? Ok well, anyways, so they’re competing on research questions right? Do scientists know anything about how much progress their competitors have made? And the questions — are these worth the same amount, or do some questions give a higher payoff? Are they connected in any way or independent? And the payoffs. How do we make it so that novel results are worth more — we need to assume some decay function, maybe? Should the decay depend on the type of result or not?

[Insert 9 head-exploding emojis].
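
To see what answering those questions looks like, here is a minimal Python sketch that writes each one down as an explicit parameter. Everything in it is made up for illustration: the class name `NoveltyModelAssumptions`, the field names, the default values, and the geometric form of the decay in `payoff` are placeholder assumptions, not a model taken from any paper.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class NoveltyModelAssumptions:
    """Each field is a decision the verbal hypothesis left open.

    All defaults are hypothetical placeholders, chosen only for illustration.
    """
    n_scientists: int = 100                # How big is the population?
    career_length: int = 30                # How long are careers (time steps)?
    identical_scientists: bool = True      # One type of scientist, or several?
    random_interaction: bool = True        # Random mixing, or preferential?
    see_competitor_progress: bool = False  # Do scientists know rivals' progress?
    equal_question_value: bool = True      # Same payoff for every question?
    independent_questions: bool = True     # Are questions unconnected?
    novelty_decay: float = 0.5             # Payoff multiplier per earlier result


def payoff(base_value: float, n_prior_results: int, decay: float) -> float:
    """Payoff for a result that n_prior_results others have already published.

    Assumes geometric decay (one possible choice among many): the first
    result earns base_value, and each later one earns `decay` times the
    payoff of the one before it.
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")
    if n_prior_results < 0:
        raise ValueError("n_prior_results must be non-negative")
    return base_value * decay ** n_prior_results


assumptions = NoveltyModelAssumptions()

# Print every assumption so that none of them stays hidden.
for f in fields(assumptions):
    print(f"{f.name} = {getattr(assumptions, f.name)}")

print(payoff(1.0, 0, assumptions.novelty_decay))  # first to publish
print(payoff(1.0, 2, assumptions.novelty_decay))  # third to publish
```

Nothing here simulates anything yet. The point is only that each default is now a visible choice that a critic can challenge, for example by swapping the geometric decay for a different function or by letting scientists differ from one another.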

Notice how we haven’t mentioned prediction yet?

Building a model is an exercise in making an idea transparent. During this exercise, we run into points where we’re forced to make decisions that we never realized we needed to make.² ⁴ This is useful in and of itself, because just as with preregistration, it clarifies our thinking and makes our arguments more transparent.

And just as with preregistration, it can feel scary, because transparency makes you vulnerable.

You assumed that all scientists are the same? And that all research questions are independent?? And that there’s no publication bias??? My theory doesn’t assume any of those things! REJECT.

But the joke is not on you, the modeler, for being transparent. The joke is on them, the “theorist,” for having a theory that is so vague that it’s hard to know where to even begin with this modeling stuff because the verbal description leaves far too many degrees of freedom.⁵

So the next time that you come across a formal model as opposed to a natural-language description, keep this in mind. It will be easier for you to spot weaknesses in the formal model, precisely because it is more transparent.⁶ And you should absolutely, most definitely, point these out. But you should also give the authors a fist bump. Because had they not been transparent in the first place, you might have taken a bite of that donut too.

1. Lakens, Daniël, Anne M. Scheel, and Peder M. Isager. “Equivalence testing for psychological research: A tutorial.” Advances in Methods and Practices in Psychological Science 1.2 (2018): 259–269. https://doi.org/10.1177%2F2515245918770963
2. Fried, Eiko I. “Theories and models: What they are, what they are for, and what they are about.” Psychological Inquiry 31.4 (2020): 336–344. https://doi.org/10.1080/1047840X.2020.1854011
3. Epstein, Joshua M. “Why model?” Journal of Artificial Societies and Social Simulation 11.4 (2008): 12. http://jasss.soc.surrey.ac.uk/11/4/12.html
4. van Rooij, Iris, and Giosuè Baggio. “Theory before the test: How to build high-verisimilitude explanatory theories in psychological science.” Perspectives on Psychological Science (2020): 1745691620970604. https://doi.org/10.1177%2F1745691620970604
5. Lee, Michael D., et al. “Robust modeling in cognitive science.” Computational Brain & Behavior 2.3 (2019): 141–153. https://doi.org/10.1007/s42113-019-00029-y
6. https://twitter.com/rlmcelreath/status/1328267803384836096

Edit 1, May 10 2021: “verbal hypothesis” changed to “verbal description.”


Written by Leo Tiokhin, PhD
