A Bayes by any other name

Probably not Thomas Bayes

It can sometimes be hard to understand and remember the terms that statisticians use. This is understandable: statistics works with a lot of abstract concepts. But it helps when the terms are descriptive. For example, we have random variable: a value that varies depending on random events. We have estimator: a procedure or set of rules for making a particular kind of statistical estimate. And we have expected value: (loosely) the value that is closest on average to the values we would expect to see from a random variable. While technical, the terms are easy enough for anyone to understand and remember.

But sometimes terms are eponyms — meaning, they’re named after someone, usually the discoverer. If you encounter one of these for the first time and you don’t know anything about its discoverer, you’ll have three things to remember: the person’s name, the concept itself, and the fact that the person’s name refers to the concept. This is OK for people in the thick of the work — they’ll encounter it so much, it’ll become second nature.

Unfortunately, some terms that would be really useful in everyday life are also eponyms. Like “Bayesian”. This is a problem. In my experience, if you’re non-technical or don’t work with statistics, there’s almost no chance you’ll know what Bayesian refers to. Even with an explanation, it’s tricky, because there are at least three things to remember. For example, for the simple “Bayes’ rule”, there is 1) the name “Bayes” (whom few, if any, really know), 2) the rule itself, which lets you calculate a probability in the reverse direction to the one you know, and 3) the fact that “Bayes’ rule” (or “Bayes’ theorem”) refers to this rule. The term “Bayesian” is even more difficult, as it refers to everything that builds on top of Bayes’ rule.
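To make the “reverse direction” concrete, here is a minimal sketch in Python. The scenario (a test for some condition) and every number in it are invented purely for illustration, and the function name inverse_probability is hypothetical rather than taken from any library.

```python
def inverse_probability(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: turn P(E | H) into P(H | E).

    prior            -- P(H), how likely the hypothesis is beforehand
    p_e_given_h      -- P(E | H), the "forward" probability we know
    p_e_given_not_h  -- P(E | not H), how often E shows up anyway
    """
    # Total probability of seeing the evidence, over both cases of H.
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e


# Invented numbers: 1% have the condition, the test catches 90% of
# real cases, and it wrongly flags 5% of people without it.
posterior = inverse_probability(prior=0.01,
                                p_e_given_h=0.9,
                                p_e_given_not_h=0.05)
print(round(posterior, 3))  # prints 0.154
```

We knew the probability of a positive test given the condition (90%); the rule gives us the probability of the condition given a positive test (about 15% with these made-up numbers).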

It would make things much simpler if we used “inverse” instead, as in “inverse probability”, which is very descriptive and is what Bayesian probability used to be called. (The terms frequentist and Bayesian may, or may not, be due to Ronald Fisher. It would be fitting if true, since Fisher advocated frequentist over Bayesian reasoning, and the eponym likely helped slow the adoption of Bayesian reasoning.)

Things are even more unfortunate for Bayesian networks: they rely on Bayes’ rule but rarely make direct use of it, and they don’t require any commitment to Bayesianism. You can calculate probabilities in the forward direction without relying on Bayes’ rule, even in theory; you only need it when you calculate inverse (or mixed) probabilities. This is why Bayesian networks are often called by more descriptive and intuitive names: belief networks, Bayesian belief networks, or (more generally) probabilistic graphical models.
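The difference between forward and inverse calculations can be sketched with the smallest possible network, two nodes joined by one arrow. The Rain and WetGrass variables and all of the probabilities below are assumptions made up for this example.

```python
# Hypothetical two-node network: Rain -> WetGrass. All numbers invented.
p_rain = 0.2
p_wet_given = {True: 0.9, False: 0.1}  # P(WetGrass | Rain = r)

# Forward (along the arrow): P(WetGrass).
# This is just a weighted sum; Bayes' rule is not involved.
p_wet = p_wet_given[True] * p_rain + p_wet_given[False] * (1 - p_rain)

# Inverse (against the arrow): P(Rain | WetGrass).
# This is the step where Bayes' rule comes in.
p_rain_given_wet = p_wet_given[True] * p_rain / p_wet

print(round(p_wet, 2), round(p_rain_given_wet, 3))  # prints 0.26 0.692
```

Real networks have many more nodes and use cleverer algorithms, but the same split applies: forward queries only need sums and products of the stored probabilities.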

However, I’ve come to embrace “Bayesian” (and “Bayesian networks”) as an eponym worth keeping, despite the blank stares it may cause among those unfamiliar with it. It has come to denote a critical, unique and complete way of reasoning about the world that really can’t be pinned to any other existing word. I think that’s reason enough to have a name that’s all its own.
