A Bayesian network is traditionally defined as a directed acyclic graph (DAG), made up of nodes and arrows, coupled with discrete conditional probability tables (CPTs). If an arrow starts at one node and points to another, the first node is called the parent, and the second the child — but only for that arrow. Each node represents a random variable (that is, a variable that can be in any of its states with some probability), and the CPT for that node gives that variable’s probability distribution *given* any combination of states you may choose for the parents. Here’s a simple example DAG for a Bayesian network:

And here is an example CPT from that network:

Read the above table left to right. If **Food_is_Delicious** is **yes**, then there is a .95 chance that **Happy** is **yes** (and .05 chance that **Happy** is **no**). If **Food_is_Delicious** is **no**, then there is a .3 chance that **Happy** is **yes** (and .7 chance that **Happy** is **no**). In simpler terms, if the **Food is Delicious**, then you have a 95% chance of being **Happy**; if **not**, then you only have a 30% chance of being **Happy**.*

This network is also *causal*. That means that each arc *should* be interpreted as saying that the parent causes the child. And what does “causes” mean? Keeping things as simple as possible, it means that if in the real world you could somehow manipulate the parent, the probability distribution over the child will change. Even if it seems like it would require magic to change the parent, that’s OK — but the change in the probability distribution over the child must *not* be magic. The magic needed to change a parent is called an *intervention* and we typically say we *intervene on *a node to see the causal effects. In this case, intervening on the **Deliciousness of the Food** would then (non-magically) change the chances of being **Full** and **Happy**. Intervening on **Happiness** wouldn’t lead to any change in the **Deliciousness of the Food**.

Bayesian networks don’t have to be causal. Here’s an example which isn’t:

If we interpreted this as causal, it would suggest that intervening on the **Deliciousness of the Food** would somehow lead to a **Good Cook** who made the food in the first place. That, of course, doesn’t make any sense. Nonetheless, some of the arrows still *happen* to be causal — those are the ones that haven’t changed from the first network, namely the arrows that point from **Food is Delicious** to **Full** and **Happy**. That’s not intentional.

This can get tricky, but generally, a Bayesian network is called causal if whoever uses it *intends* to use it causally. I’m not using this second one causally. If I were, then it *would* be a causal Bayesian network with half of the arrows incorrect. That may or may not be OK (just like any error), depending on what you’re doing. In fact, it’s also totally OK to have a partly causal Bayesian network — just put a note on the arrows that aren’t causal, so that you know how to step over them when you’re trying to work out the causal effects — like if you’re trying to work out if intervening on **Made with Love** can lead to you being **Full**, or vice versa.

* I’ll use **bold** to refer to a node/variable or one of its states. In text descriptions of Bayesian networks, I will freely switch between any functionally equivalent version of the variable name, such as **Food is Delicious** and **Deliciousness of the Food**, **no** and **not**, etc.