Cause & Effect: Using maths plus data to push sales with marketing

Our brains are very good causality engines, unlike animal brains or artificial intelligence (AI). Our frontal cortex seems to be built to imagine different scenarios so that we can simulate outcomes of, say, hunting a mammoth, in our heads and choose the best option (and avoid dying!). It gives us the ability to imagine societies, software solutions, and marketing strategies or campaigns. In the international bestseller Sapiens, Yuval Noah Harari says that imagination is the foundation of our ability to simulate.

But my question is:

How do we use our imagination to simulate and compare different marketing scenarios well?

The problem is that causality is complicated to model

Causality is really important in marketing. Our job as marketers is to sell stuff, so it is extremely helpful to be able to say "if I do this, then sales go up". But cause and effect are mathematically complicated to model. In data-driven marketing, which uses statistics, we talk about "leading indicators", which is code for "if I see a change in data about this, then I am likely to see a change in data about that (which matters to me) after about this much time has passed." Freakonomics is another bestselling book that describes some of the more entertaining connections we've found. Most big data analyses use statistics, which basically means putting your data through a series of algorithms to look for patterns. Statistics deliberately avoids cause and effect in a bid to be more objective, but this has a fraught history because our brains keep adding it back in - hence the famous phrase "correlation is not causation". Except that some of the time it is.

Here is a more tangible form of correlation vs causation: "if weather forecast data tells me that the next 10 days will be sunny and hot with no rain, my ice cream sales at the park will go up at the weekend". Our brains can see that the sunny day is causing the ice cream sales to go up, but until recently we've had no way to model this properly. Our mathematics has been direction-agnostic, which means that we can represent the link between the sunny day and the ice cream sales (correlation), but not that the sunny day causes the ice cream sales (causation). Our models would show that it is equally likely that the ice cream sales are causing a sunny day - which your brain can tell you is not true because this is a familiar situation.
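To make the "direction-agnostic" point concrete, here is a minimal sketch in Python. The data is invented for illustration: I assume temperature drives ice cream sales and never the reverse, and the numbers (24 to 36 degrees, 10 sales per degree) are placeholders rather than real figures. The correlation coefficient comes out the same whichever way round you feed the two columns in, so on its own it cannot tell you which one is the cause.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: temperature causes sales, never the other way round.
temperature = [rng.uniform(24, 36) for _ in range(500)]
ice_cream_sales = [10 * t + rng.gauss(0, 20) for t in temperature]

# Correlation is symmetric: swapping the arguments changes nothing.
r_forward = pearson(temperature, ice_cream_sales)
r_backward = pearson(ice_cream_sales, temperature)
print(r_forward, r_backward)
```

The arrow from hot day to ice cream sales exists only in the way I generated the data. Nothing in the correlation itself records it.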

We have struggled to teach AI to think like a human because we have lacked the ability to model causal direction. The closest we've got is a sophisticated form of subjective probability called inverse conditional probability (if I see this evidence, what is the likelihood that this event preceded it), represented by Bayesian networks. In truth, we need a mathematical definition of causality to form the basis of the computer programmes needed to develop intelligence - or that ability to simulate causal scenarios to test the outcome. Until recently, we didn't have this mathematics, but we do now - it's called causal inference mathematics. It uses elegant, simple, directional path diagrams called causal diagrams (dots representing events and arrows from cause to effect). It succeeds where statistics fails because it has clever ways of allowing subjectivity into the maths (just like Bayesian networks do).

How good are different approaches to marketing at causal thinking?

There are three levels of causal thinking according to Judea Pearl, a computer scientist and causal inference mathematician who has been trying to create strong AI (AI that thinks like a human) for 25 years, and Dana Mackenzie. This diagram from their book explains the different levels, derived from trying to teach AI to think like an adult human.

Diagram: The three levels of causal thinking (source: The Book of Why by Judea Pearl and Dana Mackenzie)

Level 1 Seeing: In the marketing world, this would be data-driven marketing and AI-driven big data insights for marketing. You are asking yourself "if toothpaste sales go up, do floss sales also go up?" - you observe.

Level 2 Doing: In the marketing world, this would be growth marketing and performance marketing. You are asking yourself "if I add flavours of toothpaste, do sales of toothpaste go up?" - you act and see what changes.

Level 3 Imagining: In the marketing world, this would be scientific marketing. You are asking yourself "The people who buy toothpaste also buy floss. I can double the price of toothpaste and they will still buy floss, making my revenues increase. What evidence would I expect to see if my floss theory is true and what evidence would I expect to see if it is false?" - you investigate what happens if you act and what happens if you don't act through simulation and experiment.

Data-driven marketing is way better than the old trial and error model, but as marketers responsible for driving sales opportunities, we need to do better. We need to map out cause and effect before collecting data, in order to ensure that the data is telling us what we think it is and that what it is telling us is more than a data blip.

Data-driven marketing does not find the cause of sales uplift

Data by itself tells you correlation (if I see this then I am likely to also see that) NOT causation (this causes that).

When we do data-driven marketing, what we confirm is actually correlation, or the probability of two events occurring together. Here is an example: if I use this image in this social media post, then higher engagement is also likely to occur. But could it be the topic? Or the timing? Or something else I am unaware of? In a live commercial situation, we often don't have the opportunity to cut out the other variables as we would in lab conditions. We try to make the best of it by doing bigger, longer experiments to control for temporary phenomena like timing - but it's not really the same thing as identifying the cause.
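Here is a rough sketch of the image example in Python, under assumptions I have made up for illustration: the topic's popularity affects both whether I choose to use the image and how much engagement the post gets, and the image itself adds a fixed 5 points of engagement (the hypothetical TRUE_IMAGE_EFFECT). It compares what observed data suggests with what a lab-style randomised test would suggest.

```python
import random


def mean(values):
    return sum(values) / len(values)


TRUE_IMAGE_EFFECT = 5.0  # assumed for illustration, not a real figure


def engagement(popular_topic, uses_image, rng):
    """Hypothetical engagement score: the topic matters far more than the image."""
    score = 20.0 + rng.gauss(0, 5)
    if popular_topic:
        score += 30.0
    if uses_image:
        score += TRUE_IMAGE_EFFECT
    return score


def difference_in_means(posts):
    """Average engagement with the image minus average engagement without it."""
    with_image = [e for used, e in posts if used]
    without_image = [e for used, e in posts if not used]
    return mean(with_image) - mean(without_image)


rng = random.Random(2)  # fixed seed so the sketch is repeatable
observed_posts, randomised_posts = [], []
for _ in range(20000):
    popular = rng.random() < 0.5
    # Observed world: I tend to pick the image for popular topics.
    chose_image = rng.random() < (0.8 if popular else 0.2)
    observed_posts.append((chose_image, engagement(popular, chose_image, rng)))
    # Lab-style test: a coin flip decides, so topic cannot influence the choice.
    coin_flip = rng.random() < 0.5
    randomised_posts.append((coin_flip, engagement(popular, coin_flip, rng)))

observed_uplift = difference_in_means(observed_posts)
randomised_uplift = difference_in_means(randomised_posts)
print(round(observed_uplift, 1), round(randomised_uplift, 1))
```

In this made-up world the observed uplift is several times larger than the uplift I built in, because the popular topics are doing most of the work. The coin-flip version lands close to the built-in effect, which is the gap between seeing a correlation and identifying a cause.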

Understand cause and effect like a mathematician

Causal inference mathematics has the answer. Don't worry if you're bad at maths: it's mostly drawing diagrams, not doing algebra (though you can do algebra if that's your cup of tea). The diagrams map out the patterns of cause and effect between events using dots and arrows. You do this subjectively, using a few experts and your own previous knowledge or market research. Then you use the diagram to simulate what happens if you remove the connection between a cause and an effect to see if you would get a different outcome (you try to confound the cause and effect pattern). From that, you can decide what kind of data you need to collect to verify the key counterfactuals (facts that show what isn't the case) in your simulation.

You need a causal engine to process the data

The flow diagram below is a causal engine from causal inference mathematics. It is a way of processing cause and effect information to make sure that you figure out the correct cause. Basically, it shows that there is thinking to be done before you get to data:

  • You need to create a model of cause and effect with dots and arrows (causal model, box 3) that you can test by exploring how you can break it.
  • You also need to make sure that your query or hypothesis is actually answerable - which means you can find evidence that proves the causal connection is there (true) or not there (false).

This is important - it tells you the kind of data you need and the story or pattern in the data you should expect if your hypothesis is true or if it is false. This is better than just looking for stuff that occurs together.

Diagram: The causal engine (source: The Book of Why by Judea Pearl and Dana Mackenzie)

Our brains are good at identifying cause and effect at super speed for known situations. Remember, adult humans are on Level 3. So it is obvious to an adult human that an uplift in ice cream sales is caused by a hot day. Where our brains start to let us down is in unknown situations. Why? Because brains are also good at seeing abstract patterns and turning them into generalised heuristics, and so they tend to fill in the gaps. Where data is most valuable is in illuminating those places where we have used the heuristics in error, where we have overgeneralised. This happens most in unfamiliar situations. So we need a way to consider the cause and effect relationships in a disciplined way, to counteract the urge to make the data fit our preconceived patterns. Although our brains are good at identifying cause and effect, AI is not. If we look at how we now teach AI cause and effect, through causal diagrams, we can find that disciplined way of processing cause and effect in marketing data.

A causal diagram is a way of representing cause and effect relationships with a simple arrow and dot graph that anyone can draw. This helps us to zero in on the data that will answer our query and the trends or counterfactuals in the data that will prove our point either way.

Here is a causal diagram I have drawn for our example: if the temperature for the next 10 days is higher than average, then ice cream sales go up (a causal relationship our brain knows intuitively). I've used the special case of tropical Singapore because it has aspects that don't work in a causal diagram of the same relationship in, say, the UK, which makes it less familiar. This forces us to think about the situation properly, and so challenges our assumptions - demonstrating why causal diagrams are useful.

Diagram: A simple causal diagram of a hot day in Singapore (by me)

In a data-driven or big data world, I might see only part of the picture - I might see that umbrella sales have gone up and ice cream sales have gone up (but not connect either to a hot day). If I were an ice cream company in the UK, I might conclude that umbrella sales are "correlated with" or "associated with" ice cream sales. This could lead me to send extra stocks of ice cream to Singapore during the rainy season when umbrella sales go up. But this might be an expensive mistake.

If I (a data-driven marketer for ice cream based out of the UK) have a specific causal diagram, then I will avoid looking a bit silly. I would need to build the diagram with my colleagues in Singapore or based on market research. This is how the maths lets the subjectivity in - as long as I use a group of experts, my causal diagram has a statistically low rate of error. My expert colleagues in Singapore would tell me that Singaporeans use opaque umbrellas as sun protection when it is hot, so a hot day causes umbrella sales to go up. I might be able to check this by looking at the sales of clear umbrellas, which are popular in the UK but act as tiny, portable, heat stroke-inducing greenhouses on a hot day in Singapore. This is data that would break the supposed causal link between ice cream sales and umbrella sales - it is counterfactual data.

With my causal diagram, I can quickly discount umbrella sales as the cause of ice cream sales. I can see from my research that they are both caused by the hot day and verify it with my clear umbrella counterfactual data.
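The hot day example can be sketched in a few lines of Python. Everything here is an assumption for illustration: the diagram is written as a simple dictionary of cause to effects, and the sales figures are simulated so that the hot day drives both ice cream and umbrella sales while neither drives the other. The sketch then compares the correlation across all days with the correlation among days of nearly the same temperature.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical causal diagram: each cause maps to the effects its arrows point at.
causal_diagram = {
    "hot_day": ["ice_cream_sales", "umbrella_sales"],
}


def causes_of(effect):
    """Return every dot with an arrow pointing into `effect`."""
    return [cause for cause, effects in causal_diagram.items() if effect in effects]


rng = random.Random(1)  # fixed seed so the sketch is repeatable
days = []
for _ in range(6000):
    temperature = rng.uniform(24, 36)
    ice_cream = 10 * temperature + rng.gauss(0, 10)  # driven by heat only
    umbrellas = 5 * temperature + rng.gauss(0, 10)   # driven by heat only
    days.append((temperature, ice_cream, umbrellas))

# Across all days, the two sales figures move together.
overall = pearson([d[1] for d in days], [d[2] for d in days])

# Hold the common cause roughly fixed: only days between 30 and 31 degrees.
similar_days = [d for d in days if 30 <= d[0] < 31]
within_band = pearson([d[1] for d in similar_days], [d[2] for d in similar_days])

print(causes_of("ice_cream_sales"))  # ['hot_day']
print(round(overall, 2), round(within_band, 2))
```

In this simulated data the strong overall correlation mostly disappears once the temperature is held roughly constant, which is the pattern I would expect if the hot day is the common cause. Real sales data would be messier, and the diagram itself still has to come from my colleagues in Singapore or from market research.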

I can also use my causal diagram to look at other effects the hot day might cause to think about the distribution of my ice cream. For example, instead of sending my ice cream stocks to the Botanic Gardens (where visitor numbers often go down when it's too hot), perhaps I can form a partnership with aircon engineers. I can send them ice cream to take to families they are visiting with broken aircons on hot days - that would certainly make a memorable impression by cheering people up when they are hot and grumpy!

Causal diagrams are a great marketing tool

If you consider yourself to be a data-driven marketer, consider building a causal diagram of dots and arrows before you start looking for data. It will help you get your thoughts in order, zero in on useful data quickly, prevent you from confusing correlation with causation, and tell you what patterns you should expect to see if the cause has a significant effect. You can doodle on the back of an envelope or do a full-blown investigation - it depends on the financial stakes of your query.

In addition, brand marketing and brand-building are famously difficult to attribute, though they clearly make a difference. My clients and I have seen immediate short-term uplift in awareness and engagement with social selling techniques on LinkedIn - but this is difficult to attribute despite the clear correlation. By using causal diagrams, we may be able to map the causal links, identify data that proves the impact, and give these strategies the credit they seem to deserve when we assess them with our brains (which are good at seeing causality).

If you are a salesperson, map out what you think causes your sales to go up with a causal diagram. Then work with your marketing colleagues to work out what has the most impact and then optimise to make sure more of what causes sales is happening. You may need to make a bigger contribution upstream, but this should give you better quality leads to work with and hopefully, you'll need to discount less.

You can use this to advocate for marketing resources to be allocated where they cause business uplift.


If you found this article useful, then please feel welcome to follow me on LinkedIn to get more insights.



Levitt, S. D. and Dubner, S. J. (2009). Freakonomics: A rogue economist explores the hidden side of everything.

Harari, Y. N. (2018). Sapiens: A brief history of humankind.

Page, S. (2018). The Model Thinker: What you need to know to make data work for you.

Pearl, J. and Mackenzie, D. (2018). The Book of Why: The new science of cause and effect.