Correlation does not imply causation

I have just finished reading an assignment.
It was written by one of my students – Lisa.
Lisa has been working with me on a small research project as a form of preparation
for a Master's in Psychology at the University of Portsmouth.

Overall, Lisa has written an impressive piece of work (4,000 words) with a coherent structure, convincing argumentation and a sensible choice of source material. However, as is often the case with academic writing, not everything was perfect.

There were some minor issues which had to be addressed: mistakes in referencing, a lack of signposting, passages that were too descriptive, and writing in the first person.

Crucially, Lisa has also mistaken correlation for causation.
Since it is a frequent mistake in both undergraduate assignments and master's dissertations, I think I should write a few words about it.

Correlation vs. Causation

As you all probably know, correlation is a mutual relationship between two (or more) variables.
In other words, correlation shows whether and how strongly variables are related.
The taller you are (height), the heavier you tend to be (weight). Height and weight often correlate.
Age and amount of hair often correlate too; younger people tend to have more hair on their heads than older people.
Does that mean that age itself causes hair loss? Not necessarily.
Hair loss can be caused by hormonal changes, medical conditions or medications among many other factors.
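
To make 'how strongly' a little more concrete: the strength of a linear relationship is usually summarised with Pearson's correlation coefficient r, which runs from -1 (perfect negative relationship) through 0 (no linear relationship) to +1 (perfect positive relationship). Below is a minimal sketch in Python. The height and weight figures are made up purely for illustration and are not real measurements, and the function name pearson_r is my own.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (proportional to covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative data, not real measurements.
heights_cm = [155, 160, 165, 170, 175, 180, 185, 190]
weights_kg = [52, 58, 63, 61, 72, 75, 83, 85]

r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.2f}")  # a value close to +1 means a strong positive correlation
```

Note what the number does and does not tell you: a high r says that the two variables move together, but it says nothing about why they do.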

On the other hand, causation exists when one variable actually causes another. Drinking alcohol before driving increases the risk of accidents.
Smoking causes cancer. There is a strong relationship between smoking (cause) and cancer (effect), which is supported by decades of converging evidence, including large cohort studies and laboratory experiments.

For many students, both concepts seem similar, but they are not the same thing.
Many examiners, tutors, and markers get angry when they see students confusing these two basic concepts.

Back to Lisa

In her assignment, Lisa discussed the relationship between depression and suicide.
She cited some correlational studies and concluded that depression frequently causes suicide.

I read the studies she included in her assignment, and on the basis of those studies I could only say that
‘people who suffer from depression tend to have higher rates of suicide’.
There was a correlation between those two variables, but nothing more than that. We cannot simply say that one causes the other.
The researchers did not conduct experimental studies designed to establish causation.
There might have been other factors at play (confounding variables they did not measure or control for), such as SES (socio-economic status), age, sex, etc.

Lisa needed to rephrase her claims. I also asked her to add, where possible, experimental studies that test the causal relationship (for instance, randomised controlled trials).


Now, it is important to mention that correlation can indeed suggest that causation exists.
Put differently, where there is correlation, there is often causation.
But it does not prove it!

More warnings

Have you heard that crime rates have been known to increase when ice cream sales increase?
You would not assume that ice cream can make you more violent, would you?
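
One commonly suggested explanation for the ice cream example is a third variable, such as hot weather, that pushes both figures up at the same time. Here is a small simulation of that idea in Python. Everything in it is an assumption made for illustration: the data are randomly generated, the numbers are invented, and temperature is simply assumed to be the only confounder. In the simulated world ice cream has no effect on crime at all, yet the two still correlate strongly until temperature is taken into account.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def residuals(ys, xs):
    """What is left of ys after removing a least-squares straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

random.seed(42)  # fixed seed so the simulation is repeatable

# Hypothetical daily temperatures; the confounder in this made-up world.
temperature = [random.uniform(0, 35) for _ in range(1000)]
# Both outcomes depend on temperature plus noise, and NOT on each other.
ice_cream_sales = [20 + 3 * t + random.gauss(0, 10) for t in temperature]
crime_reports = [50 + 2 * t + random.gauss(0, 10) for t in temperature]

raw_r = pearson_r(ice_cream_sales, crime_reports)

# Partial correlation: correlate what remains once temperature is removed.
partial_r = pearson_r(residuals(ice_cream_sales, temperature),
                      residuals(crime_reports, temperature))

print(f"raw correlation:               {raw_r:.2f}")
print(f"controlling for temperature:   {partial_r:.2f}")
```

The raw correlation comes out strong, while the correlation after controlling for temperature sits close to zero. Real data are rarely this tidy, because you seldom know (or have measured) every confounder, which is exactly why correlational studies alone cannot settle questions of cause.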

I found another funny example of a spurious correlation.
Did you know that the number of people who drowned by falling into a swimming pool correlates with the number of films Nicolas Cage appeared in? Does one factor cause the other?



Students tend to assume that correlation proves that one variable causes the other.
Apparently, it is human nature to look for patterns that help us explain the world around us.
However, inferring causation from correlation alone is a logical fallacy, and Lisa is not the only person making this mistake.

Some examples from research papers


‘A strong correlation’ is not causation! However, ‘an impressive number of research studies’ might suggest that causation actually exists.
Remember that emotive words such as ‘impressive’ should be avoided in academic writing since they sound rather subjective.
By the way, who knows what ‘a moderately strong correlation’ is? I have only heard of weak, moderate and strong correlations.


A ‘statistically significant correlation’ is not causation either.


Finally, there is a ‘statistically significant correlation’ between stork populations and human birth rates across Europe.
Does it prove that storks deliver babies?






