Skip to main content

Is gdp related to life expectancy?

In the above we've been looking at the distribution of gdp across all years in the data.

But now let's focus on a single year:

d07 <- d[d$year == 2007, ]

and ask if the variables are related to each other.

The simplest and often best way to show the relationship between two variables is a scatter plot. A basic scatter plot is easy:

print(
ggplot( data = d07, aes( x = gdpPercap, y = lifeExp ))
+ geom_point()
+ scale_x_continuous(trans = "log10")
)

img

However, we already suspect this may differ by continent so let's colour by continent - and fix up the axis labels etc. as we have been doing:

p2_3a <- ggplot(
data = d07,
mapping = aes(x = gdpPercap, y = lifeExp, colour = continent)
) +
geom_point() +
scale_color_brewer(palette = "Dark2") +
# we know GDP per capita is best viewed on a log scale
scale_x_continuous(trans = "log10",
breaks = c(0, 1000, 5000, 10000, 50000, 150000)) +
labs(
x = "GDP per capita",
y = "Life Expectancy",
color = "Continent",
title = "Year: 2007"
)

p2_3a

img

It looks like there is a trend here! let's plot some trend lines with geom_smooth - starting with a simple linear model.

p2_3a + geom_smooth(method = "lm", 
color = "black", fill = "grey")

img

We can even fit a separate regression line to each continent.

p2_3b = p2_3a + geom_smooth(aes(color = continent, fill = continent),
method = "lm") +
scale_fill_brewer(palette = "Dark2") +
labs(color = "Continent", fill = "Continent")
p2_3b

img

That's pretty interesting. In all the continents this data suggests a strong relationship between per-capita GDP and life expectancy across countries, with (visually at least) similar slopes.