The Baseline

A big part of statistics is making comparisons and, perhaps more importantly, figuring out what to compare things to. Perspective changes with the baseline.

When you look at data, it’s important to consider this baseline, the imaginary place or point you want to compare to. The right baseline depends on the dataset and its context, but let’s look at some practical examples in R.

First you have to load the data, which is in CSV format. Use read.csv() to bring it in. We’re going to look at the cost of gas and the Consumer Price Index, as published by the Bureau of Labor Statistics.

# Load the data.
cpi <- read.csv("data/cpi-monthly-us.csv", stringsAsFactors=FALSE)
gas <- read.csv("data/gas-prices-monthly.csv", stringsAsFactors=FALSE)

When you compare historical prices, you have to account for inflation. The baseline is not only how much gas costs now, but how much a dollar is worth. A dollar today isn’t worth the same as a dollar thirty years ago.

This is where the Consumer Price Index comes into play. It represents how much households have to pay for goods and services. Divide the CPI today by the CPI from a different time and you get a multiplication factor to estimate the adjusted price per gallon of gas. In other words, you want to know how much a gallon of gas from a past year would cost in today’s dollars.
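
To make the arithmetic concrete, here is a minimal sketch of the adjustment on its own. The function name adjust_price and the numbers are made up for illustration. They are not real BLS figures.

```r
# Hypothetical helper: restate a past price in current dollars.
# price    : unadjusted price at the earlier date
# cpi_then : CPI at the earlier date
# cpi_now  : CPI for the most recent period
adjust_price <- function(price, cpi_then, cpi_now) {
    price * (cpi_now / cpi_then)
}

# Made-up values, not real data: a $1.00 gallon when the CPI was 100,
# restated for a period when the CPI is 200. The factor is 200 / 100 = 2.
adjust_price(1.00, cpi_then=100, cpi_now=200)  # 2
```

When the two CPI values are equal, the factor is 1 and the price comes back unchanged, which is a quick sanity check for the most recent month.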

The code below computes the inflation-adjusted price for each month.

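# Adjust gas price for inflation
# Match gas prices to CPI values by year and month.
gas.cpi.merge <- merge(gas, cpi, by=c("Year", "Period"))
# Drop the third and fifth columns, which are not needed here.
gas.cpi <- gas.cpi.merge[,-c(3,5)]
colnames(gas.cpi) <- c("year", "month", "gasprice.unadj", "cpi")
# Treat the CPI in the last row as the current value.
currCPI <- gas.cpi[dim(gas.cpi)[1], "cpi"]
# Factor that converts each month's dollars into current dollars.
gas.cpi$cpiFactor <- currCPI / gas.cpi$cpi
gas.cpi$gasprice.adj <- gas.cpi$gasprice.unadj * gas.cpi$cpiFactor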
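
The same steps can be tried on a tiny, made-up table to see what each one produces. This is only a sketch: the values are invented, and the Value column name is an assumption about how the files are laid out, since the column positions dropped above depend on the actual CSVs.

```r
# Made-up monthly data, not real BLS figures.
gas.toy <- data.frame(Year=c(2013, 2013, 2013),
                      Period=c("M01", "M02", "M03"),
                      Value=c(3.00, 3.30, 3.60),
                      stringsAsFactors=FALSE)
cpi.toy <- data.frame(Year=c(2013, 2013, 2013),
                      Period=c("M01", "M02", "M03"),
                      Value=c(200, 220, 240),
                      stringsAsFactors=FALSE)

# Merge on year and month. The two Value columns clash,
# so merge() renames them with the suffixes given here.
toy <- merge(gas.toy, cpi.toy, by=c("Year", "Period"),
             suffixes=c(".gas", ".cpi"))

# Use the last row as the "current" period, as in the code above.
toy.currCPI <- toy[nrow(toy), "Value.cpi"]
toy$cpiFactor <- toy.currCPI / toy$Value.cpi
toy$gasprice.adj <- toy$Value.gas * toy$cpiFactor
toy
```

In this toy case the price rises at the same rate as the index, so every adjusted price comes out at about 3.60. Selecting columns by name rather than by position also keeps the code working if the column order of the files changes.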

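Now you can make the graphs with adjusted prices. Here the baseline is the most recent adjusted price, and each bar shows how far a month sits above or below it.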

curr <- gas.cpi$gasprice.adj[dim(gas.cpi)[1]]
gasDiff.adj <- gas.cpi$gasprice.adj - curr
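# Bar colors by sign; the original values are missing, so these are placeholders.
barCols.diff.adj <- sapply(gasDiff.adj,
    function(x) {
        if (x < 0) {
            return("gray")
        } else {
            return("black")
        }
    })
barplot(gasDiff.adj, border=NA, space=0, las=1, col=barCols.diff.adj, main="Adjusted dollar difference from September 2013")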
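
As an alternative to the sapply() call, the colors can be assigned in one vectorized step. This is a minimal sketch on made-up prices. The variable names and both colors are placeholders chosen for illustration, not values from the original.

```r
# Made-up adjusted prices, not real data; the last value is the baseline.
prices.toy <- c(3.10, 3.80, 3.45, 3.60)
baseline <- prices.toy[length(prices.toy)]
diff.toy <- prices.toy - baseline

# ifelse() is vectorized, so it picks a color for every bar in one call.
# Months below the baseline get one color, all others get the second.
cols.toy <- ifelse(diff.toy < 0, "gray", "black")
cols.toy
```

The resulting character vector can be passed to barplot() through the col argument in the same way as barCols.diff.adj.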

