Introduction
![]() |
Your average economists. |
Background
# Download data
for i in $(seq 1990 2014)
do wget http://www.energy.gov.on.ca/fuelupload/ONTREG$i.csv
done
# Retain the header
head -n 2 ONTREG1990.csv | sed 1d > ONTREG_merged.csv
# Loop over the files and use sed to extract the relevant lines
for i in $(seq 1990 2014)
do
tail -n 15 ONTREG$i.csv | sed 13,15d | sed 's/./-01-'$i',/4' >> ONTREG_merged.csv
done
The WTI data I got from The Federal Reserve Bank of St. Louis, and the forecasts from the US Energy Information Administration.
Analysis
Wow, that really is a cliff, isn’t it? The average Ontario Gas Price hasn’t been as low as it was in Dec 2014 since the fall of 2010 (Sep 2010, $1.01) and the WTI not since about May of that year.
Now to the fun stuff. Let’s read the data into R and do some hacky time series analysis.
library(ts)
library(forecast)
# Read in the Ontario Gas Price Data
data <- read.csv(file="ONTREG_merged.csv", header=T, sep=",")
# Read in the WTI oil price data
WTI_data <- read.csv(file='DCOILWTICO.csv',header=F, col.names=c("Date", "Value"))
# Create a time series object for the WTI and Ontario Avg
WTI <- ts(data=WTI_data$Value, frequency=12, start=c(1990,1), end=c(2014,12))
ON <- ts(data=data$ON.Avg, frequency=12, start=c(1990,1), end=c(2014,12))
# Plot and compare
combined <- cbind(WTI, ON)
plot(combined)
We get a comparison plot of the data:
And we can look at the Ontario Gas Price as a function of the WTI. Linear is on the left, log-log on the right.
Next we build lm model objects and look at the diagnostics. I’ll spare the details, but I feel better about the log-log, so we’ll go with that.
# Create linear models (normal and log-log)
l1 <- lm(ON ~ WTI, data=combined)
l2 <- lm(log(ON) ~ log(WTI), data=combined)
# Compare relative performance
summary(l1)
summary(l2)
plot(l1)
plot(l2)
# Plot
plot(ON ~ WTI, data=combined, pch=16, cex=0.3)
abline(l1)
plot(log(ON) ~ log(WTI), data=combined, pch=16, cex=0.3)
abline(l2)
# Read in WTI forecast data
WTI_forecast <- read.csv(file="WTI_forecast.csv", header=F, sep=",", col.names=c("Date", "Value"))
# Forecast Ontario Gas Price
fit <- forecast(l2, newdata=data.frame(WTI=WTI_forecast$Value))
# Unlog
fit$mean <- exp(fit$mean)
fit$lower <- exp(fit$lower)
fit$upper <- exp(fit$upper)
fit$x <- exp(fit$x)
# Plot
plot(fit, ylab='Ontario Average Gas Price (cents/L)')
And there you have it! Putting the forecast data (blue line) and the WTI forecast back into our original graph, we can compare the two:
It’s that easy, right? That’s all there is to it?
Conclusion
In these cases, testing the stationary assumption and then massaging of the data (differencing & deseasonalization) is usually required beforehand to handle the problem, or other models which do not have as strict assumptions as linear regression are more appropriate. For more on this, see the excellent book “Forecasting: Principles and Practice” by Rob J. Hyndman and George Athanasopoulos.
References & Resources
Fuel Price Data: Regular Unleaded Gasoline 2014 (Ontario Ministry of Energy):
Crude Oil Prices: West Texas Intermediate (WTI) (Federal Reserve Bank of St. Louis)
Short-Term Energy Outlook (U.S. Energy Information Administration)