2 thoughts on “Creepypasta – Votes vs. Rating (& learning ggplot2)”

  1. You might notice that the quadratic fit in the ggplot2 version differs from the other two. That is because you give ggplot the log-transformed data as the response variable, so the lm fit is computed on the log data rather than on the original data. To reproduce the first two examples more closely, pass the untransformed Votes and use coord_trans(), which applies the log transformation only after the fit has been computed, e.g.

    library(ggplot2)
    # Fit the quadratic on the untransformed Votes; coord_trans() log-scales the y axis afterwards.
    gplot <- ggplot(data, aes(Rating, Votes)) +
      geom_point(col=rgb(0,0,1,0.25), pch=16, cex=2) +
      geom_smooth(method="lm", formula=y~poly(x,2)) +
      labs(title="Creepypasta Stories, Votes vs. Ratings") +
      theme_bw() + coord_trans(y = "log10") +
      theme(axis.text=element_text(size=14), axis.title=element_text(size=14), plot.title=element_text(size=16, face="bold"))
    gplot

    See http://docs.ggplot2.org/current/coord_trans.html for some discussion of this.
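
To see the difference numerically, here is a minimal base-R sketch. It uses simulated stand-in data, not the Creepypasta data, and the names rating, votes, fit_log and fit_raw are made up for illustration. It fits the same quadratic once on log10(votes) and once on the raw votes, then compares the two curves on the original scale.

```r
# Simulated stand-in data; NOT the Creepypasta dataset (assumption for illustration).
set.seed(1)
rating <- runif(200, 1, 10)
votes  <- round(exp(0.5 * rating + rnorm(200, sd = 0.5))) + 1  # always >= 1, so log10 is safe

# Quadratic fit on the log-transformed response
# (the situation described above for the ggplot2 version).
fit_log <- lm(log10(votes) ~ poly(rating, 2))

# Quadratic fit on the original response
# (what the coord_trans() version draws, on a log-scaled axis).
fit_raw <- lm(votes ~ poly(rating, 2))

# Evaluate both curves on a common grid, back on the original Votes scale.
grid     <- data.frame(rating = seq(1, 10, length.out = 50))
pred_log <- 10^predict(fit_log, newdata = grid)
pred_raw <- predict(fit_raw, newdata = grid)

# Largest gap between the two fitted curves; nonzero because they are different models.
max(abs(pred_log - pred_raw))
```

The two models minimise squared error on different scales, so back-transforming the log fit does not reproduce the raw fit. Note also that a fit on the raw data can dip to zero or below, and such values cannot be drawn on a log axis.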

