Are We Solving The Wrong Problems With Machine Learning?

Let’s talk about corn.

Corn and how it gets from growing in fields onto your table.

Below is a video of a corn harvesting machine:

And here is a video of people gathering corn:

So, I hear you say, what does all this have to do with machine learning?

A lot, as it so happens.

Training an RNN on the Archer Scripts

Introduction

So all the hype these days is around “AI”, as opposed to “machine learning” (though I’ve yet to hear an exact distinction between the two), and one of the tools that seems to get talked about most is Google’s TensorFlow.
I wanted to play around with TensorFlow and RNNs a little, since they’re not the type of machine learning I’m most familiar with, and to see what kind of outputs I could come up with for a low investment in time.

Background

A little digging and I came across this tutorial, which is a pretty good brief intro to RNNs; it uses Keras and works character by character.
This in turn led me to word-rnn-tensorflow, which, expanding on the work of others, uses a word-based model (instead of a character-based one).
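
To make the character-versus-word distinction concrete, here is a minimal sketch of how the same text turns into two different training sequences. It is not taken from the tutorial or from word-rnn-tensorflow; the helper names, the regular expression, and the sample sentence are all made up for illustration.

```python
import re


def char_tokens(text):
    """Character-level units: every character, including spaces."""
    return list(text)


def word_tokens(text):
    """Word-level units: words and punctuation marks as separate tokens.

    This regex is just one simple choice; real projects tokenize in many ways.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def build_vocab(tokens):
    """Map each distinct token to an integer id, in sorted order."""
    return {tok: i for i, tok in enumerate(sorted(set(tokens)))}


# Made-up sample text, not from any real script.
sample = "the spy ordered a drink and the spy left"

chars = char_tokens(sample)
words = word_tokens(sample)

print(len(chars), "character tokens,", len(build_vocab(chars)), "distinct")
print(len(words), "word tokens,", len(build_vocab(words)), "distinct")
```

A character model sees long sequences drawn from a tiny vocabulary, while a word model sees short sequences drawn from a vocabulary that grows with the size of the corpus.
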
I wasn’t about to spend my whole weekend rebuilding RNNs from scratch – no sense reinventing the wheel – so I just thought it’d be interesting to play around a little with this one, and perhaps give it a more interesting dataset. Shakespeare is OK, but why not something a little more culturally relevant… like, I dunno, say the scripts from a certain cartoon featuring a dysfunctional foul-mouthed spy agency?

Toronto Data Science Meetup – Machine Learning for Humans

A little while ago I spoke again at the Toronto Data Science Group, and gave a presentation I called “Machine Learning for Humans”:

I had originally intended to cover a wide variety of general “gotchas” around the practical applications of machine learning; however, with only half an hour there’s really only so much you can cover.

The talk ended up being more of an overview of binary classification, as well as some anecdotes around mistakes in using machine learning I’ve actually seen in the field, including:

  • Not doing any model evaluation at all
  • Doing model evaluation but without cross-validation
  • Not knowing what the cold start problem is and how to avoid it with a recommender system
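
For readers who have not met cross-validation, the idea behind the second point can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not code from the talk: the threshold classifier, the helper names, and the data are all invented, and in practice you would more likely reach for a library routine.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, fit, predict, k=5, seed=0):
    """Train on k-1 folds, score accuracy on the held-out fold, k times.

    Assumes k <= len(ys), so that no fold is empty.
    """
    scores = []
    for held_out in k_fold_indices(len(ys), k, seed):
        held = set(held_out)
        train = [i for i in range(len(ys)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        correct = sum(predict(model, xs[i]) == ys[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Hypothetical toy classifier: threshold halfway between the two class means.
def fit_threshold(xs, ys):
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2


def predict_threshold(threshold, x):
    return int(x > threshold)


# Made-up, well-separated one-dimensional data for a binary problem.
xs = [0.1 * i for i in range(10)] + [5 + 0.1 * i for i in range(10)]
ys = [0] * 10 + [1] * 10

scores = cross_validate(xs, ys, fit_threshold, predict_threshold, k=5)
print("per-fold accuracy:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

Every sample is held out exactly once, so the mean of the fold scores says more about performance on unseen data than a single train/test split does.
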

All in all, it was received very well despite being a review for a lot of people in the room. As usual, I took away some lessons around presenting:

  • Always lowball for time (the presentation was rushed despite my blistering pace)
  • Never try to use fancy fonts in PowerPoint and expect them to carry over – it never works (paste the text in as an image instead once you’ve got the final presentation)

Dan Thierl of Rubikloud gave a really informative and candid talk about what product management at a data science startup can look like. In particular, I was struck by his honesty around the challenges faced (both from a technical standpoint and with clients), how quickly you have to move or pivot, and how some clients are just looking for simple solutions (“Can you help us dashboard?”) and are perhaps not at a level of maturity to want or fully utilize a data science solution.

Overall, it was another great meetup that prompted some really interesting discussion afterward. I look forward to the next one. I’ve added the presentation to the speaking section.