This past week was Big Data Week. For those of you who don’t know, it is a week of talks and events held worldwide to “unite the global data communities through series of events and meetups”.
Viafoura put on the events this year for Toronto and was kind enough to invite me to be one of the speakers, talking about data visualization and how it relates to all this “Big Data” stuff.
Paul spoke on detecting fraud online using visualization and data science techniques. Something I often think about when presenting is how to make your message clear and connect with both the least technical people in the audience (who, quite often, have attended strictly out of curiosity) and the most knowledgeable and technically minded people present.
I was really impressed with Paul’s visual explanation of the Jaccard coefficient. Not everyone understands set theory; however, almost everyone will understand a Venn diagram if you put it in front of them.
So to explain the Jaccard index as a measure of how much two sets overlap when giving a presentation, which is better? You could put the definition up on a slide:
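The slide itself is not reproduced here, but the standard definition is J(A, B) = |A ∩ B| / |A ∪ B|: the number of elements the two sets share, divided by the number of elements that appear in either one. In Venn diagram terms, that is the overlapping region divided by the whole area covered by both circles. For the more technically minded, here is a rough sketch of how that reads in code. The function name and the example account data are made up for illustration and are not taken from Paul's talk.

```python
def jaccard_index(a, b):
    """Jaccard index of two collections, treated as sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets are empty, so the ratio is undefined.
        # Returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical example: IP addresses seen for two user accounts.
account_1 = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}
account_2 = {"10.0.0.2", "10.0.0.3", "10.0.0.4"}

# 2 shared addresses out of 4 distinct addresses overall
print(jaccard_index(account_1, account_2))  # 0.5
```

A value of 0 means the two sets have nothing in common, and a value of 1 means they are identical.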

