Today, I participated in the "Innovations and Big Data Workshop" in San Jose, CA. It was organized by Diego Klabjan, who is heading up Northwestern University's Master of Science in Analytics program. Thanks also to Terry Cryan, Director of Meetings at INFORMS.
One of the topics that each speaker was asked to address was, "What do you mean by 'Big Data'?" Here are some of the answers offered up by various presenters:
- Not just about size, but also about emerging data types (that are not relational database friendly, but rather semi-structured at best).
- Must include data that is heterogeneous, over time.
- 3Vs: Volume, Velocity, Variety (from an IDC White Paper)
- Whatever causes you to work outside your comfort zone (in terms of volume, complexity, velocity, and technique).
- A lot more than just having a lot of data.
I like the "outside your comfort zone" definition. It's kind of like the old ad about reruns on NBC… "if you haven't seen it, it's new to you."
While I'm on the topic, Hilary Mason, Chief Scientist at bit.ly, had an interesting quote about Big Data the other week, during a "Droptalks" presentation at Dropbox. She said,
"We work in an area that is nominally called Big Data. I really hate this phrase; it's just data. I dislike the phrase because I spend most of my time figuring out ways to take the Big Data and make it small enough to do something interesting with it."
You can watch the video, posted on YouTube, below. (The quote is at about the 7-minute mark.) She also had an interesting definition of Analytics, but I'll save that for another post.