I just listened to an interesting IBM Google Hangout about big data called Visions of Big Data’s Future. You can watch it here. There were some great experts on the line, including James Kobielus (IBM), Thomas Deutsch (IBM), and Edd Dumbill (Silicon Valley Data Science).
The moderator, David Pittman, asked a fantastic question: “What’s taking longer than you expect in big data?” It brought me back to 1992 (OK, I’m dating myself), when I worked at AT&T Bell Laboratories. At that time, I was working in what might today be called an analytics Center of Excellence. The group was composed of all kinds of quantitative scientists (economists, statisticians, physicists) as well as computer scientists and other IT-like people. I think the group was called something like the Marketing Models, Systems, and Analysis department.
I had been working with members of Bell Labs Research to take some of the machine learning algorithms they were developing and apply them to our marketing data for analytics like churn analysis. At that time, I proposed the formation of a group that would consist of market analysts and developers, working together with researchers and some computer scientists. The idea was to provide continuous innovation around analysis. I found the proposal today (I’m still sneezing from the dust). Here is a sentence from it:
Managing and analyzing large amounts of data? At that point, we were even thinking about call detail records. It goes on to say, “Specifically the group will utilize two software technologies that will help to extract knowledge from databases: data mining and data archeology.” The data archeology piece referred to:
This exploration of the data is similar to what is termed discovery today. Here’s a link to the paper that came out of this work. Interestingly, around this time I also remember going to talk to some people who were developing NLP algorithms for analyzing text. I remember thinking that the “why” behind customer churn could be found in those call center notes.
I thought about this when I heard the moderator’s question, not because the group I was proposing would have been ahead of its time (let’s face it, AT&T was way ahead of its time with its Center of Excellence in analysis in the first place), but because it’s taken so long to get from there to here, and we’re not even here or there yet.