At the moment it seems that every other day brings the announcement of a new “Big Data” start-up, partnership, conference or report. Just about every man and his dog has made some form of press release related to “Big Data” recently. Every week I stumble across some “Big Data” conference or event that I’ve just missed out on.
If you’re not familiar with the term “hype cycle”, in a nutshell it is a curve that describes the level of over-enthusiasm and disappointment about any particular technology over a period of time.
This curve is characterised by a peak of ‘inflated expectations’ followed by the ‘trough of disillusionment’. If the technology really proves its worth, it may eventually recover some of its credibility, but not to its previously over-hyped levels. Many technologies never make it back up from the ‘trough’ stage, or simply fizzle out of existence along the way.
You’ve likely witnessed this phenomenon already: can you remember the hype around the virtual world ‘Second Life’?
I would argue that “Big Data” as we know it is rapidly approaching – if not already at – the top of the hype cycle curve.
Early start-ups like Cloudera are maturing and have announced new rounds of funding. Goliath tech companies have largely been taken by surprise and have responded by forging tactical partnerships and acquisitions rather than be left behind. Most notable of all, grumbles of ‘disappointment’ and ‘disillusionment’ from traditional enterprises have begun as they struggle to embrace new ways of working, or simply have bad experiences with technology and/or people who are supposedly ‘expert’ in the field.
The problem is not necessarily the technology: the concept of Big Data itself really democratises data processing capabilities that were previously only available from large High Performance Computing facilities. The majority of the problem is the FUD and confusion that have been churned up by mass media and some companies.
One of the biggest criticisms I’ve experienced first hand is “the Valley blinkers”: just like the horse in the picture, *a lot* of the Big Data start-ups have trouble seeing anything that’s not based in Silicon Valley and/or doesn’t end with .COM.
As you can imagine, a large UK-based FTSE100 company with billions in revenue and millions of customers sometimes has trouble keeping the attention of Valley-fresh Big Data hipsters. This is largely the same across Europe, with the addition of a language barrier.
“Big Data” isn’t the hard bit. Thanks to Apache Bigtop (and similar) distributions, the tools are easy to set up, and the proprietary Cloudera Manager software makes it (almost) a one-click installation. The hard bit is *actually* solving the business needs, and to do that you need skilled people.
Finding skilled people is really quite difficult (I know, I have recruited some). While Cloudera are helping by churning out over a thousand certified individuals a month, it really comes back to the age-old problem of finding the *right* people. Anyone with basic Java skills can go through the training material, become certified and land themselves a juicy $800/day contracting opportunity at a large enterprise keen to glean the most from this “Big Data” opportunity they’re investing in.
This is where the disillusionment starts to creep in: you can have a room full of expensive ‘certified’ contractors and go nowhere fast. The slow pace of tangible results breeds concern about the worth of the investment and starts to discredit the value of these certifications. I know of a couple of cases here in London where this has already happened; the build-our-own initiative has been canned and they’ve just splurged cash on Oracle Exalytics instead.
As I alluded to earlier, your traditional enterprise is used to buying ‘shrink wrapped’ products and services from large technology companies. Chief financial executives have been taught that technology is expensive but generally delivers the results you expect. The concept of “embracing Big Data” is decidedly different and could be mistaken for a case of “technology for technology’s sake”, otherwise known as a “tech-gasm”, and a waste of money with little to no return.
However, on the face of it, most companies have *identical* business problems to solve. Most of them have customers of one sort or another, most of them have web sites, and most of them want to be more intelligent about dealing with their customers and predicting trends in behaviour, to identify both fraud and customers that are likely to leave or cancel their services.
If the business problems are mostly the same, why oh why are teams of expensive contractors sitting around and struggling with basic Big Data tools like Pig, Hive and MapReduce? It is almost as ludicrous as the idea of companies hiring rooms of developers to build their own bespoke Customer Relationship Management (CRM) systems.
The reason is simple: the tools needed to solve these business problems don’t exist yet.
In amongst all the hype and the noise from the “Big Data Bandwagon”, I’d argue that only a handful of start-ups and companies have grasped this and are doing something about it. Naturally I reckon the “Data Science” start-up TUMRA is a perfect example of this (disclaimer: I am the CTO and co-founder of TUMRA).
The principal thing that will lift “Big Data” from the “trough of disillusionment” is the start-ups and companies that are developing the algorithms, tools and technology that actually solve the business needs. The fact that they use some of the Big Data ecosystem tools under the hood is really an aside.
Thanks for reading! Feel free to engage in discussion in the comments below or on Twitter (@cotdp).
If some of the points raised in this post hit a little close to home and you have rooms full of Big Data Hipsters… it’s not too late. Please feel free to contact us at TUMRA.