BigData. So what?

Tuesday, June 05, 2012

Sometimes it takes a bus trip to connect the dots. In my case today these were BigData and a Wired Magazine article.

We have all been hearing a lot about big data lately. If a vendor has little to say about Cloud, or has possibly said everything they can, then they just search and replace "Cloud" in the marketing materials with the phrase "Big Data". We are not yet at the stage where McDonald's has decided to replace the Big Mac with the BigData burger, so the consumer world is safe for the moment, but most CIOs are probably getting their in-trays filled with promotions and case studies.

Whilst I get big data and see its value, I have personally struggled with the realities of execution. We have been reading about the increasing demand for developers skilled in Hadoop, and I have a colleague who is a CCIE, got into Cloud and is now chasing the Hadoop angle. But to me BigData itself brought no real shift in the ability to execute here. It might be cheaper and easier to store and process big data these days, but the insights have always been a human effort. The human effort to develop the analytics takes intellect and scale. There was the rub: not all humans have the same intellect, and humans don't scale in the specialist areas. I have a friend who works for Oracle in demand planning. He is real smart at building data mining for global companies that need to forecast all sorts of whacky things. Yet he is very specialised and uses some real high-end software. The gap between those people with big data, and those who can do something with it, has always irked me.

So today I am on the bus reading Wired on my iPad, as you do, and read an article, "Can an Algorithm Write a Better News Story than a Human Reporter?". Have a scan through the article, but the premise is that, given large amounts of statistical data, companies such as Narrative Science can turn it into a news story that is very insightful. They started out doing this with children's baseball games. Feed in the play-by-play data and it generates a story such as:
Friona fell 10-8 to Boys Ranch in five innings on Monday at Friona despite racking up seven hits and eight runs. Friona was led by a flawless day at the dish by Hunter Sundre, who went 2-2 against Boys Ranch pitching. Sundre singled in the third inning and tripled in the fourth inning … Friona piled up the steals, swiping eight bags in all …
Baseball, financial markets, they can do some amazing stuff. Many companies are actually using machines to find insights and produce prose. As the article puts it:
Once Narrative Science had mastered the art of telling sports and finance stories, the company realized that it could produce much more than journalism. Indeed, anyone who needed to translate and explain large sets of data could benefit from its services. Requests poured in from people who were buried in spreadsheets and charts. It turned out that those people would pay to convert all that confusing information into a couple of readable paragraphs that hit the key points.
And the subject matter keeps getting more diverse. Narrative Science was hired by a fast-food company to write a monthly report for its franchise operators that analyzes sales figures, compares them to regional peers, and suggests particular menu items to push. What’s more, the low cost of transforming data into stories makes it practical to write even for an audience of one. Narrative Science is looking into producing personalized 401(k) financial reports and synopses of World of Warcraft sessions—players could get a recap after a big raid that would read as if an embedded journalist had accompanied their guild. “The Internet generates more numbers than anything that we’ve ever seen. And this is a company that turns numbers into words,” says former DoubleClick CEO David Rosenblatt, who sits on Narrative Science’s board. “Narrative Science needs to exist. The journalism might be only the sizzle—the steak might be management reports.” 
This is where the dots connected and I became a lot more relaxed about big data. Here we have the birth of what can start to give reality to big data capture and processing. Whether you view it as AI, clever algorithms or plain ole automation does not matter. Looking forward you can see how companies can cheaply and easily generate business insights from the data they collect.

Until these analytic services mature you might want to brush up on your Hadoop skills, but in the future you might just start getting more emails from an automated account, like the following.

"Rodos, yesterday there was a flood of traffic on the fibre channel network that was generated from workloads in the Melbourne IaaS availability zone. Looks like this was mostly from a specific customer and I also picked up that their Unified Communications workloads in the UCaaS nodes in Singapore peaked. The company in question just listed on the stock exchange in Hong Kong and forecast interest in their services, if it continues at this rate, will cause increased workload that will take the Melbourne availability zone B to 90% capacity. Last time zone B hit 88% capacity (Sept 2014) SLAs for 2 customers were broken. Just a heads up, regards Siri".

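For the curious, here is a toy sketch of the "numbers into words" idea behind an email like the one above. It is nowhere near what Narrative Science does, and everything in it is an assumption made up for illustration: the function name `capacity_note`, the 88% warning level and the 75% current figure are all hypothetical. It simply slots a few capacity figures into a sentence and picks the wording based on a threshold.

```python
# Toy data-to-text sketch. All names, numbers and thresholds are hypothetical.

def capacity_note(recipient, zone, current_pct, projected_pct, warn_pct=88):
    """Turn a handful of capacity numbers into a short plain-English note."""
    # Choose the closing remark by comparing the forecast to the warning level.
    if projected_pct >= warn_pct:
        verdict = (f"that is at or above the {warn_pct}% level worth watching, "
                   "so it may be worth a look")
    else:
        verdict = f"that stays under the {warn_pct}% level worth watching"
    return (f"{recipient}, {zone} is at {current_pct}% capacity and, "
            f"if current demand continues, is forecast to reach {projected_pct}%; "
            f"{verdict}.")


# Illustrative figures only.
note = capacity_note("Rodos", "Melbourne zone B", 75, 90)
print(note)
```

The real trick, of course, is deciding which numbers matter in the first place; the sentence building is the easy part.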
