The promise of big data has been, well, big. Machines can learn from our actions online to figure out what we like, what we’ll do next, and — most importantly to the raging capitalists among us — what we’ll buy.
And that’s really why everyone is spending so damn much energy synthesizing the piles of data created by the Web 2.0 “stream,” right? There’s so much behavioral and factual data that we had to teach machines to synthesize it. Thus, the rise of Big Data, a sector of technology that’s taken in $1.15 billion in venture capital over the last four quarters.
The category is a sort of catch-all that includes anything that chews up petabytes or exabytes of data and spits it out in the form of “actionable insights.”
But it’s missing one element that many tech companies are realizing is impossible to live without: a human touch.
This is not a comfortable reality for tech companies. They don’t like throwing people at a problem. The idea of a startup, after all, is to build a product that can scale, meaning each additional dollar earned costs less to bring in than the previous dollar. Robots make it easy to achieve that goal. A bloated headcount? Not so much.
Even so, I’ve talked to a number of startups that began building algorithmic recommendation engine-type businesses, only to scale back on that plan later when they realized people crave recommendations curated by humans. Fun Org, which launched last week, is the perfect example.
Fun Org’s founders started out with the goal of recommending events with algorithms. “It was like taking Hunch into the real world,” Trantow says. But in beta tests, they found that users responded more to the events curated by humans, so they dropped the algorithmic approach and started forging partnerships.
Kevin O’Connor built an entire business around that idea. His company, FindtheBest, powers the Web’s biggest comparison engine. He could have built an all-algorithm-driven site (his background is in digital ads as founder, CEO, and chairman of DoubleClick), but he quickly realized the limitations. Now, FindtheBest’s database of hundreds of millions of listings, which total almost a billion pieces of information, is built by a 60-person team. FindtheBest supplements its human-curated information with existing sources of data from scores of public records. But even that must be vetted by a human.
Google Squared didn’t work for that exact reason. The Labs project, which compared products based on unstructured data from across the Web, was once expected to crush up-and-coming “computational knowledge engine” Wolfram Alpha. It was shut down last year.
O’Connor says FindtheBest’s approach of having humans build its structured data comparisons was a head-scratcher at first. “People thought two years ago that we were insane, and now there is a consensus that, ‘You approached this the right way,’” he says.
Fast forward to today, and he is not alone. Twitter hires editors as “media evangelists” for its platform. Its new weekly “what you missed” email is a curated presentation of content. Within Twitter, media outlets like the BBC are finding that feeds made up of human-edited Tweets are more engaging and valuable than robo-Tweets coming directly from blog posts, and they are changing their social media strategies accordingly.
Google’s PageRank is arguably its most powerful weapon, and even though it’s not determined by a human, it is based on human connections, namely how closely a webpage is linked to other webpages. Google famously declared, “No humans were harmed or even used in the creation of this page,” on its news results pages, but recently the search engine did an about-face, offering Editor’s Picks, which are curated by, well, editors. Bing has an Editor’s Picks section too. Human curation is hot, guys. There’s even a whole conference dedicated to curators next week in New York, although I don’t know how much time that event will devote to big data.
The excitement over big data right now is often just that: excitement. Big data is not as simple as a few inputs and outputs, warns Scott Brave, co-founder and CTO of ecommerce personalization software maker Baynote. “Modeling isn’t magic, there is always a human element in there, and people often forget that,” he says. Framing data in a way that’s actually useful takes as much art as it does science.
[Illustration by Hallie Bateman]