Why open source is the ‘new normal’ for big data


It’s no secret that Hadoop and Apache Spark are the hottest technologies in big data, but what’s less often remarked upon is that they’re both open source.

Mike Tuchen, a former Microsoft executive who is now CEO of big-data vendor Talend, thinks that’s no coincidence.

“We’re seeing a changing of the guard,” he said. “We expect the entire next-generation data platform will be open source.”

The platform he’s referring to is an expanded Hadoop ecosystem, in which the whole stack is open source. “It’s the new normal,” he said.

As a provider of integration technologies for that platform, Talend has placed a significant bet of its own on Hadoop, Spark, and open source in general, so Tuchen’s enthusiasm isn’t exactly surprising. Talend offers products focused on big data, cloud and application integration, among others, and all are based on open-source software.

Still, Talend’s bet seems to be paying off. The company will celebrate its 10th anniversary this year, and it claims big-name customers like GE, Citi, Lufthansa, Orange and Virgin Mobile. It’s also in the middle of a major expansion. At the end of 2015, it was selling its products in five countries; by the end of this year, it will be selling in 15, Tuchen said. Making that happen will mean hiring about 200 new people, he said, bringing the company’s total head count to about 750.

Customers appreciate how open source allows them to “try before you buy,” but they also see the open-source world as evolving more rapidly than the proprietary world because of the sharing that takes place among developers.

“The whole Hadoop ecosystem is moving faster than it could if it were just one vendor,” Tuchen said. “When you look at it that way, it’s hard to see how the world would ever change back.”

 

[Source: JavaWorld]