HPE plus MapR: Too much Hadoop, not enough cloud

MapR gives HPE superior big data analytics technology and expertise, but not what HPE needs most


Cloud killed the fortunes of the Hadoop trinity—Cloudera, Hortonworks and MapR—and that same cloud likely won’t rain success down on Hewlett Packard Enterprise (HPE), which recently acquired the business assets of MapR.

While the deal promises to marry “MapR’s technology, intellectual property, and domain expertise in artificial intelligence and machine learning (AI/ML) and analytics data management” with HPE’s “Intelligent Data Platform capabilities,” the deal is devoid of the one ingredient that both companies need most: cloud.

The problem, in other words, isn't that MapR lacked smart people or great technology; Wikibon analyst James Kobielus insists it had plenty of both.

No, the problem is that MapR is still way too Hadoop-y and not nearly cloudy enough in a world filled with “fully integrated [cloud-first] offerings that have a lower cost of acquisition and are cheaper to scale,” as Diffblue CEO Mathew Lodge has said.

In short, MapR may expand HPE’s data assets, but it doesn’t make HPE a cloud contender.

Why cloud matters

Yes, hybrid cloud is still a thing, and will remain so for many years to come. As much as enterprises may want to steer workloads into a cloudy future, 95 per cent of IT remains firmly planted in private data centres. New workloads tend to go cloud, but there are literally decades of workloads still running on-premises.

But this hybrid world, which HPE pitches so loudly (“innovation with hybrid cloud,” “from edge to cloud,” “harness the power of data wherever it lives,” etc.), hasn’t been as big a deal in big data workloads.

Part of the reason comes down to a reliance on old-school models like Hadoop, “built to be a giant single source of data,” as noted by Amalgam Insights CEO Hyoun Park.

That’s a cumbersome model, especially in a world where big data is born in the cloud and wants to stay there, rather than being shipped to on-premises servers. Can you run Hadoop in the cloud? Of course. Companies like AWS do just that (Elastic MapReduce, anyone?).
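To make that concrete, here is a minimal sketch of what "Hadoop in the cloud" looks like on EMR, using boto3 to request a managed Hadoop/Spark cluster. The cluster name, region, log bucket, and instance sizing are illustrative placeholders, and it assumes AWS credentials and the default EMR roles are already in place.

import boto3

# Assumes AWS credentials are configured and the default EMR roles exist
# (created once with the AWS CLI command: aws emr create-default-roles).
emr = boto3.client("emr", region_name="us-east-1")  # region is a placeholder

response = emr.run_job_flow(
    Name="demo-hadoop-cluster",                    # placeholder name
    ReleaseLabel="emr-6.15.0",                     # an EMR release that bundles Hadoop
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,      # tear the cluster down when idle
    },
    LogUri="s3://my-demo-bucket/emr-logs/",        # placeholder bucket
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)

print("Cluster id:", response["JobFlowId"])

The point, as the elasticity argument below makes clear, is that a cluster like this can be created, resized, and thrown away on demand rather than bought and depreciated.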

But arguably even Hadoop in the cloud is a losing strategy for most big data workloads, because it simply doesn’t fit the streaming data world in which we live.

And then there’s the on-premises problem. As AWS data science chief Matt Wood told me, cloud elasticity is crucial to doing data science right:

Those that go out and buy expensive infrastructure find that the problem scope and domain shift really quickly. By the time they get around to answering the original question, the business has moved on. You need an environment that is flexible and allows you to quickly respond to changing big data requirements. Your resource mix is continually evolving—if you buy infrastructure, it’s almost immediately irrelevant to your business because it’s frozen in time. It’s solving a problem you may not have or care about any more.

MapR had made efforts to move beyond its on-premises Hadoop past, but arguably too little, too late.

Brother, can you spare a cloud?

Which brings us back to HPE. In 2015 the company dumped its public cloud offering, instead deciding to “double-down on our private and managed cloud capabilities.” That may have seemed acceptable back when OpenStack was still breathing, but it pigeon-holed HPE as a mostly on-premises vendor trying to partner its way into public cloud relevance. It’s not enough.

Whereas Red Hat, for example, can credibly claim to have deep assets in Kubernetes (Red Hat OpenShift) that help enterprises build for hybrid and multi-cloud scenarios, HPE doesn’t. It has tried to get there through acquisition (e.g., BlueData for containers), but it simply lacks a cohesive product set.

More worryingly, every major public cloud vendor now has a solid hybrid cloud offering, and enterprises interested in modernising will often choose to go with the cloud-first vendor that also has expertise in private data centres, rather than betting on legacy vendors with aspirations for public cloud relevance.

For Google, it’s Anthos. For Microsoft, hybrid has been central to Azure’s product offering and marketing from the beginning. And AWS, which at one time eschewed private data centres, has built out a slew of hybrid services (e.g., Snowball) and partnerships (VMware) to help enterprises have their cloud cake and eat private data centres, too.

Enter MapR, with its contrarian, proprietary approach to the open source Hadoop market. That approach won it a few key converts, but it never had a broad-based following. Good tech? Sure. Cloudy DNA and products? Nope.

In summary, while I hope the marriage of HPE and MapR will yield happy, cloudy enterprise customers, this “doubling-down” by HPE on technology assets that keep it firmly grounded on-premises doesn’t hold much promise.

Big data belongs in the cloud, and cloud isn’t something you can buy. It’s a different way of operating, a different way of thinking. HPE didn’t get that DNA with MapR.
