Big Analytics Roundup (March 14, 2016)
HPE wins the internet this week by announcing the re-re-release of Haven, this time on Azure. The other big story this week: Flink announces Release 1.0.
Third Time’s a Charm
Hewlett Packard Enterprise (HPE) announces Haven on Demand on Microsoft Azure; PR firestorm ensues. Haven is a loose bundle of software assets salvaged from the train wreck of Autonomy, Vertica, ArcSight and HP Operations Management machine learning suite, originally branded as HAVEn and announced by HP in June, 2013. Since then, the software hasn’t exactly gone viral; Haven failed to make KDnuggets’ list of the top 50 machine learning APIs last December, a list that includes the likes of Ersatz, Hutoma and Skyttle.
One possible reason for the lack of virality: although several analysts described Haven as “open source”, HP did not release the Haven source code, and did not offer the software under an open source license.
Other than those two things, it’s open source.
In 2015, HP released Haven on Helion Public Cloud, HP’s failed cloud platform.
So this latest announcement is a re-re-release of the software. On paper, the library looks like it has some valuable capabilities in text, image, video and audio analytics. The interface and documentation look a bit rough, but, after all, this is a third release.
Jim’s Latest Musings
Angus Loten of the WSJ’s CIO Journal interviews SAS CEO Jim Goodnight, who increasingly sounds like your great-uncle at Thanksgiving dinner, the one who complains about “these kids today.” Goodnight compares cloud computing to mainframe time sharing. That’s ironic, because although SAS runs in AWS, it does not offer elastic pricing, the one thing that modern cloud computing shares with timesharing.
Goodnight also pooh-poohs IoT, noting that “we don’t have any major IoT customers, and I haven’t seen a good example of IoT yet.” SAS’ Product Manager for IoT could not be reached for comment.
Meanwhile, SAS held its annual analyst conference at a posh resort in Steamboat Springs, Colorado; in his report for Ventana Research, David Menninger gushes.
Herbalife Messes Up, Blames Data Scientists
— Several items from the morning paper this week:
- Adrian Colyer explains CryptoNets, a combination of Deep Learning and homomorphic encryption. By encrypting your data before you load it into the cloud, you make it useless to a hacker.
- Adrian explains Neural Turing Machines.
- Adrian explains Memory Networks.
- Citing a paper published by Google last year, Adrian explains why using personal knowledge questions for account recovery is a really bad thing.
— Data Artisans’ Robert Metzger explains Apache Flink.
— In a video, Eric Kramer explains how to leverage patient data with Dataiku Data Science Studio.
— In InfoWorld, Serdar Yegulalp examines Flink 1.0 and swallows whole the argument that Flink’s “pure” streaming is inherently superior to Spark’s microbatching.
— On the MapR blog, Jim Scott offers a more balanced view of Flink, noting that streaming benchmarks are irrelevant unless you control for processing semantics and fault tolerance. Scott is excited about Flink’s ease of use and CEP API.
— John Leonard interviews Vincent de Lagabbe, CTO of bitcoin tracker Kaiko, who argues that Hadoop is unnecessary if you have less than a petabyte of data. Lagabbe prefers Datastax Enterprise.
— Also in InfoWorld, Martin Heller reviews Azure Machine Learning and finds it too hard for novices. I disagree. I used AML in a classroom lab, and students were up and running in minutes.
Open Source Announcements
CEO Mike Koehler demonstrates confidence in TDC’s future by selling 11,331 shares.
— Databricks announces that adtech company Sellpoints has selected the Databricks platform to deliver a predictive analytics product.