Big Analytics Roundup (March 9, 2015)
Here’s a roundup of interesting Big Analytics news and analysis from the past week. Featured this week: Hortonworks, Alpine, Spark and H2O.
- Matt Asay, writing in InfoWorld, deconstructs Hortonworks’ earnings fiasco, and with it the “100% open source” business model.
Alpine Data Labs
- VentureBeat reports that Alpine Data Labs claims 10X growth in user count and billings year over year.
- MarketWired reports the same story.
- ITBusinessNet too.
There is no supporting press release from Alpine Data Labs. The VentureBeat story includes the nugget that Alpine currently has “more than 60” customers; an insider tells me that the number is closer to 75, roughly twice as many as last year. Alpine has changed its selling model, hiring its own sales force instead of selling through EMC and Pivotal. This also means that Alpine has changed its messaging from “we run on Greenplum and PostgreSQL, but mostly on Greenplum” to “we run on anything.” This is an aspiration, to be sure, but a good one.
Alpine has also changed its pricing model from a perpetual server-based model to a user-based subscription model.
Separately, Ventana Research publishes a positive review of Alpine Chorus 5.0.
- Jonathan Buckley of Qubole argues that the three open source projects that transformed Hadoop are Hive, Spark and Presto. It’s an odd choice. Hive is certainly a key project and Spark is red hot; Presto, not so much.
- Data prep engine vendor Paxata announces a new release that runs on Spark and releases a benchmark report showing significant performance improvements.
- Databricks announces that B2B vendor Radius Intelligence has selected Databricks Cloud as its preferred platform, and publishes a case study.
- Forbes profiles Databricks CEO Ion Stoica.
- Ian Lumb offers eight reasons why Spark is hot.
- Databricks publishes a SlideShare deck about Spark DataFrames, which will be available in Spark 1.3 later this month.
- From the Cloudera blog, an excellent post showing how to build an application for financial markets risk calculations in Spark.
- In an interview with KDnuggets, Ted Dunning touts Mahout and H2O over Spark.
- H2O.ai announces Cloudera certification for its Sparkling Water interface to Spark.
CMSWire rehashes the Gartner Magic Quadrant without adding value. The author notes breathlessly that “many KNIME enthusiasts are data miners”, and “on the downside, (RapidMiner’s) user base is mostly data scientists”, as if these points were news, and as if there were something extraordinary about data miners and data scientists using data mining and data science tools.