Big Analytics Roundup (May 18, 2015)

Light news: announcements from Dato, Google, Oracle and Pentaho, plus other cool stuff.
On the PwC technology blog, Alan Morrison and Bo Parker interview Martin Van Ryswyk and Marko Rodriguez of DataStax about graph analytics. PwC’s headline writer gets it wrong; the article is about graph engines, not graph databases. Special-purpose graph databases, like special-purpose columnar databases, are a dead end; graph analytics will be incorporated into general-purpose tooling. The evidence? They’re interviewing guys from DataStax and not Neo.
In Data Science Central, “Data Science Girl” surveys top public data repositories, so we don’t have to keep using the 1998 KDD Cup data.
Adatao
Adatao CEO blogs about why he’s placing his chips on Spark.
Apache Flink
On the Flink blog, Fabian Hueske provides one more reason not to care about Flink.
Apache Geode
Failing to sell GemFire, Pivotal open-sourced it as Apache Geode. InfoWorld reports.
Apache Spark
On the Databricks blog, Masaru Dobashi et al. describe how NTT uses Spark on thousand-node clusters for operational analytics at scale. Nick Heudecker, call your office.
Nick Amato demonstrates how to classify customers with Spark MLlib.
Justin Kestelyn summarizes some lessons learned working with Spark.
The Spark team announces Spark Summit Europe, to be held October 27-29 in Amsterdam.
Dato
Dato announces the release of GraphLab Create, which includes support for scikit-learn models, a label propagation toolkit and a number of other new features.
By the way, it appears the folks at Dato forgot to Google the name before rebranding.
Google Cloud
Google announces beta release of Bigtable, a massively scalable NoSQL database. VentureBeat reports.
Hortonworks
HDP releases its Q1 financials. Revenue more than doubled, while the operating loss doubled, a great example of negative operating leverage. Good news: HDP’s variable margin on services turned positive, which means they don’t have to give away consulting services as much as they did last year. Wall Street was pleased.
Lattice Engines
In VentureBeat, Barry Levine kills two birds with one stone, touting both the GrowthBeat Summit and Lattice Engines’ new features. One assumes the latter sponsors the former.
MarkLogic
NoSQL vendor MarkLogic secures a generous $102 million Series F round.
Oracle Analytics
Oracle announces spatial and graph analytics for Big Data. (h/t Oliver Vagner)
Pentaho
Pentaho announces integration with Apache Spark, enabling orchestration of Spark jobs. Coverage appeared in at least seven outlets. Reporting this story, Alex Woodie trolls another spurious Spark “concern.”
Predixion Software
Predixion’s Marcom people show they’ve heard about IoT.
Wolfram Research
In VentureBeat, Jordan Novet reviews Wolfram’s new image identification tool, which leverages deep learning.