Big Analytics Roundup (May 18, 2015)

Light news: announcements from Dato, Google, Oracle and Pentaho, plus other cool stuff.

On the PwC technology blog, Alan Morrison and Bo Parker interview Martin Van Ryswyk and Marko Rodriguez of DataStax about graph analytics.  PwC’s headline writer gets it wrong; the article is about graph engines and not graph databases.  Special-purpose graph databases, like special-purpose columnar databases, are a dead end; graph analytics will be incorporated into general-purpose tooling.  The evidence?  They’re interviewing guys from DataStax and not Neo.

In Data Science Central, “Data Science Girl” surveys top public data repositories, so we don’t have to keep using the 1998 KDD Cup data.

Adatao

Adatao CEO blogs about why he’s placing his chips on Spark.

Apache Flink

On the Flink blog, Fabian Hueske provides one more reason not to care about Flink.

Apache Geode

Failing to sell GemFire, Pivotal open-sourced it as Apache Geode.  InfoWorld reports.

Apache Spark

On the Databricks blog, Masaru Dobashi et al. describe how NTT uses Spark on thousand-node clusters for operational analytics at scale.  Nick Heudecker, call your office.

Nick Amato demonstrates how to classify customers with Spark MLlib.

Justin Kestelyn summarizes some lessons learned working with Spark.

The Spark team announces Spark Summit Europe, to be held October 27-29 in Amsterdam.

Dato

Dato announces release of GraphLab Create, which includes support for scikit-learn models, a label propagation toolkit and a number of other new features.

By the way, it appears the folks at Dato forgot to Google the name before rebranding.

Google Cloud

Google announces beta release of Bigtable, a massively scalable NoSQL database.  VentureBeat reports.

Hortonworks

Hortonworks (HDP) releases its Q1 financials.  Revenue more than doubled, while the operating loss doubled, a great example of negative operating leverage.  Good news: HDP’s variable margin on services turned positive, which means they don’t have to give away consulting services as much as they did last year.  Wall Street was pleased.

Lattice Engines

In VentureBeat, Barry Levine kills two birds with one stone, touting both the GrowthBeat Summit and Lattice Engines’ new features.  One assumes the latter sponsors the former.

MarkLogic

NoSQL vendor MarkLogic secures a generous $102 million Series F round.

Oracle Analytics

Oracle announces spatial and graph analytics for Big Data. (h/t Oliver Vagner)

Pentaho

Pentaho announces integration with Apache Spark, enabling orchestration of Spark jobs.  Coverage here, here, here, here, here, here, and here.  Reporting this story, Alex Woodie trolls another spurious Spark “concern.”

Predixion Software

Predixion’s Marcom people show they’ve heard about IoT.

Wolfram Research

In VentureBeat, Jordan Novet reviews Wolfram’s new image identification tool, which leverages deep learning.
