Big Analytics Roundup (March 30, 2015)

Lots of Spark news this week, following last week’s Sparkalanche, plus some other non-Spark news just to show that Big Analytics isn’t entirely about Spark.

Alteryx

  • In IntelligentHQ, Maria Fonseca interviews Alteryx COO George Mathew, who argues that analytics is for people. Left unanswered: who else it could be for.

Analytic Startups

  • Analytics vendor Ayasdi lands a $55 million “C” round.
  • Localytics, which specializes in analytics for mobile and web apps, secures a $35 million “D” round.

Apache Drill

  • MicroStrategy announces certification of Apache Drill with MicroStrategy Analytics Enterprise Platform.

Apache Spark

Analysis

  • IBM Big Data “evangelist” James Kobielus confirms that IBM has no idea what to do with Spark.
  • In TechRepublic, Matt Asay argues that Hadoop won’t disappear just because it’s slow, knocking over several straw men in the process. On readwrite, he makes similar points, and on InfoWorld, he goes for the hat trick.
  • In InfoWorld, Platfora’s Peter Schlampp offers five reasons why Spark is the next big thing.

Applications

  • On the Cloudera blog, Sam Shuster of Edmunds.com describes a dashboard built with Spark Streaming, SparkOnHBase and Morphlines.
  • In InfoQ, Srini Penchikala explains why Pinterest is using Spark Streaming, Kafka and MemSQL for a real-time application.

Data Science

  • On the Databricks blog, Joseph Bradley writes an excellent article on Topic Modeling with Spark’s new Latent Dirichlet Allocation capability.

Developer

  • On the Databricks blog, Michael Armbrust describes new Spark SQL features in Spark 1.3.
  • On Slideshare, Vida Ha and Holden Karau share tips for writing better Spark programs; video here.

Deep Learning

  • Tomasz Malisiewicz of Vision.ai blogs on Deep Learning versus Machine Learning versus Pattern Recognition.

RapidMiner

  • RapidMiner publishes a white paper on code-free analytics in Hadoop, and another on Hadoop security.

Big Analytics Roundup (March 2, 2015)

Here is a roundup of some recent Big Analytics news and analysis.

General

  • SiliconAngle covers the Big Data money trail.

Apache Spark

  • Curt Monash writes about Databricks and Spark on his DBMS2 blog.
  • On the Databricks blog, Dave Wang summarizes Spark highlights from Strata + Hadoop World.
  • In this post, Hammer Lab describes how to monitor Spark with Graphite and Grafana.
  • Cloudera announces Hive on Spark beta.
  • InfoWorld covers Spark’s planned support for R in Release 1.3.
  • Qubole announces Spark as a Service.

Dato/GraphLab

  • Dato announces a new version of GraphLab Create.

H2O

  • From Strata + Hadoop World, Prithvi Prabhu talks about using H2O.
  • Also from Strata, here is Cliff Click’s presentation on H2O, Spark and Python.
  • On the H2O blog, Arno Candel publishes a performance tuning guide for H2O Deep Learning.