Big Analytics Roundup (March 23, 2015)

This week, Spark Summit East produced a deluge of news and analysis on Apache Spark and Databricks.  Also in the news: a couple of ventures landed funding, SAP released software and SAS soft-launched something new for SAS Visual Analytics.

Analytic Startups

Venture Capital Dispatch on WSJ.D reports that Andreessen Horowitz has invested $7.5 million in AMPLab spinout Tachyon Nexus.  Tachyon Nexus supports the eponymous Tachyon project, a memory-centric storage layer that runs underneath Apache Spark or independently.

Social media mining venture Dataminr pulls $130 million in “D” round financing, demonstrating that the real money in analytics is in applications, not algorithms.

Apache Flink

On the Flink project blog, Fabian Hueske posts an excellent article that describes how joins work in Flink.

Apache Spark

ADTMag rehashes the tired debate about whether Spark and Hadoop are “friends” or “foes”.  Sounds like teens whispering in the hallways of Silicon Valley High.  Spark works with HDFS, and it works with other datastores; it all depends on your use case.  If that means a little less buzz for Hadoop purists, get over it.

To that point, Matt Kalan explains how to use Spark with MongoDB on the Databricks blog.

A paper published by a team at Berkeley summarizes results from Spark benchmark testing and draws surprising conclusions.

In other commentary about Spark:

  • TechCrunch reports on the growth of Spark.
  • TechRepublic wonders if anything can dim Spark.
  • InfoWorld lists five reasons to use Spark for Big Data.

In VentureBeat, Sharmila Mulligan relates how ClearStory Data’s big bet on Spark paid off without explaining the nature of the payoff.  ClearStory has a nice product, but it seems a bit too early for a victory lap.

On the Spark blog, Justin Kestelyn describes exactly-once Spark Streaming with Apache Kafka, a new feature in Spark 1.3.

Databricks

Doug Henschen chides Ion Stoica for plugging Databricks Cloud at Spark Summit East, hinting darkly that some Big Data vendors are threatened by Spark and trying to plant FUD about it.  Vendors planting FUD about competitors that threaten them: who knew that people did such things?  It’s not clear what revenue model Henschen thinks Databricks should pursue; as Hortonworks’ numbers show, “contributing to open source” alone is not a viable business model.  If those Big Data vendors are unhappy that Databricks Cloud competes with what they offer, there is nothing to stop them from embracing Spark and standing up their own cloud service.

In other news:

  • On the Databricks blog, the folks from Uncharted Software describe PanTera, cool visualization software that runs in Databricks Cloud.
  • Rob Marvin of SD Times rounds up new product announcements from Spark Summit East.
  • In PCWorld, Joab Jackson touts the benefits of Databricks Cloud.
  • ConsumerElectronicsNet recaps Databricks’ announcement of the Jobs feature for Databricks Cloud, plus other news from Spark Summit East.
  • On ZDNet, Toby Wolpe reviews the new Jobs feature for production workloads in Databricks Cloud.
  • On the Databricks blog, Abi Mehta announces that Tresata’s TEAK application for AML will be implemented on Databricks Cloud.  Media coverage here, here and here.

Geospatial

MemSQL announced geospatial capabilities for its distributed in-memory NewSQL database.

J. Andrew Rogers asks why geospatial databases are hard to build, then answers his own question.

RapidMiner

Butler Analytics publishes a favorable review of RapidMiner.

SAP

SAP released a new on-premises version of Lumira Edge for visualization, adding to the list of software that is not as good as Tableau.  SAP also released Predictive Analytics 2.0, a product that marries the toylike SAP Predictive Analytics with KXEN InfiniteInsight, a product acquired in 2013.  According to SAP, Predictive Analytics 2.0 is a “single, unified analytics product” with two work environments, which sounds like SAP has bundled two different code bases into a marketing bundle with a common datastore.  Going for a “three-fer”, SAP also adds Lumira Edge to the bundle.

SAS

American Banker reports that SAS has “launched” SAS Transaction Monitoring Optimization for AML scenario testing; in this case, “launch” means marketing collateral is available.  The product is said to run on top of SAS Visual Analytics, which itself runs on top of SAS LASR Server, SAS’ “other” distributed in-memory platform.

Big Analytics Roundup (March 2, 2015)

Here is a roundup of some recent Big Analytics news and analysis.

General

  • SiliconAngle covers the Big Data money trail.

Apache Spark

  • Curt Monash writes about Databricks and Spark on his DBMS2 blog.
  • On the Databricks blog, Dave Wang summarizes Spark highlights from Strata + Hadoop World.
  • In this post, Hammer Lab describes how to monitor Spark with Graphite and Grafana.
  • Cloudera announces Hive on Spark beta.
  • InfoWorld covers Spark’s planned support for R in Release 1.3.
  • Qubole announces Spark as a Service.

Dato/GraphLab

  • Dato announces new version of GraphLab Create.

H2O

  • From Strata + Hadoop World, Prithvi Prabhu talks about using H2O.
  • Also from Strata, here is Cliff Click’s presentation on H2O, Spark and Python.
  • On the H2O blog, Arno Candel publishes a performance tuning guide for H2O Deep Learning.

 

 

Smart Money: YTD Funding for Analytics Doubles vs. 2013

Funding for analytics startups continued at a torrid pace in the second quarter; 140 announced investments totaled $1.03 billion, compared to 137 investments totaling $657 million in the second quarter of 2013.  Year to date, investment in analytic ventures is up 97% versus the first two quarters of last year.  (All data sourced from Crunchbase.)


Eleven mezzanine rounds raised a total of $412m, 40% of the total amount invested in the sector.  There were two private equity investments:

  • Attensity raised $90m on the strength of its natural language processing, text mining and sentiment analysis solution.
  • Hadoop distributor MapR raised $80m from Google Capital, Qualcomm Ventures, Redpoint Ventures, New Enterprise Associates, Mayfield Fund and Lightspeed Venture Partners.  MapR also received a $30m loan from Silicon Valley Bank.

In other notable mezzanine rounds:

  • BPM solution vendor Tidemark raised $32m in an E round funded by Tenaya Capital, Redpoint Ventures, Andreessen Horowitz, Greylock Partners and Silicon Valley Bank.
  • Social media platform Sprinklr landed $40m in D round funding from Intel Capital, Battery Ventures and Iconiq Capital.
  • BI vendor SiSense raised $30m in a C round from Opus Capital, Genesis Partners, Battery Ventures and DFJ Growth.
  • HR solutions provider Visier raised $25.5m in a C round from Summit Partners, Foundation Capital and Adams Street Partners.
  • Predictive marketing platform provider AgilOne received $25m in C round funding from Sequoia Capital, Four Rivers Group, Next World Capital, Tenaya Capital and Mayfield Fund.

Nineteen “B” rounds raised $282m.  Top recipients include:

  • Credit scoring specialist Kreditech raised $40m from Point Nine Capital, Blumberg Capital and Varde Partners.
  • Data platform Krux landed $35m in a round led by SAP Capital.
  • Databricks capitalized on interest in Apache Spark by raising $33m from New Enterprise Associates and Andreessen Horowitz.
  • Social listening vendor Brandwatch raised $22m from Nauta Capital and Highland Capital Partners.
  • Context Relevant, a firm that offers diverse horizontal solutions with embedded analytics, landed $21m in funding from a group led by Formation 8.

Nineteen “A” rounds raised $126m.  Notable recipients include:

  • Evidence-based medicine provider Orange Health Solutions disclosed an equity investment of $22.5m from unknown investors.
  • 6Sense, developer of an eponymous sales and marketing intelligence platform, raised $12m from Battery Ventures, Venrock and Silicon Valley Bank.
  • Health analytics platform vendor Aver Informatics landed $8.5m from GE Ventures and Drive Capital.

Forty pre-venture rounds — Seed, Angel, Grants and Crowdsourcing — raised $30m.  The largest of these, at $2.65m, went to Farmeron, developer of a cloud-based analytics app for farm business management.
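The headline comparisons above reduce to simple arithmetic. The following is a minimal sketch that recomputes the second-quarter growth rate and each round type's share of the quarter, using only the dollar figures stated in the text (in millions). The function and variable names are illustrative, and the round types listed are not assumed to add up to the full total, since the text does not itemize every transaction.

```python
# Figures as stated in the text, in millions of US dollars.
Q2_2014_TOTAL = 1030.0
Q2_2013_TOTAL = 657.0

# Totals by round type, as stated in the text.
ROUNDS = {
    "mezzanine": 412.0,
    "B": 282.0,
    "A": 126.0,
    "pre-venture": 30.0,
}


def pct_change(new, old):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


def share_of_total(amount, total):
    """Amount expressed as a percentage of total."""
    return amount / total * 100.0


growth = pct_change(Q2_2014_TOTAL, Q2_2013_TOTAL)
shares = {
    name: share_of_total(amount, Q2_2014_TOTAL)
    for name, amount in ROUNDS.items()
}

print(f"Q2 year-over-year growth: {growth:.1f}%")
for name, share in shares.items():
    print(f"{name}: {share:.1f}% of Q2 total")
```

The mezzanine share works out to the 40% quoted above. The 97% year-to-date figure cannot be recomputed this way because the first-quarter totals are not given here.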

For this analysis, I classify firms as analytics providers if the Crunchbase company_category_list field contains the word “Analytics”.  This is a broad definition that includes:

  • Analytic data platforms (MapR)
  • Analytic tools providers (RapidMiner)
  • Analytic service providers (Palantir)
  • Business solutions with embedded analytics (AgilOne)

Of course, the quality of this classification depends on the quality of the data in Crunchbase, which is a work in progress.
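
The classification rule described above can be sketched in a few lines of Python. This is a rough illustration, not the actual analysis code: only the company_category_list field name comes from the text, while the sample rows, the pipe-delimited category format, the name column and the case-insensitive match are all assumptions.

```python
import csv
import io

# Hypothetical sample rows. Only the company_category_list column name
# comes from the text; the companies and the "|" delimiter are assumed.
SAMPLE_CSV = """name,company_category_list
ExampleCo,|Analytics|Software|
OtherCo,|E-Commerce|
ThirdCo,|Big Data Analytics|Health Care|
"""


def is_analytics_provider(row):
    """True if the category field contains the word 'Analytics'.

    The match is case-insensitive, which is an assumption of this sketch.
    """
    categories = row.get("company_category_list") or ""
    return "analytics" in categories.lower()


def analytics_firms(csv_text):
    """Return the names of all firms classified as analytics providers."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["name"] for row in reader if is_analytics_provider(row)]


print(analytics_firms(SAMPLE_CSV))
```

Because the test is a plain substring match, a firm tagged "Big Data Analytics" qualifies just as one tagged "Analytics" does, which is part of what makes the definition broad.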

Smart Money: More Funding for Analytics

Funding for analytic ventures remained robust in January, with 17 significant funding transactions and three acquisitions.   Key themes:

  • Outcomes-based medicine and health care
  • Vertical solutions for the energy industry
  • Solutions for risk management
  • Mobile analytics, including location-based targeting and app metrics
  • Social media sentiment analysis
  • Graph engines (and solutions based on graph engines)
  • In-memory SQL engines

All funding news via Crunchbase.

Funding

Health Catalyst led the way with $41 million in Series C funding.   Health Catalyst offers a solution stack consisting of a proprietary data warehouse optimized for electronic medical records, plus analytic applications designed to support outcomes-based health care.

Other transactions greater than $1 million include:

— MemSQL, provider of a high performance in-memory distributed database, raised $35 million in a Series B round.

— Still in stealth mode, marketing analytics provider OrigamiLogic closed on $15 million in Series B funding.

— Kreditech scored $15 million in debt financing.  Kreditech uses machine learning and Big Data to offer credit scoring for microlending.

— Radius closed on $13 million in Series B funding.  Radius supports B2B targeted marketing and lead generation for small businesses.

— Smart grid analytics provider AutoGrid landed $12.8 million in Series C funding.

— GNS Healthcare leverages Bayesian Networks and Monte Carlo Simulation to deliver solutions for outcomes-based medicine to hospitals, health insurance plans, pharmaceutical companies and other entities in the health care delivery chain.  GNS completed $10 million in Series B financing.

— Simple Energy raised $6 million in Series B funding.  Simple Energy offers utilities services to improve customer interactions through microtargeting and social gaming.

— Binary Fountain, provider of software integrating social sentiment analysis with BPM, raised $5.7 million.

— 4C Insights integrates social media sentiment analysis with public data to support media planning and targeting.   The firm raised $5 million in Series B funding.

— Kontagent secured $4.8 million in venture funding.  Kontagent offers mobile analytic solutions to mobile app developers and marketers.

— Offshore analytic services provider Axtria received $4.8 million in venture funding.

— Enigma Technologies raised $4.5 million in Series A funding.  Enigma provides a platform for the analysis of public data that includes a repository and directory to sources, plus tools for search, export and simple analytics.

— Lumiata raised $4 million in Series A funding.  Lumiata leverages graph engine technology to deliver evidence-based predictions to medical practitioners.

— BI vendor Chartio received $2.2 million in venture funding.

— Bottlenose, purveyor of dashboard and insight tools for social sentiment analysis, raised $1.1 million in debt financing.

— Geofeedia, a provider of open source location-based social media mining tools, received $1.25 million in Series A funding.

Acquisitions

There were three acquisitions of note; purchase prices were not disclosed.

— YP, the corporate successor to AT&T Interactive and AT&T Advertising Solutions, acquired Sense Networks on January 6.   Sense Networks uses predictive analytics to drive location-based behavioral targeting for mobile ad platforms.

— Pinterest acquired VisualGraph on January 6.  VisualGraph, a two-man operation, has developed a distributed in-memory visual search engine.

— Apigee, an API management company, acquired InsightsOne on January 8.   InsightsOne offers cloud-based infrastructure for predictive analytics based on Hadoop, plus an in-memory graph engine.

Smart Money: Venture Capital for Analytics 2013

Thanks to Crunchbase’s downloadable database, we can report that in 2013 investors poured more than $2 billion into Analytic startups, up 38% from 2012.  According to Crunchbase, 2013 funding for Analytics ventures was more than five times the 2009 level.

Source: Crunchbase

Palantir led the pack in new funding, going to the well twice, in October and December, to raise a total of $304m based on a valuation of $9b.  As a point of reference, at 4X revenue, industry leader SAS is worth about $12b.

Funding flowed to companies that build advanced analytics into focused vertical or horizontal solutions.  Examples include:

Investors paid special attention to vendors who specialize in social media analytic platforms:

Capital also flowed to companies offering general-purpose software, platforms and services for analytics, including:

Investors continue to fund startups offering easy-to-use interfaces for the business user, including:

Top investors in Analytics for 2013 include:

Clearly, investors are placing bets on a robust future for analytics.