Updated and bumped April 11, 2014. The emergence of Apache Spark is a key development for Big Analytics in 2014.   Spark, a top-level Apache project, is an open source distributed computing framework for advanced analytics in Hadoop.  Originally developed as a research project at UC Berkeley’s AMPLab, the project achieved incubator status in Apache in June 2013 and […]


0xdata (“Hexa-data”) is a small group of smart people from Stanford and Silicon Valley with VC backing and an open source software project for advanced analytics (H2O).  Founded in 2011, 0xdata first appeared on analyst dashboards in 2012 and has steadily built a presence in the data science community since then. 0xdata operates on a […]


A colleague asks: can we automate predictive modeling? How we answer the question depends on the context. Consider the two variations on the question below, with more precise wording: (1) Can we completely eliminate the need for expertise in predictive modeling, so that an “ordinary business user” can do it? (2) Can we make expert […]


Dell announced this morning that it has acquired StatSoft, a privately held company that distributes Statistica, a suite of software for statistics and data mining. Terms of the sale were not announced. Founded by academics in 1984, StatSoft has developed a loyal following at the low end of the analytics market, where it offers a […]


This is the second of a three-part series on the current state of play for machine learning in Hadoop.  Part One is here.  In this post, we cover open source options. As we noted in Part One, machine learning is one of several technologies for analytics; the broader category also includes fast queries, streaming analytics […]


Much has changed since I last blogged on this subject a year ago (here and here).  This is the first of a three-part blog covering the current state of play for machine learning in Hadoop.  I use the term “machine learning” deliberately, to refer to tools that can learn from data in an automated or […]


Funding for analytic ventures remained robust in January, with 17 significant funding transactions and three acquisitions. Key themes: outcomes-based medicine and health care; vertical solutions for the energy industry; solutions for risk management; mobile analytics, including location-based targeting and app metrics; social media sentiment analysis; graph engines (and solutions based on graph engines); in-memory […]


In every enterprise that uses analytics, there are a few power users who need the most advanced tools all of the time, and an army of casual users who need to do simple analysis now and then.  For the latter group, cloud-based analytics make perfect sense; users get the tools they need when they need […]


Today, SAS announced 2013 revenue of $3.02 billion, up 5.2% from 2012.  Reported revenue from “cloud-based” solutions grew by 20%; most of this revenue comes from SAS Solutions On Demand, a private hosting service. SAS claims more than 1,400 “sites” for SAS Visual Analytics, an impressive figure but well short of SAS’ goal of 2,000 […]