Gartner’s 2016 MQ for Advanced Analytics Platforms

This is a revised and expanded version of a story that first appeared in the weekly roundup for February 15.

Gartner publishes its 2016 Magic Quadrant for Advanced Analytics Platforms.   You can get a free copy here from RapidMiner (registration required.)  The report is a muddle that mixes up products in different categories that don’t compete with one another, includes marginal players, excludes important startups and ignores open source analytics.

Other than that, it’s a fine report.

The advanced analytics category is much more complex than it used to be.  In the contemporary marketplace, there are at least six different categories of software for advanced analytics that are widely used in enterprises:

  • Analytic Programming Languages (e.g. R, SAS Programming Language)
  • Analytic Productivity Tools (e.g. RStudio, SAS Enterprise Guide)
  • Analytic Workbenches (e.g. Alteryx, IBM Watson Analytics, SAS JMP)
  • Expert Workbenches (e.g. IBM SPSS Modeler, SAS Enterprise Miner)
  • In-Database Machine Learning Engines (e.g. DBLytix, Oracle Data Mining)
  • Distributed Machine Learning Engines (e.g. Apache Spark MLlib, H2O)

Gartner appears to have a narrow notion of what an advanced analytics platform should be, and it ignores widely used software that does not fit that mold.  Among those evaluated by Gartner but excluded from the analysis: BigML, Business-Insight, Dataiku, Dato, H2O.ai, MathWorks, Oracle, Rapid Insight, Salford Systems, Skytree and TIBCO.

Gartner also ignores open source analytics, including only those vendors with at least $4 million in annual software license revenue.  That criterion excludes vendors with a commercial open source business model, like H2O.ai.  Gartner uses a similar criterion to exclude Hortonworks from its MQ for data warehousing, while including Cloudera and MapR.

Changes from last year’s report are relatively small.  Some detailed comments:

— Accenture makes the analysis this year, according to Gartner, because it acquired Milan-based i4C Analytics, a tiny little privately held company based in Milan, Italy.  Accenture rebranded the software assets as the Accenture Analytics Applications Platform, which Accenture positions as a platform for custom solutions.  This is not at all surprising, since Accenture is a consulting firm and not a software vendor, but it’s interesting to note that Accenture reports no revenue at all from software licensing;  hence, it can’t possibly satisfy Gartner’s inclusion criteria for the MQ.  The distinction between software and services is increasingly muddy, but if Gartner includes one services provider on the analytics MQ it should include them all.

Alpine Data Labs declines a lot in “Ability to Deliver,” which makes sense since they appear to be running out of money (*).  Gartner characterizes Alpine as “running analytic workflows natively within Hadoop”, which is only partly true.  Alpine was originally developed to run on MPP databases with table functions (such as Greenplum and Netezza), and has ported some of its functions to Hadoop.  The company has a history with Greenplum Pivotal and EMC Dell, and most existing customers use the product with Greenplum Database, Pivotal Hadoop, Hawq and MADlib, which is great if you use all of those but otherwise not.  Gartner rightly notes that “the depth of choice of algorithms may be limited for some users,” which is spot on — anyone not using Alpine with Hawq and MADlib.

(*) Of course, things aren’t always what they appear to be.  Joe Otto, Alpine CEO, contacted me to say that Alpine has a year’s worth of expenses in the bank, and hasn’t done any new venture rounds since 2013 “because they haven’t needed to do so.”  Joe had no explanation for Alpine’s significantly lower rating on both dimensions in Gartner’s MQ, attributing the change to “bias”.  He’s right in pointing out that Gartner’s analysis defies logic.

Alteryx declines a little, which is surprising since its new release is strong and the company just scored a pile of venture cash.  Gartner notes that Alteryx’ scores are up for customer satisfaction and delivering business value, which suggests that whoever it is at Gartner that decides where to position the dots on the MQ does not read the survey results.  Gartner dings Alteryx for not having native visualization capabilities like Tableau, Qlik or PowerBI, a ridiculous observation when you consider that not one of the other vendors covered in this report offers visualization capabilities like Tableau, Qlik or PowerBI.

Angoss improves a lot, moving from Niche to Challenger, largely on the basis of its WPL-based SAS integration and better customer satisfaction.  Data prep was a gap for Angoss, so the WPL partnership is a positive move.

— Dell: Arguing that Dell has “executed on an ambitious roadmap during the past year”, Gartner moves Dell into the Leaders quadrant.   That “execution” is largely invisible to everyone else, as the product seems to have changed little since Dell acquired Statistica, and I don’t think too many people are excited that the product interfaces with Boomi.  Customer satisfaction has declined and pricing is a mess, but Gartner is all giggly about Boomi, Kitenga and Toad.  Gartner rightly cautions that software isn’t one of Dell’s core strengths, and the recent EMC acquisition “raises questions” about the future of software at Dell.  Which raises questions about why Gartner thinks Dell qualifies as a Leader in the category.

FICO fades for no apparent reason.  I’m guessing they didn’t renew their subscription.

IBM stays at about the same position in the MQ.  Gartner rightly notes the “market confusion” about IBM’s analytics products, and dismisses yikyak about cognitive computing.  Recently, I spent 30 minutes with one of the 443 IBM vice presidents responsible for analytics — supposedly, he’s in charge of “all analytics” at IBM — and I’m still as confused as Gartner, and the market.

— KNIME was a Leader last year and remains a Leader, moving up a little.  Gartner notes that many customers choose KNIME for its cost-benefit ratio, which is unsurprising since the software is free.  Once again, Gartner complains that KNIME isn’t as good as Tableau and Qlik for visualization.

Lavastorm makes it to the MQ this year, for some reason.  Lavastorm is an ETL and data blending tool that does not claim to offer the native predictive analytics that Gartner says are necessary for inclusion in the MQ.

Megaputer, a text mining vendor, makes it to the MQ for the second year running despite being so marginal that they lack a record in Crunchbase.  Gartner notes that “Megaputer scores low on viability and visibility and there is a lack of awareness of the company outside of text analytics in the advanced analytics market.”  Just going out on a limb, here, Mr. Gartner, but maybe that’s your cue to drop them from the MQ, or cover them under text mining.

Microsoft gets Gartner’s highest scores on Completeness of Vision on the strength of Azure Machine Learning (AML) and Cortana Analytics Suite.  Some customers aren’t thrilled that AML is only available in the cloud, presumably because they want hackers to steal their data from an on-premises system, where most data breaches happen.  Microsoft’s hybrid on-premises cloud should render those arguments moot.  Existing customers who use SQL Server Analytic Services are less than thrilled with that product.

Predixion Software improves on “Completeness of Vision” because it can “deploy anywhere” according to Gartner.  Wut?  Anywhere you can run Windows.

Prognoz returns to the MQ for another year and, like Megaputer, continues to inspire WTF? reactions from folks familiar with this category.  Primarily a BI tool with some time-series and analytics functionality included, Prognoz appears to lack the native predictive analytics capabilities that Gartner says are minimally required. 

RapidMiner moves up on both dimensions.  Gartner recognizes the company’s “Wisdom of Crowds” feature and the recent Series C funding, but neglects to note RapidMiner’s excellent Hadoop and Spark integration.

SAP stays at pretty much the same place in the MQ.  Gartner notes that SAP has the lowest scores in customer satisfaction, analytic support and sales relationship, which is about what you would expect when an ankle-biter like KXEN gets swallowed by a behemoth like SAP, where analytics go to die.

SAS declines slightly in Ability to Deliver.  Gartner notes that SAS’ licensing model, high costs and lack of transparency are a concern.  Gartner also notes that while SAS has a loyal customer base whose members refer to it as the “gold standard” in advanced analytics, SAS also has the highest percentage of customers who have experienced challenges or issues with the software.

Forrester “Wave” for Predictive Analytics

Last week, Forrester published its 2015 “Wave” report for Big Data Predictive Analytics Solutions.  You can pay $2,495 and buy it directly from Forrester (here), or you can get the same report for free from SAS (here).

The report is inaptly named, as it commingles software that scales to Big Data (such as Alpine Chorus) with software that does not scale (such as Dell Statistica.)  Nor does Big Data capability appear to impact the ratings; otherwise Alpine and Oracle would have scored higher than they did, and SAP would have scored lower.  IBM SPSS alone does not scale without Netezza or BigInsights; SAS only scales if you add one of its distributed in-memory back ends.  These products aren’t listed among the evaluated software components.

Also, Forrester seriously needs to hire an editor.  Alteryx does not currently offer software branded as “Alteryx Analytics”, nor does SAS currently offer a bundle called the “SAS Analytics Suite.”

Forrester previously published this wave in 2013; key changes since then:

  • Among the Leaders, IBM edged past SAS for the top rating.
  • SAP’s rating did not change but its brand presence improved considerably, which demonstrates the uselessness of brand presence as a measure of value.
  • Oracle showed up at the beauty show this time, and improved its position slightly.
  • Statistica’s rating did not change, but its brand presence improved due to the acquisition by Dell.  (See SAP, above).  Shockingly, the addition of “Toad Data Point” to the Dell/Statistica solution did not move the needle.
  • Angoss improved its ratings and brand strength slightly.
  • TIBCO and Salford switched their analyst relations budgets from Forrester to Gartner and are gone from this report.
  • KXEN and Revolution Analytics are also gone due to acquisitions.  Interestingly, the addition of KXEN to SAP had no impact on SAP’s ratings, thus demonstrating that two plus zero is still two.
  • RapidMiner, Alteryx, FICO, Alpine, KNIME and Predixion are all new to the report.

Gartner issued its “Magic Quadrant” back in February; the comparisons are interesting:

  • KNIME is a “leader” in Gartner’s view, while Forrester considers the product to be decidedly mediocre.  Seems to me that Forrester has it about right.
  • Oracle did not participate in the Gartner MQ.
  • RapidMiner, a “leader” in the Gartner MQ, scores very well on Forrester’s “Current Offering” axis, but less well on “Strategy.”   This strikes me as a good way for Forrester to sell strategy consulting.
  • Microsoft and Alpine landed in Gartner’s Visionary quadrant but scored relatively low in Forrester’s assessment.  Both vendors have appealing strategies, and need to roll up their sleeves to deliver.
  • Predixion trails the pack in both reports.  Reminds me of high school gym class.

Forrester’s methodology places more weight on the currently available software, while Gartner places more emphasis on the vendor’s “vision.”  Vision is certainly important to consider when selecting a software vendor, but leadership tends to be self-sustaining; today’s category leaders are likely to be tomorrow’s category leaders, except when markets are disrupted — in which case analysts are rarely able to pick winners.

More Comments on Microsoft + Revolution Analytics

My inbox continues to fill with Google Alerts about Microsoft’s announced purchase of Revolution Analytics — too numerous to link.

Most of these stories simply repackage the Microsoft announcement.

Clint Boulton of the WSJ’s CIO Journal writes one of the best analyses:

Microsoft is betting on the timeliness of its acquisition as more businesses adopt analytics. Revolution’s software helps companies use R, an open source programming language that more than two million programmers use daily to build predictive models. R is popular among university computer science students, many of whom continue to use it in their careers as data scientists.

Data scientists who extract data from of a data warehouse or Hadoop processing system, use R to slice and dice it for insights, and visualize the results. But businesses analyzing financial, social media and other data often need to scale the analytics across clusters of computers.

Several analysts pass along the factoid that two million people use R.   The truth is that nobody has any idea how many people use R; we don’t even know how many have downloaded the software.  The New York Times pointed out the difficulty in its piece five years ago:

While it is difficult to calculate exactly how many people use R, those most familiar with the software estimate that close to 250,000 people work with it regularly.

It’s possible that R has gained 1,750,000 users in the interceding five years.  It’s also possible that R has gained 10,000,000 users.  “Those most familiar with the software” are simply guessing.

While most analysts are neutral to positive on Microsoft’s move, Mr. Dan Woods takes a contrary view.  In an article published in Forbes and cross-posted on multiple platforms, Mr. Woods argues that Microsoft was wrong to buy Revolution Analytics, and instead should buy Tibco.   (That is the implication of his argument that Microsoft should “emulate” Tibco, since the only way to “emulate” Tibco is to own the clump of software Tibco packages up as TERR.)

Mr. Woods is a “content specialist”, as freelance writers call themselves today, and his expertise in analytics is exemplified by his most recent book, Wikis for Dummies, published in 2007.  One suspects that the private equity firm that acquired Tibco in September is peddling the pieces, and has engaged “content specialists” to bang the drum.

Mr. Woods gets two things right.  It’s true that R is a mess, and it is also true that the GPL license makes R difficult to commercialize.  R’s messiness is a byproduct of crowdsourced development; it is a feature to its devotees and a bug for everyone else.  (For those who simply cannot tolerate R’s messiness there is a simple solution: use Python.)  Under the GPL license, any enhancements become part of the free distribution, so if you distribute a product built with R you must share the source code of your product as well.

At the crux of his argument, though, Mr. Woods gets it wrong:

Revolution Analytics has made a business, like many open source-based companies, of supporting Open Source R.

This is factually incorrect.  Revolution only recently started to offer a consulting service for open source R users; for most of its history, its business was built around Revolution R Enterprise, a commercially supported enhanced R distribution.  This is not a trivial distinction.  Cloudera Hadoop, for example, is based on Apache Hadoop, but it is not the same thing; while many enterprises use commercially supported Hadoop distributions (from vendors like Cloudera, Hortonworks or MapR), hardly anyone uses open source Apache Hadoop in production.

The same is true for R; while many enterprises have an issue using open source R, they are willing to deploy commercially supported R distributions (such as Oracle R or Revolution R).  This is the business Microsoft enters by acquiring Revolution Analytics.

Regarding Mr. Woods’ point about the need to rebuild R from the ground up, that is neither possible nor necessary.  The GPL license prevents anyone from “rebuilding” R as a commercial venture; if anyone “rebuilds” the language it will be the open source development team itself.

In any case, one need not “make R scale” — one need only provide an R API to other platforms (such as Apache Spark or dbLytix) that can scale, so that R users can interface with them.   This is the approach taken by Revolution Analytics’ ScaleR software, which is actually written in C, but includes an interface from the R programming language.  By building this component into Azure, Microsoft can offer those who use R locally a scaleable back end.

Update: Mr. Woods doubles down here.