Forrester’s 2018 PAML “Waves”

Forrester just published two “Wave” reports for predictive analytics and machine learning. The first, covering “multi-modal” solutions, is available here for free. A second report, covering notebook-based solutions, is available here (registration required).

Forrester plans to publish a third report, covering automated machine learning vendors, in 2019.

Kudos to Forrester for understanding the diversity of the data science tools market. Software with a visual interface does not compete with code-centric software — it appeals to a different class of users. Instead of trashing code-based tools as “too hard to use,” Forrester recognizes that they belong in a separate category.

Let’s take a quick look at how vendors fared in each report.

Multimodal Predictive Analytics and Machine Learning Platforms

Here’s the “Wave”:

My comments:

— SAS did well. Forrester gives a glowing review to SAS Visual Data Mining and Machine Learning. That squares with what I hear from the few customers willing to pay for it. “Use a wizard to automatically train a model” is a bit of a stretch, though. VDM/ML supports automated parameter tuning, but data engineering, feature engineering, experiment management, model evaluation, and model selection are all manual tasks. Oh, and for model management, you need to license another SAS product.

— Forrester’s assessment of IBM makes less sense to me. Watson Studio is a quodlibet of previously available services, cobbled together and pushed out the door just in time for analyst review season. Those “SPSS-inspired” workflows look an awful lot like — wait for it — SPSS, which IBM did not submit for review because it’s so done. IBM Watson Studio is only available on IBM Cloud, everyone’s fifth choice in cloud platforms, which makes it seem more like a niche product. Does anyone actually pay for Watson Studio? I’ve only run into it when some Blue customer gets free credits with an IBM enterprise agreement.

— Forrester notes that RapidMiner helps 380,000 users. If only more of them paid for the privilege.

— Angoss (Datawatch), FICO, KNIME, and SAP all fell out of the “Leaders” category, which was getting pretty crowded in last year’s report. All fell victim to Forrester’s changing metrics.

— TIBCO remains in the “Strong Performers” category, but Forrester rates its current offering much lower than it rated Alpine and Statistica, which TIBCO acquired last year. This demonstrates the maxim that in business, one plus two doesn’t always add up to three.

Dataiku scored about the same this year as last.

— Microsoft took a big hit, falling from “Strong Performer” to “Contender,” with markedly lower ratings on both dimensions. Bit of a puzzler, IMHO; the MSFT offering seems better than that.

MathWorks joins the Wave this year and lands about where you would expect.

World Programming and Salford Systems trail the pack. SAS has not yet litigated the former out of business. Minitab acquired Salford last year. I can remember using Minitab back in the 1970s. Yeah, I’m that old.

Forrester did not rate Alteryx.

Notebook-Based Predictive Analytics and Machine Learning Solutions

Here’s the “Wave”:

Note: this is the updated “Wave” published by Forrester on September 7.

Most of the vendors in this wave are new to Forrester. My comments:

Domino Data Lab leads the pack, and rightly so. Domino invented this category and leads in every respect.

— Forrester’s assessment of Oracle as a leader seems, well, aspirational. Customer adoption, per the detailed tables, is zero. Insiders from DataScience.com, which Oracle acquired recently, throw shade at the product’s stability and maturity. Presumably, Oracle has the deep pockets to fix the product and make it work. Even so, it’s not nearly as good as Domino; I’d share a detailed feature/function analysis, but it would take more than a paragraph. Oracle lacks Domino’s street cred with the data science community, and the folks in Oracle Cloud who drove the acquisition don’t talk to the folks in Oracle Data Mining, who have actual customers and experience in the field.

— For this wave, Forrester did not evaluate H2O.ai’s Driverless AI. Forrester wasn’t impressed with Sparkling Water and Flow UI: “Enterprises looking for a notebook-based PaML solution will find better solutions from the other vendors in this evaluation.” Ooh, burn.

— My former colleagues at Cloudera should be pleased with their positioning in the middle of the pack. Databricks did well, too. Forrester dings Cloudera and Databricks for using proprietary IDEs instead of Jupyter. I’m sorry, but the folks at Cloudera and Databricks aren’t stupid — they understand that Jupyter isn’t suitable for production software development. Don’t @ me.

Civis Analytics’ main asset is its founders’ political connections. Forrester rightly notes that Civis is not yet the platform for everyone, though, as it is currently cloud only, it doesn’t support many machine learning frameworks out of the box, and Spark is still in the pipeline. That’s like saying dinner is ready, but we have no rolls or salad and the roast is still in the oven.

— It must sting Anaconda to score below OpenText, but there it is.

Google brings up the rear with Cloud DataLab which, according to Forrester, does little to improve data scientist productivity, such as through project capabilities, team collaboration features, and other modeling tools that are important criteria in this evaluation. Yeah, that’s about right.

Surprisingly, Forrester did not evaluate Amazon Machine Learning.


8 comments

  • Thomas, I always appreciate your perspective. Thank you for your leadership.

  • Hi Thomas,

    Your assessment of IBM Watson Studio is incorrect: it’s not based on any services available beforehand and it’s certainly not cobbled together. Development started at square one, i.e., no code refactoring; it was coded directly to run cloud-native (Kubernetes).

    And indeed IBM Cloud may not be everyone’s favourite, but shouldn’t a data scientist or company choose the best product for their needs, independent of where it runs? These days people buy services, not products. I also haven’t a clue how my 4G works and where it runs. I just happen to have chosen that provider because it fits my needs as a consumer.

    Note: I work for IBM and I know you have a history with the company. I love your directness, but please also be an independent consultant when it concerns ‘big blue’.

    • SV,

      Thanks for reading!

      Breaking down IBM Watson Studio:
      — IBM previously sold the data prep module as Data Refinery.
      — IBM previously sold the data science module as IBM Data Science Experience. (It still markets this module, with some differences, for on-premises implementation.)
      — IBM previously sold the Spark and SPSS flows branded as IBM Watson Machine Learning.
      — The Deep Learning capabilities appear to be a new front end for existing APIs. (There’s nothing wrong with that, of course. UIs add value.)

      So, did IBM design the product from scratch and bring modules to market separately? I would believe that argument if there were a coherent UI across the modules and no overlapping functionality. As it is, the impression the product leaves with me is exactly as I described: it’s a collection of different services bundled together and rebranded as “Watson.”

      Data scientists don’t get to choose a cloud platform in a vacuum. It’s increasingly rare for a data science team to choose a cloud platform that differs from the organization standard. Most data scientists will simply rule out IBM Watson Studio when they learn it’s only available in IBM Cloud.

      FWIW, yes, I worked for IBM in 2011 and 2012, when IBM acquired Netezza. My experience with IBM was entirely positive and I left on my own. If IBM delivered an attractive product, I would say so.

      — Thomas

      • Thomas,

        Ah, Netezza, those were the days. One of the most beautiful pieces of technology ever made!

        But anyway,
        We live in the age of micro-services. Just because they ARE separate services and CAN be sold separately (which IBM indeed did) doesn’t mean there is no technical master plan behind them.

        The 3 main ML/AI services today that DO share a unified interface are Watson Studio (design tools), Watson Machine Learning (deployment) and Watson Knowledge Catalog (asset management). The latter service was also put in the top right corner of The Forrester Wave: Machine Learning Data Catalogs, Q2 2018. Those 3, together with the GUIs on top of the Watson APIs, all join hands in the Watson Studio UI.

        That companies discard the IBM cloud, even if a better service is provided over there, because, according to IT, it’s not their standard, says more about the company than about IBM. Not allowing the business side to freely choose its tools means they seem to forget who is making the money.

      • Yes, Netezza was cool in 2010.

        So, you concede that IBM designed and marketed the components of IBM Watson Studio separately? Good, we have common ground there.

        IBM Watson Studio has four main modules: data prep, data science, machine learning, and data catalog. The data prep module (previously branded as Data Refinery) has a consistent UI for all of its functions. So does the Data Catalog.

        The Data Science module is a mosh of DSX and SPSS.

        The ML+AI module is a mosh of all sorts of things. Some of these have code-driven APIs and others use the workflows ported from SPSS. The UI is neither internally consistent nor consistent with the Data Refinery and Data Catalog UIs.

        Arguing that 3 out of some 20 submodules share the same UI is missing the point. In a well-designed product, all of the modules share a common UI.

        It’s extremely difficult for IBM to argue that IBM Watson Studio offers benefits to the data scientist that warrant deviating from a company standard. (Which is why IBM resorts to giving it away.) Most of the data science platforms on the market today are available on all three of the leading cloud platforms. Moreover, the native tools available from AWS, Azure, and Google are increasingly competitive. In certain areas, such as deep learning and AI, most data scientists prefer to work with AWS or Google, so there’s no real reason to consider the IBM offering.

  • Neither AWS nor Google plays in the PAML field, and IBM is not part of the Notebook wave; it seems Forrester does not see them as competitors. Azure is way down in the PAML wave. None of AWS, Google, or Azure is present on the Forrester Wave for machine learning data catalogs.

    I’m not working for Forrester, but I’m pretty sure they have their reasons for this.

  • Let’s distinguish between the world as Forrester sees it and the world as it is.

    — AWS introduced SageMaker in 2017. According to Forrester, they introduced it too late for inclusion in the Wave.
    — Azure Machine Learning Studio didn’t fare too well in Forrester’s analysis — unfairly so, IMHO.
    — Forrester only evaluated Google Cloud DataLab, not the rest of Google’s offerings.
    — Forrester did not evaluate IBM Data Science Experience, which is clearly a notebook-based offering.

    The native cloud products are competitive but not necessarily best-in-class.

    — SageMaker and the Google APIs are a good choice for the AI developer.
    — AMLS is a good choice for business users.

    But data science teams aren’t limited to cloud-native tools. All of the leading data science and machine learning tools run on AWS and Azure, and most run on GCP.
