2018 in AI/ML

Well, 2018 is dead and gone. Time to take a look back at the year in AI/ML.

A reminder that I work for DataRobot. This is my personal blog. Opinions are mine.

On the Move

It’s hard to believe that Amazon Web Services introduced Amazon SageMaker just a year ago, but here we are. AWS moved aggressively to enhance the service with new native and partner capabilities. The service still targets the experienced AWS developer, but AWS can move upmarket if it chooses. Joyent CTO Bryan Cantrill tweets:

Am waiting for the year that reInvent goes full Red Wedding, locking the doors and announcing that every attendee’s product or service is now a forthcoming AWS offering. Or maybe that was this year?

Reinforcement learning is among the new bits in SageMaker. If you want to push your toddler into an AI career, AWS announced DeepRacer, an autonomous model race car. Just in time for Christmas.

AWS also announced Amazon Forecast at re:Invent. Amazon Forecast is a managed service for time series analysis. As a global retailer, Amazon has a huge forecasting problem and skills to match. Marketing materials stress AWS’ deep experience in retail forecasting, a credential that would appeal to retailers if retailers wanted to do business with AWS. They don’t, however, so AWS might want to shut up about its retailing chops.

Dataiku did not do well in Gartner’s 2018 MQ, dropping like a stone to the bottom of the Ability to Execute axis. Not quite the bottom: Dataiku did better than Teradata. “We beat Teradata” gives cold comfort when you note that Teradata deprecated Aster and exited the category.

To its credit, Dataiku responded to the issues that Gartner surfaced, adding a deployment API and containerized engines. Customer ratings in Gartner PeerInsights look strong, which bodes well for Dataiku’s position in the 2019 MQ.

Update: Dataiku announces a $101 million Series C round of venture capital. Iconiq Capital leads the round, with Alven Capital, Battery Ventures, Dawn Capital and FirstMark Capital also participating. Alven, Battery, and FirstMark all participated in Dataiku’s B round in 2017.

Databricks wants to be more than “the people who invented Apache Spark.” That’s a self-limiting value proposition: to the extent that Spark matures and stabilizes, customers are less likely to need Matei Zaharia to tune their RDDs.

In its first few years, Databricks was little more than Spark on AWS, and its main emphasis was on data engineering. Starting last year, the company pivoted towards the AI and machine learning market. This meant supporting a wider range of Java, Python, R, and Scala packages as well as Spark ML and Spark packages. Databricks also broadened its support for deep learning frameworks, including TensorFlow, MXNet, Keras, PyTorch, Caffe and Microsoft Cognitive Toolkit.

The pivot pays dividends. Earlier this year, Databricks scored a “Visionary” rating in Gartner’s 2018 MQ, thanks to its innovation and scalability. In June, Databricks introduced MLflow for tool integration, experiment tracking, reproducibility, and model deployment. That’s a positive step towards an integrated data science and machine learning platform.

DataRobot (my employer) had a good year. The company released time series forecasting, model management, and model monitoring, among other things. In October, DataRobot closed a $100 million Series D round led by Sapphire Ventures and Meritech Capital Partners. Privately held startups don’t release financials, so the only way to tell which startups are going somewhere is to follow the money. Venture capitalists don’t back losers.

H2O.ai opened the champagne when Gartner released its 2018 MQ. Getting named to the Leader quadrant is a big deal, and it takes a lot of work to get there. Driverless AI is now available in all three cloud marketplaces. You can also install it on-premises on an NVIDIA GPU-accelerated box or an IBM “Minsky” server.

H2O.ai shipped 32 versions of Driverless AI this year, a fact that can be interpreted in more than one way. Agile sprints sound cool at hacker meetups, but enterprise customers don’t like to upgrade software every eleven days. Major product enhancements include GLM, time series, an alpha release of TensorFlow support, and a text analytics capability that depends on TensorFlow, so I guess that’s in alpha too.

H2O World attracted “record audiences” when it played New York and London. It was the first time the show played those venues, so an audience greater than zero breaks the record.

Muddling Through

You have to give Alteryx credit: they know how to bang the cash register and collect money. Alteryx sellers delivered 50% revenue growth and a thousand new logos each quarter this year. That’s a great track record.

However, the average starting sale is small, and account expansion is minuscule. Sooner or later you run out of new logos to land.

In AI/ML, Alteryx did little with the Yhat assets it acquired last year. This is not too surprising. When your flagship product runs only on Windows and you acquire software that runs only on Linux, it can take time to consummate the marriage.

Speaking of marriage, it’s an open secret that Alteryx plans to partner with H2O.ai. Alteryx CEO Dean Stoecker will keynote H2O World SFO in February. We’ll see how that works out.

Google just muddling through? WTF? Google deprecated its Prediction API earlier this year. The AutoML product line will support vision, speech, and translation if it ever gets out of beta. And Cloud Datalab isn’t winning any prizes. So right now Google doesn’t have a horse in the predictive analytics race.

Nobody cares about TPUs unless you can do something with them.

IBM shoved IBM Watson Studio out the door in time to make the key analyst reports. Forrester awarded IBM a top rating in the revamped “Wave” for MultiModal Predictive Analytics, which is nice. Gartner’s MQ isn’t available until February, so we’ll just have to wait and see what Gartner thinks. Gartner gives more weight to customer feedback, which has been a dumpster fire for IBM the last few MQs. It’s the main reason Gartner kicked IBM out of the “Leaders” quadrant earlier this year. Oh, the shame of it.

In my view, IBM Watson Studio exemplifies what IBM does best: taking old things and calling them new things. The service seems like a quodlibet of existing services, with an additional splash of a new SPSS that looks strikingly like the old SPSS. It’s available in IBM Cloud, everyone’s last choice in cloud platforms.

Incidentally, IBM peddles the line that they are the biggest contributor to Spark’s machine learning library. They might want to reconsider using that factoid. In the past two years, I can’t think of a single new feature added to Spark ML. And data scientists surveyed by KDNuggets in 2018 were less likely to use Spark than those polled in 2017.

Mathworks seems to be doing a better job at analyst relations. I don’t have much to say about Mathworks. Like SAS, it’s the object of desire for a cult of users who will give it up when you wrench it from their cold dead hands.

Microsoft shuffled the executive chairs in AI/ML leadership. This could mean something or it could mean nothing. Microsoft made a splash a couple of years ago when it introduced Azure Machine Learning Studio and acquired Revolution Analytics. Since then, all seems quiet in Redmond. The company has dozens of products and services for machine learning and AI, all of which seem to be managed in silos. There’s no coherent product strategy that bridges cloud and on-premises computing, which is surprising given the market strength of Azure.

RapidMiner launched new enhancements for data prep, feature engineering, and automated model training. Even so, and despite endorsements from Forrester and Gartner, the company hasn’t landed new venture capital in more than two years. It’s unclear why this is so. Some sources attribute it to the history of litigation with TIBCO. Others think that RapidMiner’s adoring users of its free software aren’t willing to pay for what they use. You know, the classic “students and hoboes” problem that makes it hard to monetize open source software.

Perhaps TIBCO will buy RapidMiner. Speaking of which, TIBCO seems to be doing better at analyst relations. Forrester awarded them an above-average rating in strategy, FWIW.

SAS published a lot of blog posts this year. The company has a blog farm that spews content like Mount Vesuvius in full eruption. Product-wise, however, all seems quiet in Cary. The June Viya release was underwhelming.

SAS revenue in 2017 was $3.24 billion. If it exceeds $3.40 billion in 2018, I’ll buy Dave Mac lunch at Yum Yum Sushi Thai on Harrison Oaks.

Not Really Muddling

Ayasdi killed off its Federal business and lost its CEO. Headcount declined from 130 to 109 over the year. Ayasdi’s last funding round was three and a half years ago. As a rule, startups that can’t or won’t raise fresh funding aren’t thriving.

Cloudera announced plans to offer Cloudera Data Science Workbench in the cloud. I doubt that AWS is shaking in its boots about that. CDSW’s main advantage is its integration with Cloudera. Put the same product in the cloud and you have a notebook-based platform for Python and R that doesn’t support Jupyter. I could make a joke about that, but out of respect for my former colleagues at Cloudera, I won’t.

Oracle acquired DataScience.com in June, and on the strength of that secured a “Leader” rating in the Forrester Wave for notebook-based data science platforms. Then everyone died or something because Oracle Data Science Cloud won’t be available until June 2019.

Notice how most of the vendors in the Forrester “Wave” have little circles, but Oracle has just a little dot?

Now flip to the table on page 8 of the report. See that 0.00 under Customer adoption for Oracle? It means that nobody uses the Oracle product.

Oracle Data Science Cloud. It’s the leading product that nobody uses.

Regular readers of this blog may recall that we celebrate the passing of AI/ML vendors with bye-ku. Here’s one for DataScience.com.

A bright future for

Data Science dot com? No…

No, no, no, no, no.

Gartner rated KXEN a Challenger the year before SAP bought the company in 2013. Since then, SAP just keeps sinking deeper into Niche Vendor territory. (The niche, it seems, is “SAP bigots.”) According to Gartner:

SAP has one of the lowest overall customer satisfaction scores in this Magic Quadrant. Its reference customers indicated that their overall experience with SAP was poor, and that the ability of its products to meet their needs was low. SAP continues to struggle to gain mind share for PA across its traditional customer base. SAP is one of the most infrequently considered vendors, relative to other vendors in the Magic Quadrant, by those choosing a data science and machine-learning platform.

In other words, existing customers think SAP sucks, and everyone else stays well clear.

The product itself seems little changed since 2013 or, for that matter, since first launched by KXEN in 1999. Meanwhile, SAP plays this fun game called “Where’s Leonardo?” where they tease everyone with all the wonderful capabilities that will be built into SAP Leonardo while hiding the actual product. Quoting Gartner again:

SAP Leonardo Machine Learning and other components of the SAP Leonardo ecosystem did not contribute to SAP’s Ability to Execute position in this Magic Quadrant.

Translation: we can’t evaluate slideware.

Teradata deprecated Aster, finally, and no longer has a product in the data science and machine learning category. Curiously, at Teradata events, top management still talks about machine learning. Here’s a bye-ku for Teradata’s machine learning business:

Teradata says

It has machine learning but

Sadly it does not.

Goodbye, Teradata. You have enough work to do getting those database customers to stick.


  • I wonder if SAP’s purchase of Qualtrics will face the same future as KXEN. In the year before their purchase, Qualtrics more than doubled their price to universities, and it was already much higher than their competition. SAP may have been impressed with rapid but unsustainable revenue growth. The U of TN system is in the process of switching to QuestionPro, which offered similar features for about 1/8th of the price. Other universities have been discussing what to switch to on the Educause CIO board. Here’s a summary of our review of web survey products: http://r4stats.com/articles/software-reviews/survey-tools/.

  • Bob — thanks for commenting. Interesting question. It’s easier to integrate surveys into business processes — most people understand the concept and those tools are all easy to use.

    Don’t forget Google Forms, which is quite limited, but good enough for many internal surveys, and free.

  • “In October, DataRobot closed a $100 million Series D round led by Sapphire Ventures and Meritech Capital Partners. Privately held startups don’t release financials, so the only way to tell which startups are going somewhere is to follow the money. Venture capitalists don’t back losers.”

    Venture capitalists frequently invest money in companies that don’t work out. Oftentimes the strategy is to invest a lot of money to give the market the impression that things are working, in hopes of creating a self-fulfilling prophecy. Just look at Ayasdi, which raised $100+ million. How did that work out?

    I can’t say whether or not DataRobot is doing well, but I wouldn’t get tricked into thinking venture capital money is the sign of success.

    • Thanks for reading. You are correct that VCs are not clairvoyant. Most startups never make it to an IPO, which means that many venture investments don’t produce a return. VCs counter this by spreading their bets across many different companies, figuring that one unicorn more than offsets many failed investments. VCs aren’t all geniuses, and they can have mixed motivations, but they don’t invest in ventures unless they expect them to work out.

      At the seed level, of course, it’s all speculative. But once a company gets to venture rounds, investors want to see solid commercial goals. Startups that accomplish their goals get another round. Ayasdi proves the point that a firm’s success in raising capital is a sign of health. Ayasdi’s failure to raise new funding in 2017 was an early indicator of problems at the firm. Ayasdi never raised a “D” round because it was unable to convince investors that it had a viable story. While raising the “C” round they told investors they would accomplish X, Y, and Z. When they failed to do that, investors ran in the other direction.

      Raising capital is a necessary but not sufficient condition for success. Any company can go off the rails, for many different reasons. A fresh funding round means that a startup has, to date, met or exceeded investor expectations. That’s not a guarantee of continued success — nothing is certain in life except death and taxes. But would you rather work for a company with a track record of success, or one with a history of failure?

  • Thomas, I enjoy reading your blog. A couple of questions on Alteryx. What are you basing your comment that “account expansion is minuscule” on? They have consistently reported a net expansion rate (or dollar-based retention rate) of over 130%, which is very strong.

    Also, with just ~4,300 customers, do you think they are in danger of running out of new logos? Tableau has something like 85K customers and is still adding several thousand each quarter.


    • Ty, thanks for reading and commenting. Yes, I’ve seen Alteryx’s dollar-based net revenue retention rate. (It’s in the 10-Q.) Agree that 130% is pretty good. However, Alteryx’s overall average revenue per customer is very low, and it’s not growing. The average Alteryx customer produces less than $50K per year. That’s nine Designer seats or a couple of seats for geospatial analysis.

      I don’t see that comparison to Tableau is germane. Tableau appeals to a much broader audience than Alteryx. Alteryx just added visualization capabilities and will argue that customers should buy one tool instead of two. But tools like Tableau, PowerBI, and Qlik have a much stronger lock on the end user. It’s just as likely that the three viz leaders will improve their data prep capabilities and make Alteryx obsolete.

      The Alteryx functions that are most widely used are mapping and geospatial analysis, Alteryx’s DNA. That application has a narrower market than general purpose viz.

      • Thomas,

        Full disclosure: I’m an engineer at Alteryx.
        I worked on the first connectors with your current employer.

        What are your thoughts on the Alteryx integration of Jupyter in a new Python tool? You didn’t mention that. Also, Alteryx released Promote, which is the Yhat tech. It’s definitely in use with some large customers who spoke publicly about it.

        Just curious. I like your honest feedback.

        Thanks and love your blog.

      • Thanks for reading, and for commenting.

        First, I think that Alteryx is doing really well as a company. It’s really good for data prep and for mapping.

        Jupyter integration is meh. It doesn’t make Alteryx into a data science collaboration platform like Dataiku, Domino Data, Databricks, or Cloudera Data Science Workbench. It smells like a check-off-the-box thing.

        Promote has about 20 customers, almost all inherited from Yhat. There’s not a lot of evidence that Alteryx can successfully sell this into the existing Alteryx base. Time will tell. It’s an open secret that Alteryx and H2O.ai will announce a partnership soon. That could be interesting.

        Alteryx has low appeal for data scientists. Just 14% of Alteryx reference customers surveyed by Gartner last year said that data scientists use the product. That’s by far the lowest rate of use among the 16 vendors in last year’s MQ. On the other hand, Alteryx has the highest appeal for business analysts. It’s very difficult for any one vendor to serve the needs of both user constituencies. Alteryx would be much better off sticking to its core strength appealing to business users.

  • Dotdotdotdashdashdashdotdotdot

    Nick Lisi is gone from SAS. Thoughts?

    • Several.

      (1) Businesses don’t replace the Chief Sales Officer when sales meet expectations.

      (2) Lisi was a Carl Farrell protege.

      (3) Dave Macdonald, the new CSO, is a sharp guy.

      — Thomas

      • Dotdotdotdashdashdashdotdotdot

        If calling Dave Mac a sharp guy is meant to imply that Lisi and Farrell failed because they are not sharp, then I respectfully disagree. Lisi and Farrell did not fail because they are not sharp. They failed because they were tasked with the impossible: specifically, they were trying to sell evolutionary solutions in a market thirsting for revolutionary solutions.

  • I agree that Farrell and Lisi had impossible tasks, and still think that Dave Mac is a sharp guy.

    The problem, as you note, is that Goodnight wants double-digit growth. And he doesn’t want to hear criticism of the product or pricing from Sales or Marketing. Or customers, for that matter.

    — Thomas
