SAS Peddles Open Source FUD

This appeared in my Twitter feed recently:

Challenge…accepted. You can find a copy of SAS’ report here, and Matt Asay’s excellent analysis here. Grab your popcorn; here are four quick points.

(1) For SAS, it’s progress.

The report opens with this:

Open source technologies, like Hadoop, R, and Python, have been vital to the spread of big data.

That’s quite an admission for SAS, a company that embraces open source the way Donald Trump embraces Angela Merkel. You may remember this comment by SAS executive Anne Milley a few years ago:

We have customers who build engines for aircraft. I am happy they are not using freeware when I get on a jet.

That comment drew a lot of attention. Astute observers pointed out that SAS bundled Apache Tomcat and other open source software into its products. Hilarity ensued.

(2) But first, a parade of horribles.

SAS wants you to know that open source users are in for a heap of trouble.

Deploying open source software at an enterprise level is challenging.

Commercial software, on the other hand, is so easy to deploy a caveman can do it. Just ask anyone who implements SAP.

Projects may be derailed by the requirement to scale up to operational levels of reliability and performance using a complex set of open source tools – tools that require coding experts who are notoriously hard to find.

You know what’s even harder to find? Data scientists who use SAS.

The report continues by quoting former SAS executive Robin Way. Robin currently leads Corios, a consulting firm with Gold status in SAS’ partner network; he has spent his entire career working for SAS and its partners.

Not that there’s anything wrong with that.

Robin details the trials of those who venture down the open source path:

In one example…a bank took four years to consider the roadmap for open source analytics…

Four years. For a roadmap. They should have hired me.

…but decided that the total cost of ownership would be more expensive than using a single consolidated platform…In this case, the platform chosen was SAS.

There are some good reasons to use SAS. Saving money is not one of them.

In another example, an insurer may have to abandon 18 months’ work after standardizing all new model development in Python and all model deployment in Scala. Despite substantial work, none of their Scala model translations match the results from their Python model development, and nobody in the company knows how to fix this problem.

That story is ridiculous on multiple levels:

— Nobody spends 18 months building code without testing the solution first.

— Why bother to translate Python to Scala? Just deploy the Python code.

— Better yet, train models in Apache Spark or H2O from the Python API and deploy them as POJOs under the Scala API.

We’re expected to believe that this insurance company was totally hot for open source Python and Scala, but never heard of Spark and H2O? Please.
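For what it's worth, catching that kind of mismatch early does not take much. Here is a minimal sketch of a parity check in Python, assuming a toy logistic scorer stands in for the development model and a second, independently written scorer stands in for the translated deployment code. Every name in it (`score_reference`, `score_translated`, `parity_report`, the weights, the sample rows) is hypothetical and made up for illustration; none of it comes from the insurer's project or from SAS' report.

```python
import math

# Hypothetical model parameters and sample rows, invented for illustration.
WEIGHTS = [0.8, -0.4, 1.1]
BIAS = 0.1
ROWS = [[0.5, 1.2, -0.3], [2.0, 0.0, 1.0], [-1.5, 0.7, 0.2]]


def score_reference(row):
    """Toy logistic scorer standing in for the development model."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, row))
    return 1.0 / (1.0 + math.exp(-z))


def score_translated(row):
    """Independent rewrite standing in for the translated deployment code."""
    z = BIAS
    for w, x in zip(WEIGHTS, row):
        z += w * x
    return math.exp(z) / (1.0 + math.exp(z))


def parity_report(rows, reference, candidate, tol=1e-9):
    """Return (index, reference_score, candidate_score) for every row
    where the two scorers disagree by more than the tolerance."""
    mismatches = []
    for i, row in enumerate(rows):
        a = reference(row)
        b = candidate(row)
        if not math.isclose(a, b, rel_tol=tol, abs_tol=tol):
            mismatches.append((i, a, b))
    return mismatches


if __name__ == "__main__":
    report = parity_report(ROWS, score_reference, score_translated)
    print("mismatches:", len(report))
```

Run something like this against a held-out sample when the first model is translated, and a bad translation shows up in minutes rather than after 18 months.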

Meanwhile, here’s what Gartner says about implementing and managing SAS:

The SAS products considered in this Magic Quadrant are difficult to manage. Over half of SAS’s reference customers pointed to difficulties with the initial deployment or version migration. Several indicated instability and bugs.

Now there’s a real parade of horribles.

(3) The “true costs” of open source software.

SAS cautions repeatedly that executives aren’t considering the “true costs” of open source software:

The anecdotal evidence is mounting that the true costs of open source are not necessarily understood and that organizations are underestimating the range of considerations when it comes to deploying these systems and keeping them operational.

That sentence would be more credible if it were supported with, you know, actual anecdotes.

Many organizations aren’t taking into account all the relevant factors when it comes to the true cost of open source, leaving them potentially exposed – for example, only half currently take into account the time to fix/resolve issues…

SAS, of course, is completely free of defects and never requires time to fix/resolve issues.

…and a minority (41%) contemplate the need to replace expertise if employees leave.

This is a very silly point. There are more people in the job market with open source skills than with SAS skills.

Despite the widely accepted difference in skills required for open source…only around six in 10 are currently taking into account open source training costs for employees or hiring costs for open source practitioners.

I like that part about “widely accepted.” Here in Boston, one of the big insurance companies has a large investment in SAS. Every junior data scientist they hire already knows R, and most know Python. For SAS, however, they have to send them out for training.

While SAS drops hints about the “true costs” of open source software, it offers no evidence for these “hidden” costs anywhere in the report. Let’s go back to Gartner for this nugget about the “true cost” of SAS:

SAS’s pricing remains a concern. Open-source data science platforms are often used along with SAS’s products as a way to control costs, especially for new projects.

With SAS, you don’t have to worry so much about “hidden” costs. They’re right out front, staring you in the face, every time you get an invoice.

(4) SAS doesn’t believe its own FUD.

Does SAS seriously believe that open source software poses a security risk? If so, they wouldn’t want customers to install their software on open source operating systems, since a software stack is only as secure as its weakest component.

Let’s check out Viya, the company’s highly touted new product:

Supported Operating Systems

The following operating systems are supported:

  • Red Hat Enterprise Linux 6.7 (64-bit) and later within 6.x
  • Red Hat Enterprise Linux 7.1 and later within 7.x
  • Oracle Linux 6.7 and later within 6.x
  • Oracle Linux 7.1 and later within 7.x

I guess they’re still working on the Windows version.

Presumably, SAS doesn’t want you to install Viya on a free version of Linux:

If you use an alternative operating system, you must have the appropriate skills to resolve differences between the supported operating system and the alternative operating system. By using an alternative operating system, you acknowledge that you can resolve the differences inherent in that alternative system.

That’s kind of a soft disclaimer, isn’t it? It’s standard CYA boilerplate, not “Russian hackers will rob you blind if you put this on Debian.”

Let’s move on to supported databases. Hackers rarely target analytic applications, because the workloads are transient. But a hacker might want to use an analytic tool to get inside a database to mine the really good stuff. SAS leverages database security, so I assume it only works with commercial databases, which, as you know, are totally unhackable:

Supported Data Sources

SAS Viya supports the following data sources:

  • Apache Hive
  • Impala
  • Oracle
  • PC files
  • PostgreSQL
  • Teradata
  • Other data sources accessible via ODBC drivers

So much for that theory.

But at least with open source databases, the risks are out in the open. I’m sure that Viya has no open source dependencies:

Java Requirements

The Java Runtime Environment (JRE) must be installed on every machine in your deployment. Only the JRE is required, not the full JDK. The following versions are supported:

  • Oracle JRE SE version 1.8.x
  • OpenJDK version 1.8.x

Note: This open-source version of Java is included with Linux.


Security is paramount to SAS. They wouldn’t use open source software for that.

Apache httpd

The deployment automatically installs Apache httpd from Linux repositories if it is not detected on the machines that you designate as targets for the HTTP Proxy installation. Apache httpd is required to create the Apache HTTP Server, which provides security and load balancing for multiple SAS Viya components.


I’m flying to Amsterdam in September. I sure hope KLM doesn’t use SAS.



  • Interested Onlooker

    That’s so funny… There’s nothing like an unbiased analysis 🙂

    You were doing so well until your last sentence… “flying to Amsterdam on Sunday.” Now we all know why you’re going 😉

    • It says “flying to Amsterdam in September,” not Sunday. So no, I’m not attending the event – it sounds interesting, but I’m not affiliated with them. It’s a vacation with my family. We’re planning to spend time with the Rembrandts, the Vermeers, and the Van Goghs.

  • Really fun reading your rebuttal and SAS does seem to be on the wrong side of the argument. Given that the entrenched enterprise solutions are being rapidly obsolesced, I have to wonder if FOSS is ready to fill the gap? My guess is that tools like Spark, R, and H2O are already displacing SAS so the answer is ‘absolutely’, but there are tradeoffs: complexity, integration, complexity, etc. Playing out this transition in the industry, I think we’re in for some interesting times.

    • LOL.

      According to Google Translate, the user group working with SAS is “bulky and diverse.”

      It says that KLM uses SAS “to ensure that passengers get the newspaper of their choice during their flight.” That seems like overkill. Usually, they just throw them in a pile on the jet bridge and let people take what they want.

  • Surveys of working data scientists show that FOSS tools have already displaced SAS.
