Databricks Releases Spark Survey

In a press release and blog post, Databricks announces results from its 2016 Spark Survey. Databricks surveyed 1,615 Spark users and prospective users in July 2016. Respondents include data engineers, data scientists, architects, technical managers, and academics.

Key findings from the survey:

  • Spark SQL remains the most widely used component.
    • 88% use Spark SQL
    • 71% use Spark Streaming
    • 71% use MLlib (machine learning)
  • Respondents value Spark’s performance and advanced analytics.
    • 91% rate performance very important
    • 82% rate advanced analytics very important
    • 76% rate ease of programming very important
    • 69% rate ease of deployment very important
    • 51% rate real-time streaming very important
  • Production use has increased markedly since 2015.
    • 40% use SQL in production, up from 24%
    • 38% use DataFrames in production, up from 15%
    • 22% use streaming in production, up from 14%
    • 18% use machine learning, up from 13%
  • So has usage in the public cloud.
    • 61% use Spark in the public cloud, up from 51% in 2015
  • Usage of Spark deployed on-premises has declined.
    • 42% use Spark in a standalone deployment, down from 48%
    • 36% use Spark under YARN, down from 40%
    • 7% use Spark on Apache Mesos, down from 11%
  • The Scala API remains the most popular, followed closely by the Python API.
    • 65% use Scala, down from 71% in 2015
    • 62% use Python, up from 58%
    • 44% use SQL, up from 36%
    • 29% use Java, down from 31%
    • 20% use R, up from 18%
  • While Linux remains the most popular OS, Mac and Windows usage is growing rapidly.
    • 74% use Linux/Unix, down from 75% in 2015
    • 32% use Windows, up from 23%
    • 22% use Mac OS X, up from 14%

The report also includes statistics about the Spark community at large.

  • Databricks reports growth in the contributor base from 600 in 2015 to 1,000 in 2016, a figure that does not seem to square with the statistics reported in OpenHub.
  • Spark Meetup membership grew from 66,000 in 2015 to 225,000 in 2016.
  • Spark Summit attendance grew from 3,912 to 5,100.

For a copy of the report and an infographic, go here.
