Tag Archives: Spark SQL

Big Analytics Roundup (September 26, 2016)

Note to readers: Recently, I’ve noticed that news about events that occur on Tuesdays seems stale by the time I publish on Monday. Beginning this week, I’m shifting to a new publication model, posting analysis of events as they happen instead of a weekly roundup. You could say I’m switching from batch updates to real-time updates, which should please Nathan Marz.

Read more

Big Analytics Roundup (September 19, 2016)

Many thanks to Australia’s Dez Blanchfield for his contributions to this roundup. We set out to create a special “Australia/APAC” edition; however, most of the stories have a global interest: chips are chips and deep learning is deep learning wherever you live. We did find this story, profiling a Tasmanian oyster farm that uses Microsoft’s IoT hub. Well, that’s embarrassing. MapR’s

Read more

Spark 2.0 Released

The Apache Spark team announces the production release of Spark 2.0.0. Release notes are here. Read below for details of the new features, together with explanations culled from Spark Summit and elsewhere. Measured by the number of contributors, Apache Spark remains the most active open source project in the Big Data ecosystem. The Spark team guarantees API stability for all production

Read more

Big Analytics Roundup (February 29, 2016)

Happy Leap Day. Tachyon’s rebranding as Alluxio, the release of CaffeOnSpark, and GA for Google Cloud Dataproc lead the hard news this week. The Alluxio announcement has inspired big thinkers to share big thoughts. And we have a nice crop of explainers. Scroll down to the bottom for another SQL on Hadoop benchmark. Explainers: In SearchDataManagement, Jack Vaughan explains Spark

Read more

Spark 1.6 Released

The Spark team announces the release of Spark 1.6.0. For a full list of new features, review the release notes here. On the Databricks blog, Michael Armbrust, Patrick Wendell and Reynold Xin announce the release and summarize key enhancements. In November, the same authors announced a preview of the release. The contributor base continues to grow, as shown in the chart

Read more

2015 in Big Analytics

Looking back at 2015, a few stories stand out: steady progress for Spark, punctuated by two big announcements; solid growth in cloud-based machine learning, led by Microsoft; and expanding options for SQL and OLAP on Hadoop. In 2015, the most widely read post on this blog was Spark is Too Big to Fail, published in April. I wrote this post in

Read more