Thomas Dinsmore's Blog

Apache Spark is 1.0

May 30, 2014

Today, the Spark project announced the availability of Apache Spark 1.0.0, the first major release since the Apache Software Foundation named Spark a top-level project. (Additional announcements here, here and here.) With 117 contributors, Spark continues to build critical mass and engagement in the data science community.

Features of the new release include:

- API stability
- Integration with YARN security
- Operational and packaging improvements
- Spark SQL (Alpha)
- MLlib enhancements, including:
  - Support for sparse feature vectors
  - Scalable decision trees for classification and regression
  - Distributed SVD and PCA
  - Model evaluation functions
  - L-BFGS optimization primitive
- GraphX enhancements, including performance improvements in graph loading, edge reversal and neighborhood computation
- Streaming enhancements, including optimized performance for stateful stream transformations, improved Flume support and automated state cleanup for long-running jobs
- Extended Java and Python support
- Significant improvements to documentation

…and many small improvements, documented in the Release Notes.

For more information on Spark, read this backgrounder.
