Tag Archives: Apache Spark

Roundup 10/28/2016

Machine learning (ML) and deep learning (DL) content from the past 24 hours. Plus, some AI stuff. I’m publishing a three-part series on the state of enterprise machine learning in The Next Platform. Part one is here. Good Reads — Helen Beers explains AI in the second part of a series. Part one is here. — Market Realist publishes a twelve-part

Spark 2.0.1 Arrives

The Spark team announces the availability of Spark 2.0.1, a maintenance release with more than 300 stability and bug fixes. Release notes here; list of changes here. Also, Databricks’ Jules Damji publishes the latest bi-weekly roundup of Spark news from around the web; and CERN’s Luca Canali investigates Spark 2.0 performance improvements.

Machine Learning Roundup (October 3, 2016)

Machine learning (ML) and deep learning (DL) content from Friday and the weekend. Scroll to the bottom for job postings. ICYMI, the roundup is now daily and focuses solely on machine learning and deep learning. Top stories from last week: — Google releases Cloud Machine Learning to public beta. — NVIDIA introduces System-on-Chip for Autonomous Vehicles. — Amazon, Facebook, Google,

Databricks Releases Spark Survey

In a press release and blog post, Databricks announces results from its 2016 Spark Survey. Databricks surveyed 1,615 Spark users and prospective users in July 2016. Respondents include data engineers, data scientists, architects, technical managers, and academics. Key findings from the survey: Spark SQL remains the most widely used component. 88% use Spark SQL; 71% use Spark Streaming; 71% use

Big Analytics Roundup (September 6, 2016)

Jim Kyung-Soo Liew and Tamas Budavari of Johns Hopkins ask whether Tweet sentiments still predict the stock market. Short version: they do, but the market has arbitraged away any advantage from trading on the information. So there you have it: the stock market is efficient with respect to fundamental information, technical information, and Tweets. Enterra’s Stephen DeAngelis celebrates the “Algorithmic

Big Analytics Roundup (August 29, 2016)

TechCrunch reports results of a new study, which says that you really don’t need a co-founder after all. Next, they’ll be telling us we don’t need to floss.

Python and R

Matt Asay argues that Python is a gateway language that leads data scientists to R (h/t Oliver Vagner). That’s oversimplified and mostly incorrect. The breadth of R’s analytics functionality tends

Big Analytics Roundup (August 22, 2016)

MIT Technology Review reports that Chicago’s experiment in predictive policing isn’t working. Data scientists developed a list of a few hundred people likely to commit a shooting; police, however, ignored the predictions, primarily because nobody told them what to do with individuals on the list. The report illustrates a fundamental truth about data science: no amount of insight matters unless your

Big Analytics Roundup (August 1, 2016)

There are two big stories this week: Apache Spark 2.0 and Apache Mesos 1.0. There’s also a new release from Kylin, and a nice crop of explainers. IEEE Spectrum publishes its third annual ranking of top programming languages, based on twelve metrics drawn from Google Search, Google Trends, Twitter, GitHub, Stack Overflow, Reddit, Hacker News, CareerBuilder, Dice, and the IEEE

Spark 2.0 Released

The Apache Spark team announces the production release of Spark 2.0.0. Release notes are here. Read below for details of the new features, together with explanations culled from Spark Summit and elsewhere. Measured by the number of contributors, Apache Spark remains the most active open source project in the Big Data ecosystem. The Spark team guarantees API stability for all production
