Category Archives: Thoughtware

Predicting the 2019 MQ

The die is cast. Last month, Gartner selected 16 vendors to include in its 2019 Magic Quadrant for Data Science and Machine Learning. Now, as Gartner prepares to publish the report early next year, I think it will be fun to make some predictions about how each vendor will fare. Some ground rules. I’m not going to talk about DataRobot,

Benchmark: Spark Beats MapReduce

A group of scientists affiliated with IBM and several universities report on a detailed analysis of MapReduce and Spark performance across four different workloads. In this benchmark, Spark outperformed MapReduce on Word Count, k-Means, and PageRank, while MapReduce outperformed Spark on Sort. On the ADT Dev Watch blog, Dave Ramel summarizes the paper, arguing that it “brings into question ... Databricks Daytona GraySort claim”. This point refers to Databricks’ record-setting

O’Reilly Data Science Survey 2015

O’Reilly releases its 2015 Data Science Salary Survey. The report, authored by John King and Roger Magoulas, summarizes results from an ongoing web survey. The 2015 survey includes responses from “over 600” participants, down from the “over 800” tabulated in 2014. The authors note that the survey includes self-selected respondents from the O’Reilly audience and may not generalize to the

Spark 1.4 Released

On June 11, the Spark team announced availability of Release 1.4. More than 210 contributors from 70 different organizations contributed more than 1,000 patches. Spark continues to expand its contributor base, the best measure of health for an open source project.

Spark Core

The Spark team continues to improve Spark operability, performance and compatibility. Key enhancements include: The first phase in
