Tag Archives: MapReduce

2016 Big Analytics Predictions Roundup

Before publishing my own predictions for 2016 later this week, I thought it would be fun to round up published predictions on analytics and Big Data. Looking through this list, I see a few patterns:
- Streaming is hot. Analysts do not seem to understand distinctions between streaming data, streaming analytics and real-time decisioning.
- “Data Science” continues to be a


Benchmark: Spark Beats MapReduce

A group of scientists affiliated with IBM and several universities report on a detailed analysis of MapReduce and Spark performance across four different workloads. In this benchmark, Spark outperformed MapReduce on Word Count, k-Means and PageRank, while MapReduce outperformed Spark on Sort. On the ADT Dev Watch blog, Dave Ramel summarizes the paper, arguing that it “brings into question ... Databricks Daytona GraySort claim”. This point refers to Databricks’ record-setting


Spark is Too Big to Fail

Reacting to growing interest in Apache Spark, there is a developing contrarian meme:
- David Ramel asks: are Spark and Hadoop friends or foes?
- Jack Vaughan compares Spark to the PDP-11, dismisses it as “just processing.”
- Doug Henschen praises Spark, pans Databricks
- Nicole Laskowski complains that Spark Summit East “felt like a Databricks show.”
- Andrew Oliver thinks Spark needs to grow up
- Andrew
