Tag Archives: Apache Giraph

2014 Predictions: Advanced Analytics

A few predictions for the coming year. (1) Apache Spark matures as the preferred platform for advanced analytics in Hadoop. Spark will achieve top-level project status in Apache by July; that milestone, together with inclusion in Cloudera CDH5, will validate the project’s rapid maturation.  Organizations will increasingly question the value of “point solutions” for Hadoop analytics versus Spark’s integrated platform

Read more

Apache Spark for Big Analytics (Updated for Spark Summit and Release 1.0.1)

Updated and bumped July 10, 2014. For a powerpoint version on Slideshare, go here. Introduction Apache Spark is an open source distributed computing framework for advanced analytics in Hadoop.  Originally developed as a research project at UC Berkeley’s AMPLab, the project achieved incubator status in Apache in June 2013 and top-level status in February 2014.  According to one analyst, Apache Spark is among the five

Read more