Tag Archives: SparkR

Big Analytics Roundup (May 2, 2016)

Movidius ups the ante for trade show trinkets by releasing what journos describe as supercomputing, neural computing power, vision processing, deep learning, and artificial intelligence on a USB drive. Roundup here. Last November, IBM’s Paul Zikopoulos snarked at Cloudera for not supporting SparkR. Cloudera’s Sean Owen, responding to a query in the Cloudera Community, notes that SparkR “does not work with other resource managers,” and

Big Analytics Roundup (April 18, 2016)

In hard news this week, Storm hits a milestone with Release 1.0, Google releases TensorFlow 0.8 with distributed computing support, and DataStax announces DataStax Enterprise Graph. And, following on NVIDIA’s DGX-1 announcement last week, there are a number of items on Deep Learning featured below. Deep Learning: Adrian Colyer summarizes a paper that summarizes 900 other papers on Deep Learning.

Big Analytics Roundup (April 4, 2016)

Strata + Hadoop World sparks a number of commercial announcements: AtScale has a new release, Microsoft previews R Server on HDInsight, and IBM puts Spark on a mainframe, FWIW. We also have a nice harvest of explainers and perspectives. Slides from Strata available here. The folks at Domino Data ask: Is XGBoost 10X faster than H2O? We’ll never know the answer, since they

R Interface to Apache Spark

The team at AMPLab has announced a developer preview of SparkR, an R package enabling R users to run jobs on an Apache Spark cluster. Spark is an open source project that supports distributed in-memory computing for advanced analytics, such as fast queries, machine learning, streaming analytics and graph engines. Spark works with every data format supported in Hadoop, and supports YARN 2.2. SparkR exposes the Spark
