Tag Archives: Python

Big Analytics Roundup (December 7, 2015)

Cloudera’s expanded Spark support leads the news this week, together with a Data Science Virtual Machine from Microsoft.  Neural network devotees will be pleased to see that Keras now runs on TensorFlow. On the Databricks blog, H2O.ai’s Michal Malohlava describes Sparking Water, a Spark package that enables data scientists to build machine learning pipelines that integrate Spark and H2O functions.

Read more

Big Analytics Roundup (November 9, 2015)

My roundup of the Spark Summit Europe is here. Two important events this week: H2O World starts today and runs through Wednesday at the Computer History Museum in Mountain View, CA. Yotam Levy summarizes here and here. Open Data Science Conference meets November 14-15 at the Marriott Waterfront in SFO. Five backgrounders and explainers: At HUG London, Apache’s Ufuk Celebi

Read more

Big Analytics Roundup (July 27, 2015)

Top stories this week: Palantir’s valuation grows, Continuum Analytics gets a bump, Cloudera announces a Python interface for Impala, and we have a winner in KDD Cup 2015. Nate Desmond chronicles Palantir’s $15 Billion growth story just as the company hits $20 Billion. Conversion Logic wins the KDD Cup 2015, which L.A. Biz characterizes as the “Nerd Olympics”. Here’s a picture

Read more

Big Analytics Roundup (June 8, 2015)

With Spark Summit 2015 coming up in San Francisco next week, expect lots of announcements in the coming week from vendors seeking to catch the wave. In HBR, Narrative Science CEO Stuart Frankel argues that the companies driving big salaries for data scientists are stupid. Okay, he doesn’t actually say they’re stupid, but clearly thinks that data scientists aren’t worth the

Read more

Big Analytics Roundup (June 1, 2015)

The Open Data Science Conference launched successfully in Boston this past weekend, attracting more than 1,200 attendees.  Sponsors included Booz Allen, Continuum Analytics, DataRobot, McGraw Hill Education and RStudio, among others.  Organizers plan additional events this year in Boston and San Francisco. Mary Meeker releases her latest Internet Trends Report. In Forbes, Louis Columbus rounds up analyst coverage of the Big Analytics

Read more

Big Analytics Roundup (May 25, 2015)

This week features new releases from Drill and Hive, plus announcements from DataStax and MemSQL. Andrew Brust summarizes the SQL options presented by Drill, Hive and Spark, noting that Drill’s “SQL everywhere” approach and DBMS vendors’ federated engines make the term “SQL on Hadoop” obsolete. Gartner surveys its panel of 284 people who rely on Gartner and concludes that Hadoop

Read more

Python for Analytics

A reader complains that I did not include Python in a survey of Machine Learning in Hadoop.  It’s a fair point.  There was a lively debate last year between R and Python advocates, variously described as a war or a boxing match.  Matt Asay argued that Python is displacing R; Sharon Machlis and David Smith countered.  In this post I review the

Read more

Automated Predictive Modeling

A colleague asks: can we automate predictive modeling? How we answer the question depends on the context. Consider the two variations on the question below, with more precise wording: Can we completely eliminate the need for expertise in predictive modeling — so that an “ordinary business user” can do it? Can we make expert analysts more productive by automating

Read more