Big Analytics Roundup (May 23, 2016)

Google announces that it has designed an application-specific integrated circuit (ASIC) expressly for deep neural nets. Tech press goes bananas. The chips, branded Tensor Processing Units (TPUs), require fewer transistors per operation, so Google can fit more operations per second into the chip. In about a year of operation, Google has achieved an order-of-magnitude improvement in performance per watt for machine learning.

Google’s Felipe Hoffa summarizes Mark Litwintschik’s work benchmarking different platforms with the New York City Taxi and Limousine Commission’s public dataset of 1.1 billion trips. So far, Mark has tested PostgreSQL on AWS, Elasticsearch on AWS, Spark on AWS EMR, Redshift, Google BigQuery, Presto on AWS and Presto on Cloud Dataproc. The results make Google look good, but you should read Mark’s original posts.

Meanwhile, IBM fires more people. More here and here.

Open Data Science Conference

The second annual Open Data Science Conference (ODSC) East met in Boston over the weekend. Attendance doubled from last year, to 2,400.

Registration was a snafu, because the conference organizers did not accurately predict walk-in traffic or staffing needs. The jokes write themselves.

Content was excellent. Keynoters included Stefan Karpinski (Julia co-creator); Kirk Borne of Booz Allen Hamilton; Ingo Mierswa, CTO of RapidMiner; and Lukas Biewald, CEO of CrowdFlower. Track leaders included JJ Allaire and Joe Cheng of RStudio, Usama Fayyad of Barclays and John Thompson of the US Census Bureau. Sponsors included Basis Technology, CartoDB, CrowdFlower, Dataiku, DataRobot, Dato, Exaptive, Facebook, H2O.ai, MassMutual, McKinsey, Metis, Microsoft, RapidMiner, SFL Scientific and Wayfair.

Prompted by a tweet, I stopped at the Dataiku table. The conversation went like this:

  • Me: What does Dataiku do, in 25 words or less?
  • Dataiku: DataRobot.
  • Me: What?
  • Dataiku: We do what DataRobot does.

At this point, it was clear to me that Mr. Dataiku either did not know what DataRobot does, or thought I didn’t know what DataRobot does. So I changed the subject.

The next ODSC event is in October, in London.

Explainers

— Michael Armbrust and Tathagata Das explain Structured Streaming in Spark 2.0

— Adrian Colyer goes 5 for 5 for the week.

— Tim Hunter, Hossein Falaki and Joseph Bradley explain HyperLogLog and Quantiles in Spark.

— Microsoft’s Raymond Laghaeian explains how to use Azure ML predictions in Google Spreadsheet.

Perspectives

— Serdar Yegulalp cites PayScale data in noting that if you know Scala, Go, Python and Spark you can expect to make more money.

— Tim Spann weighs the advantages of Java and Scala, and explains DL4J.

— Sam Dean celebrates Drill’s first anniversary.

— Taylor Goetz delivers a brief history of Apache Storm.

Open Source Announcements

— MongoDB releases a new Spark Connector.

— Apache Tajo announces Release 0.11.3, with five bug fixes.

— Apache Mahout announces Release 0.12.1, a maintenance release that resolves an issue with Flink integration.

Commercial Announcements

— RedPoint Global snags a $12 million “C” round.

— TIBCO announces something called Accelerator for Apache Spark, a bundle of tools that connect TIBCO products with open source packages. While TIBCO refers to this component as open source, the software is available only to TIBCO customers, which means it isn’t Free and Open Source.

— MapR applauds itself.

One comment

  • Thomas,

    “The jokes write themselves.” I’m laughing my butt off here in TN!

    Bob

