Machine learning (ML) and deep learning (DL) content from the past 24 hours.
This is not a political site, but the election raises some interesting issues on topics we cover: data, advanced analytics, polls and venture capital.
— The election was an epic fail for poll aggregation and forecasting sites like FiveThirtyEight, most of which got the election completely wrong. On KDnuggets, Gregory Piatetsky surveys the damage.
The problem is GIGO; no analysis technique can produce an accurate prediction when the data is flawed. Surveys are subject to many forms of bias: sampling, response, question wording, question order. Social scientists know how to conduct scientific surveys, but polling firms take shortcuts to save time and cut costs. Averaging poll results reduces noise from statistical sampling error and partially compensates for house effects, but does not correct for a systematic bias that cuts across the polling industry.
— Data scientists who claim to have transformed political campaigns may also have to take another look at their methods; data-driven GOTV campaigns didn’t get the voters out. In an O’Reilly podcast, Andrew Therriault touts his ill-timed book.
— Today’s surging financial markets gut the premise of John Shieber’s sky-is-falling post-election outlook for Silicon Valley. Shieber frets about the future of the H-1B visa, a program that keeps people in a kind of indentured servitude so that Infosys, IBM, Tata et al. can line their pockets.
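To make the earlier point about poll averaging concrete, here is a minimal simulation sketch. Every number in it (the true vote share, the size of the industry-wide bias, the spread of house effects, the sample size) is a made-up assumption for illustration, not data from any real poll, and the function names are hypothetical. It shows that averaging many polls narrows the noise but leaves the average centered on the biased value rather than the true one.

```python
import random
import statistics


def simulate_poll(true_share, shared_bias, house_effect, n, rng):
    """Simulate one poll of n respondents.

    Each respondent reports support with probability
    true_share + shared_bias + house_effect (clamped to [0, 1]).
    """
    p = min(1.0, max(0.0, true_share + shared_bias + house_effect))
    supporters = sum(1 for _ in range(n) if rng.random() < p)
    return supporters / n


def average_polls(num_polls, true_share=0.52, shared_bias=-0.03,
                  house_sd=0.01, n=1000, seed=0):
    """Average num_polls simulated polls (num_polls must be at least 2).

    shared_bias is the systematic error common to every pollster.
    house_sd is the standard deviation of each pollster's own house effect.
    Returns (mean of poll results, standard deviation across polls).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(num_polls):
        house_effect = rng.gauss(0.0, house_sd)  # differs per pollster
        results.append(
            simulate_poll(true_share, shared_bias, house_effect, n, rng)
        )
    return statistics.mean(results), statistics.stdev(results)


if __name__ == "__main__":
    for k in (5, 50, 500):
        mean, spread = average_polls(k)
        print(f"{k:4d} polls: average={mean:.3f}, spread across polls={spread:.3f}")
```

Under these assumptions, adding more polls pulls the average toward 0.49 (true share plus shared bias), not toward the true 0.52. Sampling error and house effects wash out in the average; the shared bias does not.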
Machine Intelligence Landscape
Dave Ramel writes: The popular Apache Spark project is poised to break from the Hadoop ecosystem as an independent data processing tool, and it may shift from on-premises installations to the cloud, according to new research.
It seems that Dave did not read the Databricks Spark user surveys from 2015 or 2016, which show that Spark has already broken from the Hadoop ecosystem, and has already moved from on-premises installations to the cloud. In fact, today more people use Spark as a standalone processing platform than use it with Hadoop, and more people use it in the cloud than on premises.
Other than that, Dave writes a fine story.
— Shohei Hido describes Chainer, an open source framework for deep learning.
— Libby Kinsey interviews GraphCore CTO Simon Knowles, who explains how to build a processor for machine learning.
— Nimbix uses NVIDIA Pascal GPUs in its HPC cloud service, while Microsoft Azure uses GPUs based on the older Kepler and Maxwell architectures.
— Chris Hannam explains the use of predictive algorithms to track real-time health trends.
— Sven Krasser of CrowdStrike describes how machine learning can address cyber security issues.
— Another startup, Darktrace, uses ML to filter noise on IT networks and spot emerging cyber threats.
— eToro, a social trading network, offers investment funds consisting of assets selected with machine learning.
— Researchers at Oak Ridge National Laboratory use deep learning running on the Titan supercomputer to extract knowledge from cancer pathology reports.
— Scientists at Hanyang Institute in Korea develop a deep learning system for more accurate blood pressure readings.
— SAP burns bandwidth to tout three underwhelming machine learning initiatives.