H2O.ai Releases Steam
With minimal fanfare, H2O.ai releases the Steam AI engine, which it describes as “…an end-to-end platform that streamlines the entire process of building and deploying smart applications.” Overall, the software looks like a reasonable product extension for data science teams committed to H2O for machine learning.
H2O.ai’s product strategy for Steam has evolved over time; as late as mid-August, the company planned to release Steam under a commercial license. In September, the company settled on an open source release under an AGPL license and laid off about ten inside sellers and account executives. The layoffs make sense in that context — there’s not much point in having sellers when there is no product to sell.
For cluster management, Steam can start and stop H2O clusters under YARN, or connect to predefined clusters. This capability decouples H2O software administration from Hadoop provisioning; however, it’s a stretch to characterize this as support for elastic computing. True elastic machine learning doesn’t simply decouple the software from provisioning, it manages the provisioning as well, scaling out according to the demands of the job. An H2O user who needs more computing power for a particular job will still have to contact an administrator.
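For context, launching an H2O cluster on YARN is ordinarily a manual step done with the h2odriver jar. The sketch below (plain Python, with illustrative node count, heap size, and output directory, not Steam's actual API) simply assembles that documented command line to show the step Steam automates:

```python
# Hedged sketch: build the hadoop command that launches an H2O cluster
# on YARN via the h2odriver jar -- the manual step Steam automates.
# The parameter values here are illustrative assumptions.
def h2o_yarn_launch_cmd(nodes, mapper_xmx, output_dir):
    """Return the h2odriver launch command as a list of arguments."""
    return [
        "hadoop", "jar", "h2odriver.jar",
        "-nodes", str(nodes),        # number of H2O nodes to start
        "-mapperXmx", mapper_xmx,    # JVM heap per node
        "-output", output_dir,       # HDFS directory for driver output
    ]

cmd = h2o_yarn_launch_cmd(3, "6g", "hdfsOutputDir")
print(" ".join(cmd))
```

Resizing that cluster means tearing it down and relaunching with a different `-nodes` value, which is why the capability falls short of true elasticity.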
While H2O.ai accurately states that Steam works on all major cloud platforms, it does so under an IaaS/PaaS model. In other words, a user or administrator must manually procure the needed compute instances. That is quite a contrast to managed services like Qubole (which scales out and back automatically) or Databricks (which offers self-service provisioning in an integrated notebook).
The model management capability enables an H2O user to save models, manually build a leaderboard, and compare model performance. For deployment, Steam lets the user publish models as services accessible either through an API or a REST interface. These are useful capabilities for organizations that plan to rely exclusively on H2O for machine learning.
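To make the leaderboard idea concrete, here is a minimal sketch (plain Python with made-up model names and metrics, not Steam's actual API) of the kind of ranking a manually built leaderboard produces:

```python
# Hypothetical sketch: rank saved models by a validation metric (AUC),
# the comparison a model leaderboard supports. Names and scores are
# invented for illustration.
models = [
    {"name": "gbm_v1", "auc": 0.871},
    {"name": "rf_v2",  "auc": 0.858},
    {"name": "glm_v1", "auc": 0.792},
]

# Higher AUC is better, so sort descending.
leaderboard = sorted(models, key=lambda m: m["auc"], reverse=True)

for rank, m in enumerate(leaderboard, start=1):
    print(f"{rank}. {m['name']}  AUC={m['auc']:.3f}")
```

The point of keeping such a ranking outside any single training tool follows in the next paragraph.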
For the record, however, I am skeptical of any model management and deployment facility that is tightly coupled to a single model training platform. Data scientists use diverse tools for machine learning; no single tool or platform meets all needs. An enterprise model management and deployment capability should manage all of an organization's models, regardless of the tools used to train them. That's not a criticism of H2O.ai (you can't blame vendors for moving forward with their own deployment tools); it's a caution to clients to avoid single-platform solutions to the model management and deployment problem.