The missing part of the Machine Learning revolution

Despite the widespread adoption of AI, scaling and deploying AI-based products is as hard as ever; but some new technology is looking to change that reality

Justin Gage
Towards Data Science

--

(Disclaimer: I’m not employed by Algorithmia or affiliated financially with the company in any way. I’m just someone with a Data Science background who finds the company compelling.)

There’s no doubt that we’re entering the age of AI, with Machine Learning touching almost everything we’re involved in on a day-to-day basis. Spurred on by step-change innovations in data storage and computing power, neural nets are back from the ’70s with a bang. Medicine, security, customer service, fraud detection, you name it — there are well-funded companies applying Machine Learning to improve and augment it. Heck, you might have even found this post through Medium’s Machine Learning-based recommender systems.

Deep Learning, for whatever reason, seems to work really well for a number of problems with immediate impact. You might even call it a revolution.

It’s getting easier to create deep learning models, but not to deploy them at scale

While data storage and the rise of Nvidia have certainly helped spur on this revolution, one of the major drivers of today’s state of Machine Learning is the ease with which you can actually create working, accurate models. Machine Learning is experiencing significant abstraction — new tools are making it easier than ever to get AI off the ground.

In addition to private companies like Clarifai and Indico that offer feature-rich APIs for specific tasks, the third-party package ecosystem in popular data science languages like R and Python is growing rapidly. Google released the initial version of TensorFlow in November 2015, and it has taken off like a rocket ever since, joining the already popular scikit-learn. For data scientists, creating sophisticated models within a testing environment has gotten much easier.

Unfortunately, that ease doesn’t extend beyond the IPython notebook where it starts. That’s because getting a Machine Learning model to work in production is a very different task from getting it to work on your computer. Deploying your models means getting them to work when called upon, at scale, and in the way you want them to. Creating theoretically accurate models is useless if they’ll fall apart once you start serving them to customers.
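To make the notebook-versus-production gap concrete, here is a minimal sketch of the split: a model is trained and serialized in one environment, then loaded and served in another. The `SentimentModel` class and its weights are purely illustrative stand-ins for a real estimator (a scikit-learn or TensorFlow model in practice), and real serving adds scaling, monitoring, and versioning on top of this.

```python
import pickle

# A stand-in for a trained model -- in practice this would be a
# scikit-learn estimator or a TensorFlow graph (illustrative only).
class SentimentModel:
    def __init__(self, weights):
        self.weights = weights

    def predict(self, features):
        # Weighted sum of the features with a 0.5 decision threshold
        score = sum(w * x for w, x in zip(self.weights, features))
        return "positive" if score > 0.5 else "negative"

# --- Training environment: fit the model, then serialize the artifact ---
model = SentimentModel(weights=[0.4, 0.7])
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# --- Serving environment: load the artifact and answer requests ---
with open("model.pkl", "rb") as f:
    served_model = pickle.load(f)

print(served_model.predict([1.0, 0.5]))  # -> positive
```

Everything after the serialization step — handling thousands of concurrent requests against `model.pkl`, keeping latency low, swapping in retrained artifacts — is exactly the deployment work the rest of this piece is about.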

There’s a whole new set of challenges to worry about, a new set of skills to master, and different metrics to measure your success by.

Deployment is very different from model creation, and very hard for both small and large companies

Like any distributed application, deploying Machine Learning models is extremely difficult, and a totally different task from creating them in the first place. This is true on multiple dimensions:

Who: model building is done by data scientists and machine learning researchers, while deployment is done by software engineers, machine learning engineers, and data engineers.

Metrics: the goal of model building is to create something that predicts accurately, while the goal of deployment is to predict quickly and reliably.

Where: model building is typically done on a few virtual servers by a few people, while deployed models need to be able to scale up or down and handle thousands or millions of API requests.

These are just a few ways in which deployment is a different game. It’s also really hard to do — it involves juggling different skillsets, priorities, and capabilities. What if your most accurate models take too long to run? How do you update your models with new data? How do you optimize for speed by distributing across geographies?
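Take the model-updating question above: one hedged sketch of how a serving team might handle it is a versioned model registry, where retrained models are registered alongside old ones and traffic is switched atomically, so a bad release can be rolled back. The class and method names here are illustrative, not any particular platform’s API.

```python
# A toy model registry: an illustrative sketch of managing model
# updates and rollbacks without downtime (names are assumptions).
class ModelRegistry:
    def __init__(self):
        self._versions = {}   # version string -> model object
        self._live = None     # version currently serving traffic

    def register(self, version, model):
        self._versions[version] = model

    def promote(self, version):
        # Atomically switch serving traffic to a registered version
        if version not in self._versions:
            raise KeyError(f"unknown model version: {version}")
        self._live = version

    def predict(self, features):
        return self._versions[self._live](features)

registry = ModelRegistry()
registry.register("v1", lambda x: sum(x))           # first model
registry.register("v2", lambda x: sum(x) / len(x))  # retrained model
registry.promote("v1")
print(registry.predict([2, 4]))  # -> 6
registry.promote("v2")           # roll forward; promote("v1") rolls back
print(registry.predict([2, 4]))  # -> 3.0
```

In a real deployment the registry also tracks metadata (training data snapshot, hyperparameters, latency benchmarks), which is part of why this is an engineering discipline of its own.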

This problem extends across company sizes. For early stage startups looking to develop and run a product around Machine Learning models, deployment is a mess. It’s enough of a challenge to hire the proper software and data engineering talent as it is; it’s even harder when you’re trying to get a product off the ground, and your survival depends on a new group of people getting your models to run. The skillset that data scientists have and that got you here (accurate models) won’t get you there (deployed at scale).

This issue doesn’t get any easier as your company grows — in fact, in some ways it’s actually most pronounced in the enterprise. Data science teams develop impactful models and products, but they need to get them working and scalable; that means turning to other engineering teams who don’t necessarily have the right background. And yet, data scientists need to rely on them to properly port models, tune hyperparameters, and decide on batch size. By the time their fellow team has navigated through all of the challenges of Machine Learning deployment, it might be four months later or more, and the models don’t look or act anything like what the data science team built in the first place.

In short, it’s a real headache that many early stage companies can’t solve, and many enterprises can’t handle. One of the common solutions to the deployment problem is to use a horizontal platform, but these aren’t a good fit for most companies. Essentially, you keep your own data but use an API to build quick and dirty models that reside on a vendor’s servers; the vendor scales out for you and worries about how to make things work. Some examples of platforms that fit this mold are BigML and Seldon.

Unfortunately, horizontal deployment platforms aren’t always useful because of how they’re built: if you’re a company building a product with heavy Machine Learning involved, you can’t have your models residing with a third party. You want to create your own sophisticated algorithms, whether they’re neural nets on TensorFlow or anything else.

The problem is that there’s no product that just takes care of the last mile — something that allows you to develop your models however you want, and then just takes care of the rest. Thankfully, that’s changing.

Algorithmia is solving the last mile problem by offering deployment as a service

Algorithmia is releasing a new product that solves this issue, but still leaves modeling and data where they should be: in the hands of your data scientists. It’s called the Enterprise AI Layer, and it essentially automates DevOps for Machine Learning deployment while letting you worry about what matters — creating great models and products.

The Enterprise AI Layer covers all the bases that you’d expect in a scalable deployment solution. It’s cloud agnostic, scales with what you need, allows you to choose between CPUs and GPUs, and is extremely low latency. Algorithmia’s platform was also designed by and for DevOps — that means extensively detailed dashboards, and tracking of all the right metrics to make sure your deployments live up to your customer demand.

“As someone that has spent years designing and deploying Machine Learning systems, I’m impressed by Algorithmia’s serverless microservice architecture — it’s a great solution for organizations that want to deploy AI at any scale”

Anna Patterson, VP of Engineering, Artificial Intelligence at Google*

But aside from the technical specs, Algorithmia’s AI Layer is important because of how it changes the way organizations can look at Machine Learning. Right now, Machine Learning is like any application — you need to deal with all of the infrastructure before sending new data and predicting. Just like an app makes API calls to a service like Yelp, your application makes API calls to your models. It’s a type of application, which means that your team needs application deployment expertise.
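What “your application makes API calls to your models” looks like in practice is roughly the sketch below: features go out as JSON to a hosted endpoint, and a prediction comes back. The endpoint URL and the JSON schema are assumptions for illustration — every deployment defines its own contract — and the `transport` parameter is injectable so the function can be exercised without a live server.

```python
import json
import urllib.request

def call_model(endpoint, features, transport=None):
    """Send features to a hosted model over HTTP and return its prediction.

    The endpoint URL and JSON schema are illustrative assumptions, not a
    real service's contract. `transport` lets callers swap in a stub in
    place of a live network request.
    """
    body = json.dumps({"features": features}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    send = transport or (lambda req: urllib.request.urlopen(req).read())
    response = send(request)
    return json.loads(response)["prediction"]

# Exercise the client against a stub standing in for the hosted model
fake_service = lambda req: b'{"prediction": "cat"}'
print(call_model("https://models.example.com/classify", [0.1, 0.9],
                 transport=fake_service))  # -> cat
```

The application-side code is simple precisely because all the hard parts — scaling, latency, versioning — live behind that endpoint, which is the infrastructure burden the AI Layer aims to take off your team.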

That’s different now, because your team can focus exclusively on creating great models and not on the infrastructure that makes them work. This follows the mold of what serverless querying has done for data analysis with Google’s BigQuery and Amazon’s Athena: it allowed organizations to focus on analyzing their data instead of the technical complexities of storing and scaling it. Data analysis, where money is made, was abstracted from data storage. Now modeling can be abstracted from deployment.

This is awesome: it means more ideas can be turned into products, and more products can get past the monotony of giant engineering teams and corporate backlogs. It means that as a data scientist, you can do what you really want to: focus on building great ideas and models, and not on how to deal with engineering their back-ends. And I think that’s a pretty big deal.

*Anna and Google are investors in Algorithmia.
