Deploying Machine Learning Models on Heroku

Published: Nov 30, 2023, 15:01

Machine learning is a continuous process that involves data extraction, cleaning, feature selection, model building, validation, and finally deployment, where the model is tested on unseen data.

While the initial data engineering and model-building phases are fairly tedious and require a lot of time spent with the data, model deployment may seem simple, but it is a critical process that depends on the use case you want to target. You can serve the model to mobile users, websites, smart devices, or any other IoT device.

One can choose to integrate the model into the main application, include it in the software development life cycle (SDLC), or host it in the cloud. There are various strategies to deploy and run a model on a cloud platform, which is the better option in most cases because of the availability of tools such as Google Cloud Platform, Azure, Amazon Web Services, and Heroku.

While you can opt to expose the model through a Pub/Sub mechanism, an API (Application Programming Interface) or REST wrapper is more commonly used to deploy models in production. As model complexity increases, dedicated teams of machine learning engineers are often assigned to handle deployment. With this introduction out of the way, let's look at how to deploy a machine learning model as an API on the Heroku platform.

What is Heroku?

Heroku is a Platform as a Service (PaaS) that allows developers to host their code without managing the underlying servers. In other words, you can write a script for a specific purpose and let Heroku take care of running it. The Heroku platform is itself hosted on AWS (Amazon Web Services), which is an Infrastructure as a Service (IaaS) offering.

Heroku's free tier is limited to roughly 500 hours of uptime. Apps are hosted as dynos, which go into sleep mode after 30 minutes of inactivity, ensuring that your app does not consume your free hours while idle. The platform supports Ruby, Java, PHP, Python, Node, Go, and Scala. Many data science beginners use this platform to get first-hand experience running and deploying a model in the cloud.

Preparing the Model

Now that you are aware of the platform, let's prepare a model for it. When a machine learning model is trained, its learned parameters live only in memory. The model needs to be exported to a separate file so that we can load it later, pass in unseen data, and get predictions.

Several export formats are in common use: pickle and joblib, which serialize a Python object into a byte stream; ONNX; PMML; and MOJO, an H2O.ai export format that also allows the model to be embedded in Java applications. For simplicity, suppose we want to export the model with pickle; you can do that as follows:

import pickle

# Serialize the trained model object (model_name) to disk
with open("model.pkl", "wb") as file:
    pickle.dump(model_name, file)

The model is now stored in a separate file and ready to be integrated into an API.
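joblib, mentioned above, can be used in almost the same way and is often preferred for scikit-learn models containing large numpy arrays. A minimal sketch, assuming the same trained model object model_name:

import joblib

# Serialize the trained model; joblib handles large numpy arrays efficiently
joblib.dump(model_name, "model.joblib")

# Later, load it back to make predictions on unseen data
model = joblib.load("model.joblib")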

The Server Logic

To provide access to this model for predictions, we need server code that can handle all client-side requests. Python has several web development frameworks, and a famous one is Flask.

It is a minimalistic framework that lets you set up a server in a few lines of code. Because the package is minimal, functionality such as authentication and RESTful behavior is not supported out of the box, but it can be added through extensions.

Another option is the newer framework FastAPI. It is faster, scalable, well documented, and ships with many integrations. For now, let's continue with Flask and set up a simple prediction route.

from flask import Flask, request, jsonify
import pickle

app = Flask(__name__)

# Load the serialized model once at startup
with open("model.pkl", "rb") as file:
    model = pickle.load(file)

@app.route("/predict", methods=["GET", "POST"])
def pred():
    # Implement the logic to get parameters either through the query
    # string or the request payload; here we assume a JSON body such as
    # {"features": [5.1, 3.5, 1.4, 0.2]}
    features = request.json["features"]
    prediction = model.predict([features])
    return jsonify({"result": prediction.tolist()})

This is rough code meant to show how to structure the server logic; there are various strategies for a more robust implementation.
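For comparison, the same prediction route in FastAPI (mentioned above) could look roughly like the sketch below. The payload shape, a JSON body with a features list, is an assumption for illustration; note that FastAPI is served by an ASGI server such as uvicorn rather than by gunicorn alone, so the Procfile shown later would change accordingly.

from fastapi import FastAPI
from pydantic import BaseModel
from typing import List
import pickle

app = FastAPI()

# Load the serialized model once at startup
with open("model.pkl", "rb") as file:
    model = pickle.load(file)

# Expected request body, e.g. {"features": [5.1, 3.5, 1.4, 0.2]}
class Payload(BaseModel):
    features: List[float]

@app.post("/predict")
def pred(payload: Payload):
    prediction = model.predict([payload.features])
    return {"result": prediction.tolist()}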

Check Out: Guide to Deploying ML Models Using Streamlit

Setting up Deployment Files

Heroku requires a list of all the dependencies our application needs. This is called the requirements file: a text file listing all the external libraries the application uses. In this example, it would contain (note that the pip package for sklearn is named scikit-learn):

flask
scikit-learn
numpy
pandas
gunicorn

The last library, gunicorn, provides the WSGI server implementation that sits between the client and the application code, handling the HTTP traffic. Heroku also requires another file, known as the Procfile, which specifies the entry point of the app. Assuming the server logic is saved in a file named main.py, the command to put in the Procfile is:

web: gunicorn main:app

Here, "web" is the type of dyno we are deploying, and gunicorn acts as the mediator, passing requests to the module main and looking for the object app inside it. That app object handles all the routes.
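Before deploying, it is worth verifying the setup locally. A quick check, assuming you start the server with gunicorn main:app (which binds to port 8000 by default) and that the /predict route expects a JSON body with a features list as in the code above:

import requests

# Hypothetical payload; adjust "features" to whatever your model expects
response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"features": [5.1, 3.5, 1.4, 0.2]},
)
print(response.json())  # e.g. {"result": [0]}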

Final Deployment

All the preparations are done, and now it's time to run the app in the cloud. Create an account on Heroku if you don't have one, click on Create New App, and choose a region. After that, connect your GitHub account and choose the repo that contains these files: the server code, model.pkl, requirements.txt, and the Procfile.

After this, simply hit Deploy Branch! If the build succeeds, visit the generated link and your app should be live. You can now make requests to the appname.herokuapp.com/predict route, and it should return predictions.
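If you prefer the command line to the dashboard, the Heroku CLI supports an equivalent flow. A rough sketch, assuming the project is a git repository and app-name is a placeholder:

heroku login
heroku create app-name   # creates the app and adds a "heroku" git remote
git push heroku main     # pushes the code and triggers the build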

Conclusion

This was an introduction to what Heroku is, why it is useful, and how to deploy a model on it with the help of Flask. Many hosting platforms offer more advanced features such as data pipelines and streaming, but Heroku's free tier still makes it a good choice for beginners who just want a taste of deployment.



