
How to Deploy Machine Learning Models Using FastAPI

One of the biggest challenges for data scientists is deploying their models. What use is a perfect model if other services or users can’t consume it?

But if you know some Python, you can quickly wrap your models in an API endpoint. That’s the focus of this post.

If you know some Python web technologies, you’d know Django and Flask. I’m personally a fan of these frameworks. Yet, for this tutorial, I’m going to use FastAPI. And I’d recommend FastAPI for production use as well.

We’ll first discuss why FastAPI, then deploy a Scikit-learn model with it. Then, we’ll take a look at how to wrap an external ML service with an API endpoint. Finally, we’ll discuss some ways you can secure your endpoints.

Grab your aromatic coffee (or tea) and get ready…!

Why is FastAPI the best way to deploy ML models in production?

FastAPI is a popular choice for deploying machine learning models because it is fast, efficient, and easy to use. Here are a few reasons why FastAPI is often preferred over other web frameworks like Flask and Django for deploying machine learning models:

  1. FastAPI is built on top of Starlette, which is a lightweight ASGI framework. This means that it can handle high levels of concurrency and is well-suited for real-time predictions.
  2. FastAPI uses type hints to validate API functions’ input and output, making it easy to ensure that your API is well-documented and easy to use.
  3. FastAPI has built-in support for automatic API documentation using OpenAPI and Swagger, which makes it easy for developers to understand how to use your API.
  4. FastAPI has excellent support for asynchronous programming, which is essential when working with machine learning models that can take a long time to process requests.

FastAPI is a powerful and easy-to-use web framework well-suited for building and deploying machine learning models in production. While Flask and Django are also popular web frameworks, they may not be as well-suited for this specific use case due to their focus on different features and goals.
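To make points 2 and 3 concrete, here’s a minimal sketch of a type-hinted endpoint. The Features model and its field are made up for illustration; FastAPI validates the request body against the declared types and publishes interactive Swagger docs at /docs without any extra work:

from typing import List

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# hypothetical request schema, purely for illustration
class Features(BaseModel):
    values: List[float]

@app.post("/echo")
async def echo(features: Features):
    # FastAPI has already validated the body against Features;
    # a malformed payload automatically gets a 422 response
    return {"count": len(features.values)}
Python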

Deploy a Scikit-learn model using FastAPI

To deploy a scikit-learn model with FastAPI, you can follow these steps:

1. You’ll need to install FastAPI and the other necessary libraries, such as Scikit-learn, NumPy, and the Uvicorn server we’ll later use to run the app. You can do this by running the following command:

pip install fastapi uvicorn scikit-learn numpy
Bash

2. Next, you’ll need to train your scikit-learn model and save it to a file. You can do this using the pickle library in Python.

import pickle

from sklearn.tree import DecisionTreeClassifier

# train your model
model = DecisionTreeClassifier()
model.fit(X, y)

# save the model to a file
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
Python

In the above code, X and y are the training dataset and its labels. Both of them are NumPy arrays. Reading and preparing the dataset has been omitted here for simplicity.
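If you want the snippet to be runnable end to end, here’s one way to produce X and y. The choice of Scikit-learn’s built-in Iris dataset is just for illustration:

from sklearn.datasets import load_iris

# load a small built-in dataset so the example is self-contained
X, y = load_iris(return_X_y=True)
Python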

3. Once your trained model is saved to a file, you can create a FastAPI application and define an endpoint for your API.

from typing import List

from fastapi import FastAPI

app = FastAPI()

@app.post("/predict")
def predict(data: List[float]):
    # code to make prediction goes here
    return prediction
Python

4. To make a prediction using your scikit-learn model, you’ll need to load the model from the file and use it to predict the input data. Let’s name this helper make_prediction so it doesn’t clash with the endpoint function:

import pickle
from typing import List

import numpy as np

def make_prediction(data: List[float]):
    with open("model.pkl", "rb") as f:
        model = pickle.load(f)
    data = np.array(data).reshape(1, -1)
    prediction = model.predict(data)
    # convert the NumPy array so it's JSON-serializable
    return prediction.tolist()
Python
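Loading the pickle file on every request works, but it adds avoidable disk I/O on a hot path. A common refinement, sketched below with the same file name as above, is to load the model once when the module is imported:

import pickle
from typing import List

import numpy as np

# load the model once at import time instead of per request
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

def make_prediction(data: List[float]):
    prediction = model.predict(np.array(data).reshape(1, -1))
    return prediction.tolist()
Python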

5. You can then return the prediction in the API response. You can customize the format of the response to suit your needs.

from typing import List

from fastapi import FastAPI

app = FastAPI()

@app.post("/predict")
def predict(data: List[float]):
    prediction = make_prediction(data)
    return {"prediction": prediction}
Python

6. Finally, you can start the FastAPI application with Uvicorn and test your API using a tool like Postman or cURL. The --reload flag restarts the server on code changes, which is handy during development:

uvicorn main:app --reload
Bash

How to test your FastAPI deployment?

To test that your FastAPI API is working as expected, you can use a tool like Postman or cURL to send requests to your API and verify that you receive the expected response.

For example, if you are using Postman, you can send a POST request to the endpoint of your API (e.g., http://localhost:8000/predict) and include the necessary request body or query parameters. Then, you can verify that the response you receive is what you were expecting. FastAPI also serves interactive Swagger documentation at http://localhost:8000/docs, where you can try the endpoint directly from the browser.

If you are using cURL, note that because the endpoint declares a single List[float] body parameter, FastAPI expects the JSON array itself as the request body. You can use the following command to send a POST request to your API:

curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '[1.0, 2.0, 3.0]'
Bash

You can also write unit tests for your API to automate the testing process. This can help ensure your API keeps working correctly after you make changes or updates to your code.
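Here’s a minimal sketch of such a test using FastAPI’s TestClient. It assumes the app above lives in main.py and that the model accepts three features, matching the cURL example:

from fastapi.testclient import TestClient

from main import app

client = TestClient(app)

def test_predict():
    # the endpoint declares a single List[float] body parameter,
    # so the request body is the JSON array itself
    response = client.post("/predict", json=[1.0, 2.0, 3.0])
    assert response.status_code == 200
    assert "prediction" in response.json()
Python

You can run this with pytest.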

How to wrap external models with a REST API endpoint?

Often, you’d be using third-party models. A good example is AWS Comprehend. You can use some of its pre-trained models for tasks such as topic modeling.

Yet, for various reasons, you might want to restrict direct access to these services. In such situations, the best way is to wrap them in a REST API endpoint of your own.

FastAPI comes in handy here.

To create a topic modeling API using AWS Comprehend and FastAPI, you can follow these steps:

1. First, you’ll need to sign up for an AWS account and set up the AWS CLI on your local machine.

2. Next, you’ll need to install the necessary Python libraries for interacting with AWS Comprehend and FastAPI. You can do this by running the following command:

pip install boto3 fastapi uvicorn
Bash

3. Once you have the necessary libraries installed, you can create a FastAPI application and define the endpoint for your topic modeling API. You can do this by writing the following code:

from fastapi import FastAPI

app = FastAPI()

@app.post("/topics")
def get_topics(text: str):
    # code to generate topics goes here
    return topics
Python

4. To analyze the text with AWS Comprehend, you’ll use a boto3 Comprehend client built with your AWS access keys and region. One thing to note: Comprehend’s actual topic modeling runs as an asynchronous batch job (start_topics_detection_job) rather than a synchronous call, so for a simple request/response endpoint we’ll use the synchronous detect_key_phrases() call as a stand-in here.

import boto3

comprehend = boto3.client(
    "comprehend",
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    region_name=REGION
)

def generate_topics(text: str):
    # synchronous stand-in for topic modeling: return key phrases
    response = comprehend.detect_key_phrases(
        Text=text,
        LanguageCode="en"
    )
    return response["KeyPhrases"]
Python
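If you need Comprehend’s real topic modeling, it runs as an asynchronous job over documents stored in S3. Here’s a hedged sketch; the bucket paths and IAM role ARN are placeholders you’d replace with your own:

# start an asynchronous topic detection job over documents in S3
response = comprehend.start_topics_detection_job(
    InputDataConfig={
        "S3Uri": "s3://my-bucket/comprehend-input/",  # placeholder bucket
        "InputFormat": "ONE_DOC_PER_FILE",
    },
    OutputDataConfig={"S3Uri": "s3://my-bucket/comprehend-output/"},
    DataAccessRoleArn="arn:aws:iam::123456789012:role/ComprehendS3Access",  # placeholder role
    NumberOfTopics=10,
)

# poll the job; results are written to the output S3 location
job = comprehend.describe_topics_detection_job(JobId=response["JobId"])
print(job["TopicsDetectionJobProperties"]["JobStatus"])
Python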

5. You can then return the topics in the API response. You can customize the format of the response to suit your needs.

from fastapi import FastAPI

app = FastAPI()

@app.post("/topics")
def get_topics(text: str):
    topics = generate_topics(text)
    return {"topics": topics}
Python

6. Finally, you can start the FastAPI application and test your API using a tool like Postman or cURL.

uvicorn main:app --reload
Bash
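To test the topics endpoint, here’s a quick sketch using the requests library (install it with pip install requests). Because text is declared as a scalar parameter, FastAPI reads it from the query string:

import requests

response = requests.post(
    "http://localhost:8000/topics",
    params={"text": "FastAPI makes deploying models straightforward."},
)
print(response.json())
Python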

How to authenticate and secure FastAPI endpoints

There are many ways to authenticate FastAPI endpoints, and the best approach will depend on your specific use case and requirements. Here are a few options that you might consider:

  1. Basic Authentication: You can use basic authentication to protect your API by requiring clients to provide a username and password with each request. FastAPI’s fastapi.security module provides an HTTPBasic dependency that you can use to implement this type of authentication (see the sketch after this list).
  2. OAuth2: OAuth2 is a widely used authorization framework that allows users to grant third-party applications access to their resources without sharing their passwords. FastAPI provides a built-in OAuth2PasswordBearer dependency that you can use to implement OAuth2 authentication.
  3. JWT: JSON Web Tokens (JWTs) are a popular way to authenticate API requests. With JWT authentication, you issue a JWT to a client after they provide their credentials and then require that the client include the JWT in the Authorization header of each subsequent request. FastAPI doesn’t ship a JWT implementation of its own, but it combines cleanly with a library such as python-jose or PyJWT on top of OAuth2PasswordBearer.
  4. Custom authentication: If you have specific authentication requirements not met by the built-in dependencies, you can create your own custom authentication solution. This might involve creating your own middleware or using a third-party library like Passlib to handle password hashing.
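Here’s a minimal sketch of the basic-authentication option. The hard-coded username and password are placeholders for illustration; in practice you’d check credentials against a user store:

import secrets

from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import HTTPBasic, HTTPBasicCredentials

app = FastAPI()
security = HTTPBasic()

def verify_user(credentials: HTTPBasicCredentials = Depends(security)):
    # compare_digest avoids leaking information through timing differences
    user_ok = secrets.compare_digest(credentials.username, "admin")  # placeholder
    pass_ok = secrets.compare_digest(credentials.password, "secret")  # placeholder
    if not (user_ok and pass_ok):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid credentials",
            headers={"WWW-Authenticate": "Basic"},
        )
    return credentials.username

@app.get("/protected")
def protected(username: str = Depends(verify_user)):
    return {"message": f"Hello, {username}"}
Python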

Final Thought

FastAPI is a beautiful Python tool for creating REST APIs fast. While minimal for quick use cases, it’s feature-rich and comparable to other mature libraries and frameworks such as Flask and Django.

This post discusses ways to wrap your ML models, including external ones, in an API endpoint. REST APIs are the most popular way to deploy ML models in production.


Thanks for the read, friend. It seems you and I have lots of common interests. Say Hi to me on LinkedIn, Twitter, and Medium.

