Distributed Inference with PyTorch and Celery in Python
Lean framework for distributed applications
Deep learning models are becoming larger and more complex, requiring powerful GPUs to train and run. However, not everyone has access to these resources, and even those who do may wait a long time for a single model to finish. That's where distributed computing comes in handy. In this article, we'll explore how to use Celery in Python to perform distributed inference with PyTorch.
What is Celery?
Celery is an open-source distributed task queue that allows developers to run asynchronous and distributed tasks. It works by breaking a large workload into smaller, independent units of work that can be executed in parallel across multiple machines or processes.
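As a minimal sketch of that idea (the app name and broker URL here are placeholders, assuming a RabbitMQ broker running locally), a Celery task is just a decorated Python function whose calls are queued rather than run in-process:

from celery import Celery

# A hypothetical app: assumes a RabbitMQ broker running on localhost
app = Celery('demo', broker='amqp://guest@localhost//')

@app.task
def add(x, y):
    # Each invocation becomes an independent unit of work a worker can run
    return x + y

# add.delay(2, 3) queues the work for a worker instead of running it here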
What is PyTorch?
PyTorch is an open-source machine learning library based on the Torch library. It provides a set of tools and libraries that enable developers to create and train deep learning models.
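To give a flavor of the workflow (a toy model, not the one used later in this article), defining a network and running a forward pass looks like this:

import torch
import torch.nn as nn

# A toy two-layer network, just to illustrate the basic PyTorch workflow
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

x = torch.randn(1, 4)        # a random input batch of size 1
with torch.no_grad():        # no gradients needed for inference
    scores = model(x)        # forward pass produces raw class scores
print(scores.argmax(dim=1))  # index of the highest-scoring class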
How to Use Celery with PyTorch for Distributed Inference
To use Celery with PyTorch, we'll need to define a Celery task that loads the PyTorch model and performs the inference. Let's take a look at how we can do this.
Define the Celery task
The first step is to define a Celery task that loads the PyTorch model and performs the inference. Here's an example task that takes an image file path as input and returns the predicted class:
import torch
import torchvision.transforms as transforms
from PIL import Image
from celery import Celery

# The broker queues tasks; the result backend stores return values so
# callers can retrieve them later with .get()
app = Celery('tasks', broker='amqp://guest@localhost//', backend='rpc://')

class PyTorchTask(app.Task):
    name = 'tasks.pytorch_inference'

    def run(self, img_path):
        # Load the PyTorch model (on CPU, so the worker doesn't need a GPU)
        model = torch.load('model.pth', map_location='cpu')
        model.eval()

        # Load the image, converting to RGB in case it's grayscale
        img = Image.open(img_path).convert('RGB')

        # Preprocess the image into the input format the model expects
        preprocess = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(
                mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]
            )
        ])
        img_tensor = preprocess(img)

        # Perform the inference
        with torch.no_grad():
            output = model(img_tensor.unsqueeze(0))
            _, predicted = torch.max(output, 1)

        # Return the index of the predicted class
        return predicted.item()

# Class-based tasks must be registered explicitly (Celery 4+);
# register_task returns the instance we call .delay() on
pytorch_task = app.register_task(PyTorchTask())
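One caveat worth noting: as written, the task reloads model.pth on every call. A common refinement (a sketch, not part of the original example) is to cache the model on the task class, so each worker process loads it only once:

class PyTorchTask(app.Task):
    name = 'tasks.pytorch_inference'
    _model = None

    @property
    def model(self):
        # Lazily load the model the first time a worker runs this task,
        # then reuse it for every subsequent call in that process
        if self._model is None:
            self._model = torch.load('model.pth', map_location='cpu')
            self._model.eval()
        return self._model

With this in place, run() would use self.model instead of loading the file itself.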
Start the Celery worker
The next step is to start a Celery worker that will execute the tasks. You can do this by running the following command in your terminal:
celery -A tasks worker --loglevel=info
This command starts a Celery worker that listens for tasks and executes them.
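To actually distribute the inference, start a worker on each machine, all pointing at the same broker. The flags below are standard Celery worker options; the hostname label is illustrative:

celery -A tasks worker --loglevel=info --concurrency=2 --hostname=worker1@%h

Each worker pulls tasks from the shared queue, so adding machines increases throughput without any change to the client code.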
Submit the task for execution
Finally, we can submit a task for execution by calling the task's .delay() method. Here's an example:
from tasks import pytorch_task

result = pytorch_task.delay('/path/to/image.jpg')
print(result.get())
This code will submit a task to the Celery worker to perform the inference on the image located at /path/to/image.jpg. The .get() method will block until the task is completed and return the predicted class.
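Because each call is an independent task, you can also fan out many images at once and let the workers process them in parallel. A short sketch (the paths are placeholders):

from tasks import pytorch_task

# Submit all images up front, then collect results as they finish
paths = ['/path/to/image1.jpg', '/path/to/image2.jpg', '/path/to/image3.jpg']
async_results = [pytorch_task.delay(p) for p in paths]
predictions = [r.get() for r in async_results]
print(predictions)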
Distributed computing is becoming increasingly popular, especially in deep learning, where models can be incredibly large and complex. Celery is a powerful tool that allows developers to easily run distributed tasks in Python. In this article, we explored how to use Celery with PyTorch to perform distributed inference. By breaking a large inference workload into smaller, independent tasks that run in parallel across multiple workers, you can scale well beyond what a single machine could handle.