Model Inference Service

This service runs Cellarium Cloud model inference on a given dataset. It is deployed as a Cloud Run service.

Requirements

  • Python 3.10

  • PyTorch 2.0.1

  • Database connection

  • .env file with the secret variables
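The version requirements above can be checked before building with a short script. This is a minimal sketch, not part of the repository: the helper name version_matches is hypothetical, and it assumes the interpreter is available as python3.

```shell
# Hypothetical helper: succeeds when the actual version ($1) equals the
# required version ($2) or extends it with a further component,
# e.g. 3.10.12 for 3.10, or 2.0.1+cu117 for 2.0.1.
version_matches() {
  case "$1" in
    "$2"|"$2".*|"$2"+*) return 0 ;;
    *) return 1 ;;
  esac
}

# Query the installed versions; fall back to "missing" if the lookup fails.
python_version=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || echo "missing")
torch_version=$(python3 -c 'import torch; print(torch.__version__)' 2>/dev/null || echo "missing")

# Warn (without aborting) when a requirement is not met.
version_matches "$python_version" "3.10" || echo "warning: Python 3.10 expected, found $python_version" >&2
version_matches "$torch_version" "2.0.1" || echo "warning: PyTorch 2.0.1 expected, found $torch_version" >&2
```

The script only prints warnings, so it can be run safely on a machine where PyTorch is not installed yet.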

Building Docker Image

To build the Docker image locally and push it to the registry, run the following commands:

IMAGE_NAME=docker-image.dev/example # Name of the docker image
docker build -t $IMAGE_NAME -f ./src/casp/services/deploy/Dockerfile.pytorch .
docker push $IMAGE_NAME

Deploying Docker Image via Cloud Run

To deploy the Docker image with Cloud Run, run the following commands (see the Cloud Run documentation for more information):

SERVICE_NAME=cellarium-cloud-model-inference # Name of the service
IMAGE_NAME=docker-image.dev/example # Name of the docker image
PROJECT_ID=example-project # GCP Project ID
NUM_CPUS=4 # Number of CPUs per deployed instance
MEMORY=16Gi # Memory per deployed instance; cannot be less than 256Mi or more than 4Gi per CPU core
REGION=us-central1 # Region where the service will be deployed
PLATFORM=managed # Target platform to run the service. Choices: managed, gke, kubernetes
PORT=8000 # Port the container listens on (must match the FastAPI port)
DB_CONNECTION=example-project:us-region-example:db-cluster-name # Cloud SQL connection name
TIMEOUT=1100 # Request timeout in seconds
MAX_INSTANCES=500 # Maximum number of instances to scale to
MIN_INSTANCES=0 # Minimum number of instances to keep running. If 0, the first request after scaling to zero incurs a "cold start"
CONCURRENCY=10 # Maximum number of requests that can be served at the same time per instance

gcloud run deploy $SERVICE_NAME \
--project=$PROJECT_ID \
--image=$IMAGE_NAME \
--memory=$MEMORY \
--cpu=$NUM_CPUS \
--region=$REGION \
--port=$PORT \
--add-cloudsql-instances=$DB_CONNECTION \
--timeout=$TIMEOUT \
--max-instances=$MAX_INSTANCES \
--min-instances=$MIN_INSTANCES \
--concurrency=$CONCURRENCY \
--command python --args "casp/services/model_inference/main.py" \
--allow-unauthenticated \
--platform=$PLATFORM
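
The MEMORY comment above describes a limit that depends on NUM_CPUS, so it can help to validate the pair before running the deploy command. The following is a minimal sketch under stated assumptions: it reads the comment as an absolute floor of 256Mi and a ceiling of 4Gi per CPU core, and the helper names to_mib and check_memory_per_cpu are hypothetical. Cloud Run's own documentation remains the authority on the actual limits.

```shell
# Hypothetical helper: convert a value such as 16Gi or 512Mi to MiB.
to_mib() {
  case "$1" in
    *Gi) echo $(( ${1%Gi} * 1024 )) ;;
    *Mi) echo "${1%Mi}" ;;
    *) echo "unsupported memory unit: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: check MEMORY ($1) against NUM_CPUS ($2).
# Assumed limits: at least 256Mi in total, at most 4Gi (4096Mi) per CPU core.
check_memory_per_cpu() {
  total_mib=$(to_mib "$1") || return 1
  max_mib=$(( 4096 * $2 ))
  if [ "$total_mib" -lt 256 ] || [ "$total_mib" -gt "$max_mib" ]; then
    echo "MEMORY=$1 is outside 256Mi..${max_mib}Mi for $2 CPU(s)" >&2
    return 1
  fi
  echo "ok"
}
```

With the values from the example, check_memory_per_cpu "$MEMORY" "$NUM_CPUS" accepts 16Gi with 4 CPUs, since that is exactly 4Gi per core, and rejects 32Gi with the same CPU count.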