Serverless
Create an image generation endpoint with Serverless
Deploy a Stable Diffusion endpoint and generate your first AI image using Serverless.
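Once an endpoint like this is deployed, you call it over HTTP. A minimal sketch, assuming a live endpoint: the `runsync` route and the top-level `input` key follow RunPod's Serverless API, but the endpoint ID, API key, and the fields inside `input` (here a single `prompt`) are placeholders that depend on your worker's handler.

```python
import json
import urllib.request

RUNPOD_API_KEY = "YOUR_API_KEY"    # placeholder: your RunPod API key
ENDPOINT_ID = "YOUR_ENDPOINT_ID"   # placeholder: your Serverless endpoint ID

def build_request(prompt: str) -> dict:
    # Job payload for the worker; field names depend on your handler code.
    return {"input": {"prompt": prompt}}

def generate(prompt: str) -> dict:
    # POST the job to the endpoint's synchronous run route and return the JSON result.
    req = urllib.request.Request(
        f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {RUNPOD_API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# generate("a watercolor fox in a forest")  # uncomment once the endpoint is live
```

For long-running jobs, the asynchronous `run` route plus polling `status` is the usual alternative to `runsync`.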
Integrate Serverless with a web application
Deploy an image generation endpoint and integrate it into a web application.
Deploy a cached model
Learn how to create a custom Serverless endpoint that uses model caching to serve a large language model with reduced cost and cold start times.
Deploy a chatbot with Gemma 3 and send requests using the OpenAI API
Deploy a Serverless endpoint with Google’s Gemma 3 model using vLLM and the OpenAI API to build an interactive chatbot.
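Because the vLLM worker exposes an OpenAI-compatible route, you can talk to it with plain HTTP. A sketch under these assumptions: the `/openai/v1` base path is the OpenAI-compatible route on a RunPod vLLM endpoint, the endpoint ID and API key are placeholders, and the model name must match whatever the worker actually serves.

```python
import json
import urllib.request

API_KEY = "YOUR_RUNPOD_API_KEY"    # placeholder
ENDPOINT_ID = "YOUR_ENDPOINT_ID"   # placeholder
BASE_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/openai/v1"

def build_chat_payload(messages: list) -> dict:
    # OpenAI-style chat body; the model name must match the model the worker serves.
    return {"model": "google/gemma-3-1b-it", "messages": messages}

def chat(messages: list) -> str:
    # POST to the OpenAI-compatible chat completions route and return the reply text.
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_payload(messages)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# chat([{"role": "user", "content": "Hello!"}])  # once the endpoint is deployed
```

The official `openai` Python client works the same way if you point its `base_url` at the endpoint's `/openai/v1` path.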
Generate images with ComfyUI on Serverless
Deploy ComfyUI on Serverless and generate images using JSON workflows.
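Submitting a JSON workflow to ComfyUI can be sketched as follows, assuming you have exported a workflow in ComfyUI's API format and the server is reachable on its default port (the URL is a placeholder for wherever your instance runs).

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # placeholder: ComfyUI's default port

def wrap_workflow(workflow: dict) -> dict:
    # ComfyUI expects the API-format workflow under a top-level "prompt" key.
    return {"prompt": workflow}

def queue_workflow(workflow: dict) -> dict:
    # Submit the workflow to /prompt; the response includes a prompt_id
    # that can be used to poll /history for the finished images.
    req = urllib.request.Request(
        f"{COMFY_URL}/prompt",
        data=json.dumps(wrap_workflow(workflow)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```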
Pods
Run LLM inference on Pods with JupyterLab
Launch JupyterLab on a GPU Pod and run LLM inference using the Python transformers library.
Pods + Ollama
Deploy Ollama on a GPU Pod and run LLM inference using the Ollama API.
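A minimal sketch of calling the Ollama API once it is running on the Pod: `/api/generate` and the `model`/`prompt`/`stream` fields are Ollama's standard generate route, while the URL and model name are placeholders for your setup.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # placeholder: Ollama's default port

def build_payload(model: str, prompt: str) -> dict:
    # stream=False asks Ollama for a single JSON response instead of a stream.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    # POST to Ollama's generate route and return the completion text.
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# generate("llama3", "Why is the sky blue?")  # once Ollama is running on the Pod
```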
Build Docker images on Pods using Bazel
Set up Bazel on a GPU Pod and use it to build Docker images.
Generate images with ComfyUI on Pods
Deploy ComfyUI on a GPU Pod and generate images using the ComfyUI web interface.