What you’ll learn
In this tutorial, you'll learn how to:

- Deploy a Pod with the PyTorch template.
- Install and configure Ollama for external access.
- Run AI models and interact via the HTTP API.
Requirements
- A Runpod account with credits.
Step 1: Deploy a Pod
- Navigate to Pods and select Deploy.
- Choose a GPU (for example, A40).
- Select the latest PyTorch template.
- Under Pod Template, select Edit:
- Under Expose HTTP Ports (Max 10), add port `11434`.
- Under Environment Variables, add an environment variable with key `OLLAMA_HOST` and value `0.0.0.0`. (By default, Ollama listens only on localhost; this setting lets it accept connections through the exposed port.)
- Click Set Overrides and then Deploy On-Demand.
Step 2: Install Ollama
- Once the Pod is running, click the Pod to open the connection options panel and select Enable Web Terminal and then Open Web Terminal.
- Update packages and install dependencies:
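A minimal sketch, assuming the PyTorch template's Ubuntu base image, where `curl` may not be preinstalled (commands run as root in the web terminal):

```sh
# Refresh the package index and install curl, which the
# Ollama installer in the next step requires.
apt-get update && apt-get install -y curl
```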
- Install Ollama and start the server in the background:
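This uses Ollama's published install script for Linux; the server then picks up the `OLLAMA_HOST` variable you set in Step 1:

```sh
# Download and run the official Ollama install script.
curl -fsSL https://ollama.com/install.sh | sh

# Start the Ollama server in the background. With OLLAMA_HOST=0.0.0.0
# it listens on all interfaces, on port 11434 by default.
ollama serve &
```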
Step 3: Run a model
Download and run a model using the `ollama run` command:
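For example, to download and start llama2:

```sh
# Pulls the model on first use, then opens an interactive prompt.
ollama run llama2
```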
Replace `llama2` with any model from the Ollama library. You can now interact with the model directly from the terminal.
Step 4: Make HTTP API requests
With Ollama running, you can make HTTP requests to your Pod from external clients. Try running the following commands, replacing `OLLAMA_POD_ID` with your actual Pod ID:
List available models:
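A sketch assuming Runpod's proxy URL format for exposed HTTP ports (`https://[POD_ID]-[PORT].proxy.runpod.net`):

```sh
# Queries Ollama's /api/tags endpoint, which lists downloaded models.
curl https://OLLAMA_POD_ID-11434.proxy.runpod.net/api/tags
```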
Generate a response. Ollama streams its output by default; to receive the response as a single JSON object instead, add the `stream: false` parameter to the request body:
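A sketch using the same proxy URL and the llama2 model from Step 3:

```sh
# Sends a prompt to Ollama's /api/generate endpoint. "stream": false
# returns the completion as one JSON object instead of a stream of
# partial responses.
curl https://OLLAMA_POD_ID-11434.proxy.runpod.net/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```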
Next steps
- Learn about exposing ports on Pods.
- Connect VSCode to Runpod for remote development.
- Explore more models in the Ollama library.