Cloud API bills add up fast. You send a few thousand requests through GPT-4 or Claude, wire up some automation workflows, and suddenly you're looking at a $200/month invoice for something that runs twice a day. Worse, you're sending your business data (customer names, internal docs, proprietary processes) to someone else's servers every single time.

A homelab fixes both problems. For the price of two months of cloud API bills, you can own hardware that runs local models, hosts your automation stack, and gives you a private AI infrastructure that never phones home. The models have caught up. A 7B parameter model running on a $300 mini PC can handle classification, summarization, extraction, and basic reasoning at speeds that are fine for automation. You don't need GPT-4 for every task.

This guide walks through building an AI agent homelab from bare hardware to running autonomous workflows. Real hardware picks with real prices, a tested software stack, network architecture that works from anywhere, and actual configs you can paste into your terminal.

Who this is for: Developers, sysadmins, and tinkerers who want to run AI agents on hardware they own. You should be comfortable with Linux, Docker, and SSH. No ML background needed. We're using pre-trained models, not training them.

01 Hardware: What to Buy

Your hardware choice depends on one question: do you want to run local LLMs, or just host the orchestration layer and call cloud APIs? Both are valid. The orchestration-only path costs a third as much.

Option A: Orchestration-only (~$150-300)

If you're calling OpenAI, Anthropic, or Groq APIs and just need somewhere to run n8n, OpenClaw, databases, and scheduled jobs, almost anything works. A Raspberry Pi 5 (8GB) handles it. An Intel N100 mini PC handles it better. You don't need a GPU, and you don't need much RAM.

The N100 sips 10 watts at idle. Your electric bill won't notice it exists.
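Back-of-envelope, with an assumed electricity rate of $0.15/kWh (adjust for your utility):

```shell
# Annual electricity cost for a 10W-idle box (rate is an assumed example):
watts=10
rate_cents=15                                    # cents per kWh
kwh_year=$(( watts * 24 * 365 / 1000 ))          # ~87 kWh/year
echo "~\$$(( kwh_year * rate_cents / 100 )) per year"   # ~$13/year
```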

Option B: Local LLM capable (~$400-800)

Running models locally requires more RAM than anything else. The model gets loaded into memory, and if it doesn't fit, you're swapping to disk and inference takes minutes instead of seconds. Rule of thumb: you need roughly 1GB of RAM per billion parameters at Q4 quantization.
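That rule of thumb turns into a quick back-of-envelope check. A sketch; the 2GB headroom figure is my own rough allowance for KV cache and OS overhead:

```shell
# Estimate RAM for a Q4-quantized model: ~1GB per billion parameters,
# plus headroom for KV cache and the OS (headroom figure is a guess).
est_ram_gb() {
  params_b=$1
  echo $(( params_b + 2 ))
}

est_ram_gb 7    # 7B  -> ~9GB:  fits on a 16GB box
est_ram_gb 13   # 13B -> ~15GB: tight on 16GB, comfortable on 32GB
```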

For a homelab that runs 7B-13B models locally while hosting your full automation stack, a Ryzen mini PC with 32GB of RAM is the sweet spot: roughly 15GB for a 13B model under the rule of thumb above, with plenty left over for containers.

Don't overlook used enterprise gear. A refurbished Lenovo ThinkCentre M920q Tiny with a 9th-gen i5 and 32GB RAM sells for $120-150 on eBay. It's too slow for comfortable local inference, but as an orchestration node it's overkill.

Option C: Proxmox virtualization host (~$300-600)

If you want to isolate workloads (and you should), run Proxmox VE as your hypervisor. One physical box, multiple VMs: one for AI/Ollama, one for automation (n8n, OpenClaw), one for monitoring. If a rogue automation script eats all the RAM, your other VMs keep running.

Any of the machines above work as a Proxmox host. The AMD Ryzen options are better here because their iGPU can be passed through to a VM for hardware-accelerated inference, while the host runs headless.

02 Operating System & Virtualization

Install Proxmox VE if you want VM isolation. Install Ubuntu Server 24.04 LTS if you want simplicity. Both work. Proxmox adds overhead but gives you snapshots, live migration (if you later add a second node), and clean separation between workloads.

Proxmox setup (recommended)

# Download Proxmox VE 8.x ISO from proxmox.com
# Flash to USB with balenaEtcher or dd
# Boot, install, set a static IP

# After install, switch to the no-subscription repo (unless you have a subscription):
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" \
  > /etc/apt/sources.list.d/pve-no-subscription.list

# Disable the enterprise repo so apt update stops erroring:
sed -i 's/^deb/# deb/' /etc/apt/sources.list.d/pve-enterprise.list

apt update && apt dist-upgrade -y

Create your first VM for the AI/automation workload:

# From the Proxmox web UI (https://your-ip:8006):
# → Create VM → Ubuntu Server 24.04 ISO
# → 4 cores, 16GB RAM (adjust to your hardware)
# → 100GB disk (thin provisioned)
# → Start after creation

Bare metal Ubuntu (alternative)

If you want things running in 15 minutes instead of 45, skip Proxmox and install Ubuntu Server directly. You lose VM isolation but gain simplicity. Good for single-purpose boxes.

# After installing Ubuntu Server 24.04:
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl git htop tmux

03 The Software Stack

Here's the full stack, from bottom to top. Every piece is open source or has a generous free tier. Total cost for software: $0.

Docker & Docker Compose

Everything runs in containers. No exceptions. Containers give you reproducible deployments, easy rollbacks, and clean dependency isolation.

# Install Docker (official method):
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER

# Log out and back in, then verify:
docker run hello-world

Ollama — Local LLM inference

Ollama turns running local models into a single command. It handles quantization, GPU detection, and memory management, and exposes an OpenAI-compatible API. If your app speaks the OpenAI API, it works with Ollama by changing one URL.

# Install Ollama:
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model:
ollama pull llama3:8b
ollama pull mistral:7b
ollama pull nomic-embed-text  # for embeddings

# Test it:
ollama run llama3:8b "Summarize the benefits of running AI locally in 3 sentences."

# Ollama API is now running on localhost:11434
# OpenAI-compatible endpoint: http://localhost:11434/v1/chat/completions

Ollama automatically uses your GPU if it detects one. On CPU-only systems, expect 5-15 tokens/second for 7B models — slow for chat, but perfectly fine for batch processing and automation.
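Since the endpoint is OpenAI-compatible, any HTTP client works. A minimal curl sketch using the model pulled above:

```shell
# Request body for the OpenAI-compatible endpoint:
payload='{
  "model": "llama3:8b",
  "messages": [{"role": "user", "content": "Reply with one word: pong"}]
}'

# Send it (fails gracefully if Ollama is not up yet):
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$payload" || echo "Ollama not reachable"
```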

n8n — Workflow automation

n8n is the backbone of your automation layer. It connects to 400+ services, has a visual workflow editor, and supports custom JavaScript/Python nodes. Think Zapier, but self-hosted and with no per-execution pricing.

# docker-compose.yml for n8n:
services:
  n8n:
    image: n8nio/n8n:latest
    restart: always
    ports:
      - "5678:5678"
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=your-secure-password
      - N8N_HOST=n8n.yourdomain.com
      - N8N_PROTOCOL=https
      - GENERIC_TIMEZONE=America/New_York
    volumes:
      - n8n_data:/home/node/.n8n

volumes:
  n8n_data:

# Bring the stack up:
docker compose up -d

n8n connects to Ollama natively via its "Chat Model" node. Point it at http://localhost:11434 and select your model. Now your workflows can classify emails, extract data from PDFs, generate responses, and make decisions. All running on your hardware.

OpenClaw — AI assistant gateway

OpenClaw gives you a persistent AI assistant that connects to Telegram, Discord, or any chat platform. It runs cron jobs, manages tools, and maintains memory across sessions. It's the "agent brain" that ties your homelab together.

# Install OpenClaw:
npm install -g openclaw

# Initialize configuration:
openclaw init

# Start the gateway:
openclaw gateway start

OpenClaw can use Ollama as its model backend, or it can call cloud APIs for tasks that need stronger reasoning (Claude, GPT-4). The pattern that works: use local models for high-volume, low-complexity tasks (classification, extraction, summarization), and route complex reasoning to cloud APIs on demand.
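That routing pattern can be sketched as a simple dispatch table. The backend labels here are illustrative, not OpenClaw config syntax:

```shell
# Route by task type: cheap local model for high-volume work,
# cloud API only when the task needs stronger reasoning.
route_model() {
  case "$1" in
    classify|extract|summarize) echo "ollama:llama3:8b" ;;
    *)                          echo "cloud:claude" ;;
  esac
}

route_model classify   # -> ollama:llama3:8b
route_model plan       # -> cloud:claude
```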

PostgreSQL — Structured data store

Your agents need somewhere to store state, results, and historical data. SQLite works for single-agent setups. PostgreSQL works for everything else.

# Add to your docker-compose.yml:
  postgres:
    image: postgres:16
    restart: always
    environment:
      POSTGRES_USER: agent
      POSTGRES_PASSWORD: your-secure-password
      POSTGRES_DB: homelab
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:

Monitoring — know when things break

Agents that run unattended need monitoring. Without it, a failed workflow runs silently for weeks before you notice the data pipeline is stale.

# Minimal monitoring stack (add to docker-compose.yml):
  uptime-kuma:
    image: louislam/uptime-kuma:latest
    restart: always
    ports:
      - "3001:3001"
    volumes:
      - uptime_data:/app/data

Uptime Kuma monitors your services and sends alerts via Telegram, Discord, email, or 20+ other channels. Set up checks for Ollama (http://localhost:11434), n8n (http://localhost:5678), and any custom endpoints your agents expose.
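Beyond HTTP checks from the Kuma UI, a small probe script can feed cron or a push monitor. A sketch; `/api/tags` and `/healthz` are the health endpoints Ollama and n8n expose:

```shell
#!/bin/sh
# Minimal probe: print up/down per service, return curl's status.
check() {
  # $1 = label, $2 = URL
  if curl -sf --max-time 5 "$2" > /dev/null; then
    echo "$1: up"
  else
    echo "$1: down"
    return 1
  fi
}

# || true so one down service doesn't abort the whole sweep:
check ollama http://localhost:11434/api/tags || true
check n8n    http://localhost:5678/healthz   || true
```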

Want 12 ready-to-import n8n workflows?

The AI Automation Starter Kit includes 12 production-tested n8n workflows — lead scoring, content repurposing, email triage, web scraping, and more. Each one works with Ollama or cloud APIs. Import, configure, run.

Get the starter kit — $39

04 Network Architecture

Your homelab sits behind a NAT. Your laptop is at a coffee shop. Your phone is on cellular. You need secure access to your agents from anywhere without exposing ports to the internet.

Tailscale — mesh VPN

Tailscale creates a private network across all your devices using WireGuard. Every device gets a stable IP. No port forwarding, no dynamic DNS, no firewall holes. It's the single best thing you can install on a homelab.

# Install on your homelab server:
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

# Install on your laptop/phone too.
# Now your homelab is reachable at its Tailscale IP from anywhere.

Access n8n at http://100.x.x.x:5678, Ollama at http://100.x.x.x:11434, and Proxmox at https://100.x.x.x:8006 from any device on your tailnet. No public exposure.

Tailscale Serve — HTTPS without the hassle

If you want proper HTTPS with real certificates (useful for OpenClaw's webhook endpoints and n8n's OAuth callbacks):

# Expose n8n over HTTPS on your tailnet (served on the default HTTPS port):
tailscale serve --bg http://localhost:5678

# Expose the OpenClaw gateway on a second HTTPS port:
tailscale serve --bg --https=8443 localhost:18789

Now you have valid HTTPS certificates, automatic renewal, and zero attack surface. The ports are only reachable from your tailnet.

Reverse proxy for multiple services

If you're running 5+ services and want clean URLs, add Caddy as a reverse proxy:

# Caddyfile
n8n.homelab.local {
    reverse_proxy localhost:5678
}

ollama.homelab.local {
    reverse_proxy localhost:11434
}

monitor.homelab.local {
    reverse_proxy localhost:3001
}

Combined with Tailscale's MagicDNS, you also get https://homelab.tail-abc123.ts.net-style URLs that resolve automatically on every device in your tailnet; the .local names above need a matching entry in your own DNS or each machine's hosts file.

05 Building Your First Agent Workflow

Hardware is running. Software is installed. Now make it do something useful. Here's a real workflow: an inbox triage agent that reads incoming emails, classifies them by urgency and category, drafts responses for routine ones, and alerts you about anything that needs human attention.

The architecture

  1. Trigger: n8n polls your email inbox every 5 minutes (IMAP node)
  2. Classify: Send the email subject + first 500 chars to Ollama (Llama 3 8B) with a classification prompt
  3. Route: Based on the classification: urgent goes to Telegram alert, routine gets a draft reply, spam gets archived
  4. Store: Log every email and its classification to PostgreSQL for pattern analysis
  5. Learn: Weekly cron job analyzes the PostgreSQL data and updates the classification prompt with new patterns
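Step 4's log table might look like this. Table and column names are illustrative, and the container name depends on your compose project:

```shell
# Create the email log table (run once):
docker exec -i postgres psql -U agent -d homelab <<'SQL'
CREATE TABLE IF NOT EXISTS email_log (
  id          BIGSERIAL PRIMARY KEY,
  received_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  sender      TEXT,
  subject     TEXT,
  category    TEXT CHECK (category IN ('URGENT','ROUTINE','FYI','SPAM'))
);
SQL
```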

The classification prompt is the key piece:

You are an email classifier. Categorize the following email into exactly one category:
- URGENT: requires human response within 4 hours
- ROUTINE: standard business email, can be auto-drafted
- FYI: informational, no response needed
- SPAM: promotional or unsolicited

Respond with ONLY the category name, nothing else.

Subject: {{$json.subject}}
From: {{$json.from}}
Body: {{$json.body.substring(0, 500)}}

Llama 3 8B handles this reliably. Since the answer is a single word, each classification finishes in seconds even on CPU; it's a constrained-output task, exactly what smaller models excel at.
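Before wiring the prompt into n8n, it's worth smoke-testing it from a shell. Sample values below; assumes the Ollama setup from section 03:

```shell
subject="Server down in production"
body="The API has been returning 500 errors since 09:00."

prompt="You are an email classifier. Categorize the following email into
exactly one category: URGENT, ROUTINE, FYI, or SPAM.
Respond with ONLY the category name, nothing else.
Subject: $subject
Body: $body"

# Expect a single word back, e.g. URGENT:
ollama run llama3:8b "$prompt" || echo "(Ollama not running)"
```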

More workflow ideas

Lead scoring, content repurposing, web scraping, and report drafting all follow the same trigger-classify-route-store pattern. Each runs 24/7 on your homelab, costs nothing per execution, and keeps your data local.

06 Security Considerations

Your homelab is running agents that read your email, access your APIs, and make decisions. Secure it accordingly.

Basics that matter

Use SSH keys instead of passwords, keep the OS patched, and replace every placeholder password in the compose files above with something generated by a password manager.

Agent-specific security

Give each agent its own scoped API keys with the minimum permissions its workflow needs, and keep every service off the public internet:

# Enable UFW and lock down to Tailscale:
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow in on tailscale0
sudo ufw enable

# Now only Tailscale traffic reaches your services

07 Maintenance & Scaling

A homelab that works on day one but breaks on day thirty is useless. Build maintenance into the system.

Backups

Proxmox has built-in backup scheduling. For Docker volumes, use a cron job:

#!/bin/bash
# Back up all Docker volumes nightly (run from cron):
BACKUP_DIR=/mnt/backup/docker-volumes
DATE=$(date +%Y%m%d)
for vol in $(docker volume ls -q); do
    docker run --rm -v "$vol":/source -v "$BACKUP_DIR":/backup \
        alpine tar czf "/backup/${vol}_${DATE}.tar.gz" -C /source .
done

# Keep 7 days of backups:
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +7 -delete
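The mirror-image restore is worth scripting too; backups you haven't test-restored aren't really backups. A sketch, with the volume name and date suffix matching the script above:

```shell
#!/bin/bash
# Restore the newest archive for one volume. Stop the consuming
# container first (e.g. docker compose stop n8n).
BACKUP_DIR=/mnt/backup/docker-volumes
vol=n8n_data

# Newest archive sorts last thanks to the YYYYMMDD suffix:
latest=$(ls "$BACKUP_DIR/${vol}"_*.tar.gz 2>/dev/null | sort | tail -n1) || true
echo "restoring $vol from: ${latest:-<none found>}"

[ -n "$latest" ] && docker run --rm -v "$vol":/restore -v "$BACKUP_DIR":/backup \
    alpine tar xzf "/backup/$(basename "$latest")" -C /restore || true
```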

Model updates

New model releases drop monthly. Update when it matters, not compulsively:

# Check for model updates:
ollama list

# Pull a newer version:
ollama pull llama3:8b

# Test before swapping your production workflows
ollama run llama3:8b "Test classification prompt here"

Scaling up

When one box isn't enough: join a second node to your Proxmox cluster, move Ollama onto a machine with a GPU, or give the database its own VM. The stack above splits cleanly along those lines.

What We Didn't Cover

This guide gets you from zero to a running AI agent homelab. But a production-grade setup goes deeper: GPU passthrough, backup automation, multi-node clustering, and more.

Get the Infrastructure Guides Bundle

Step-by-step guides for Proxmox, Docker, Tailscale, and monitoring — the full infrastructure layer under your AI agents. Includes GPU passthrough, backup automation, and multi-node clustering.

Download the bundle — $24

Start Small, Build Up

You don't need to build the whole stack on day one. Start with a $160 mini PC running Docker, n8n, and one automation workflow. Get that working, get it useful, then add Ollama. Then add Proxmox when you outgrow the single-box setup. Then add a second node when your homelab addiction truly takes hold.

The point isn't to build the perfect infrastructure. The point is to own your AI stack, control your data, and stop paying per-API-call for tasks a local model handles fine. Every workflow you move to your homelab is one that runs forever at zero marginal cost.

The hardware is cheap. The software is free. The only cost is your time. And if you're reading a homelab blog post, you were probably going to spend that time tinkering anyway.