You built something with Cursor. Or Claude Code. Or bolt.new. It works on your laptop. The demo looks good. You push it to a VPS or Railway or Fly.io and within 48 hours, something breaks. Maybe the app crashes silently. Maybe your Stripe webhook stops processing. Maybe you wake up to a $400 cloud bill because your app kept spawning connections to a database that wasn't there.

This is the gap between localhost and production that AI coding tools don't warn you about. The code runs, but it's not production-ready. It's missing the boring infrastructure that keeps software alive under real conditions: error handling, health checks, graceful shutdown, logging, rate limiting.

Our previous guide covered getting your vibe-coded app deployed and security-scanned. This post picks up where that one left off. We're covering the eight most common production failures in AI-generated code, with real examples and fixes you can apply today.

01 Environment Variable Leaks

AI coding tools love to hardcode secrets. You tell Claude "add Stripe integration" and it generates a file with const STRIPE_KEY = "sk_test_..." right in the source. Or it creates a .env file and doesn't add it to .gitignore. Or it reads from process.env but falls back to a hardcoded default that includes a real API key from the training data.

This isn't hypothetical. GitHub's secret scanning team reported a 28% increase in exposed API keys in 2025, with a significant portion traced to AI-assisted repositories.

The fix

First, audit every file in your project for hardcoded secrets:

# Find hardcoded API keys, tokens, passwords
grep -rn "sk_live\|sk_test\|AKIA\|ghp_\|password\s*=" --include="*.js" --include="*.ts" --include="*.py" .

# Check if .env is tracked by git
git ls-files | grep -i env

Then enforce the pattern at the code level. Every secret should come from an environment variable with no fallback value. If the variable is missing, the app should crash on startup with a clear error:

// Good: crash if missing
const STRIPE_KEY = process.env.STRIPE_SECRET_KEY;
if (!STRIPE_KEY) {
  console.error("FATAL: STRIPE_SECRET_KEY not set");
  process.exit(1);
}

// Bad: silent fallback
const STRIPE_KEY = process.env.STRIPE_SECRET_KEY || "sk_test_default";

Add a validation function that runs at startup and checks every required variable at once. Fail fast, fail loud.

function validateEnv(required) {
  const missing = required.filter(key => !process.env[key]);
  if (missing.length > 0) {
    console.error(`Missing env vars: ${missing.join(", ")}`);
    process.exit(1);
  }
}

validateEnv([
  "DATABASE_URL",
  "STRIPE_SECRET_KEY",
  "SESSION_SECRET",
  "REDIS_URL"
]);

02 Missing Error Handling

AI-generated code tends to follow the happy path. Ask it to build an API endpoint and you'll get the success case, beautifully structured. But what happens when the database is down? When the external API returns a 500? When the request body is malformed JSON?

Usually: an unhandled exception crashes the process. In Node.js, that means the entire server goes down. In Python, the worker dies and, depending on your setup, it might not respawn.

The fix

Wrap every external call (database queries, API requests, file reads) in error handling. Not generic catches that swallow everything, but specific handlers that log the error and return an appropriate response:

// Express example
app.post("/api/payments", async (req, res) => {
  try {
    const { amount, currency } = req.body;

    if (!amount || !currency) {
      return res.status(400).json({ error: "amount and currency required" });
    }

    const charge = await stripe.charges.create({ amount, currency });
    res.json({ id: charge.id, status: charge.status });
  } catch (err) {
    if (err.type === "StripeCardError") {
      return res.status(402).json({ error: err.message });
    }
    console.error("Payment failed:", err.message, err.stack);
    res.status(500).json({ error: "Payment processing failed" });
  }
});

Add a global error handler as a safety net. This catches anything your route handlers miss:

// Global error handler (Express) - must have 4 params
app.use((err, req, res, next) => {
  console.error(`Unhandled error: ${req.method} ${req.path}`, err.stack);
  if (res.headersSent) return next(err); // response already started; delegate
  res.status(500).json({ error: "Internal server error" });
});

// Catch unhandled promise rejections
process.on("unhandledRejection", (reason, promise) => {
  console.error("Unhandled rejection at:", promise, "reason:", reason);
});

// Catch uncaught exceptions
process.on("uncaughtException", (err) => {
  console.error("Uncaught exception:", err);
  process.exit(1); // let the process manager restart you
});
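One caveat: in Express 4, a rejected promise inside an async route handler never reaches the error middleware; it becomes an unhandled rejection instead (Express 5 forwards it automatically). A small wrapper — a common community pattern, not something shown above — routes rejections to next():

```javascript
// asyncHandler: wrap an async route handler so a rejected promise
// is passed to next(), reaching the global error middleware instead
// of becoming an unhandled rejection.
function asyncHandler(fn) {
  return (req, res, next) => {
    Promise.resolve(fn(req, res, next)).catch(next);
  };
}

// Usage:
//   app.get("/api/orders", asyncHandler(async (req, res) => { /* ... */ }));
```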

03 No Health Checks

Your app is running. Is it working? Without a health check endpoint, your load balancer, container orchestrator, or uptime monitor has no way to know. The process might be alive but deadlocked. The database connection might be stale. The event loop might be blocked.

AI tools almost never generate health check endpoints. They're not part of the "build me a todo app" prompt. But every production service needs one.

The fix

Add two endpoints: a shallow health check (is the process alive?) and a deep health check (are the dependencies working?):

// Shallow - for load balancer probes
app.get("/healthz", (req, res) => {
  res.status(200).json({ status: "ok", uptime: process.uptime() });
});

// Deep - checks actual dependencies
app.get("/readyz", async (req, res) => {
  const checks = {};
  let healthy = true;

  // Check database
  try {
    await db.query("SELECT 1");
    checks.database = "ok";
  } catch (err) {
    checks.database = err.message;
    healthy = false;
  }

  // Check Redis
  try {
    await redis.ping();
    checks.redis = "ok";
  } catch (err) {
    checks.redis = err.message;
    healthy = false;
  }

  const status = healthy ? 200 : 503;
  res.status(status).json({ status: healthy ? "ok" : "degraded", checks });
});
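One refinement to consider (my addition, not part of the original): a hung database socket can make /readyz itself hang, which defeats the point of a probe. Racing each dependency check against a timer turns a stalled check into a fast, explicit failure:

```javascript
// Race a dependency check against a timer so a hung connection
// produces a quick, labeled error instead of a stalled probe.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} check timed out after ${ms}ms`)),
      ms
    );
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Inside /readyz:
//   await withTimeout(db.query("SELECT 1"), 2000, "database");
```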

Configure your platform to use these. On Railway, set the health check path. In Docker, add a HEALTHCHECK instruction. On Kubernetes, configure liveness and readiness probes. On a plain VPS, point your uptime monitor (UptimeRobot, HetrixTools) at /healthz.
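For the Docker case, a minimal HEALTHCHECK wired to the shallow endpoint might look like this (a sketch that assumes the app listens on port 3000 and curl exists in the image):

```dockerfile
# Probe the shallow endpoint every 30s; after 3 consecutive
# failures the container is marked unhealthy and can be replaced.
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
  CMD curl -fsS http://localhost:3000/healthz || exit 1
```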

04 No Logging

Console.log is not logging. AI-generated apps use console.log for debugging during development and that's what ships to production. No timestamps, no log levels, no structured format, no way to search or filter.

When something breaks at 3 AM, you need to reconstruct what happened. "User 4521 hit /api/orders at 03:14:22, got a 500, the database query timed out after 30 seconds." You can't do that with console.log("error:", err).

The fix

Use a structured logger. In Node.js, pino is the fastest option. In Python, structlog or the built-in logging module with JSON formatting:

// Node.js with pino
const pino = require("pino");
const logger = pino({
  level: process.env.LOG_LEVEL || "info",
  transport: process.env.NODE_ENV === "development"
    ? { target: "pino-pretty" }
    : undefined
});

// Usage
logger.info({ userId: user.id, path: req.path }, "Request received");
logger.error({ err, orderId }, "Order processing failed");

# Python with structlog
import structlog

logger = structlog.get_logger()

logger.info("request_received", user_id=user.id, path=request.path)
logger.error("order_failed", order_id=order_id, error=str(e))

Structured logs (JSON format) let you pipe output to Loki, Datadog, CloudWatch, or any log aggregator. You can query "show me all errors for user 4521 in the last hour" instead of grepping through megabytes of unstructured text.

Add request logging middleware so every HTTP request gets a log entry with method, path, status code, and duration:

app.use((req, res, next) => {
  const start = Date.now();
  res.on("finish", () => {
    const duration = Date.now() - start;
    logger.info({
      method: req.method,
      path: req.path,
      status: res.statusCode,
      duration_ms: duration
    }, "http_request");
  });
  next();
});

Get the Full Production Readiness Checklist

This post covers the eight biggest failures. The Vibe Coding Deployment Guide includes 60+ checks organized by priority, with copy-paste configs for Docker, Railway, Fly.io, and plain VPS deployments. Includes a pre-deploy script that catches these issues before they hit production.

Get the deployment guide — $34

05 Hardcoded Localhost URLs

This one is embarrassingly common. The AI generates fetch("http://localhost:3000/api/data") in your frontend code. Or mongoose.connect("mongodb://localhost:27017/mydb") in your backend. On your machine, it works. In production, it's connecting to... nothing. Or worse, to whatever happens to be running on port 3000 of your production server.

Some variations are subtle. The AI might hardcode http://localhost in a CORS origin list. Or set a cookie domain to localhost. Or configure a webhook callback URL pointing at 127.0.0.1.

The fix

Search your entire codebase for localhost references:

grep -rn "localhost\|127\.0\.0\.1\|0\.0\.0\.0" \
  --include="*.js" --include="*.ts" --include="*.jsx" \
  --include="*.tsx" --include="*.py" --include="*.env*" .

Every URL that changes between environments must come from configuration:

// Bad
const API_URL = "http://localhost:3000";

// Good
const API_URL = process.env.API_URL;
if (!API_URL) throw new Error("API_URL not configured");

// For frontend builds (Vite)
const API_URL = import.meta.env.VITE_API_URL;

Same for database connections, Redis URLs, webhook callbacks, CORS origins. If it contains a host or port, it belongs in an environment variable.
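One way to enforce this across a codebase (a sketch of the pattern, not code from the original): a single config module that reads and validates every environment-dependent value in one place, so nothing else ever touches process.env directly:

```javascript
// config.js - the only file that reads the environment. Importers
// get validated values, or the app refuses to start.
function loadConfig(env = process.env) {
  const requireEnv = (name) => {
    const value = env[name];
    if (!value) throw new Error(`${name} not configured`);
    return value;
  };
  return {
    apiUrl: requireEnv("API_URL"),
    databaseUrl: requireEnv("DATABASE_URL"),
    corsOrigins: requireEnv("CORS_ORIGINS").split(","),
  };
}

module.exports = { loadConfig };
```

Now a missing URL fails at boot, not at the first request that needs it.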

06 Missing Rate Limiting

AI-generated APIs ship without rate limiting. Every endpoint is wide open. Someone can hit your /api/login endpoint 10,000 times per second to brute-force passwords. Or hammer your /api/generate endpoint that calls OpenAI, running up a bill that'll make you reconsider your career choices.

In the vibe coding context, this is especially dangerous because AI tools often generate endpoints that call other paid APIs. No rate limit on your endpoint means no rate limit on your OpenAI/Anthropic/Replicate spend.

The fix

Add rate limiting at the application level. In Express:

const rateLimit = require("express-rate-limit");

// General API limit
const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // 100 requests per window per IP
  standardHeaders: true,
  legacyHeaders: false,
  message: { error: "Too many requests. Try again in 15 minutes." }
});

// Strict limit for auth endpoints
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5,
  message: { error: "Too many login attempts. Try again in 15 minutes." }
});

app.use("/api/", apiLimiter);
app.use("/api/login", authLimiter);
app.use("/api/register", authLimiter);

For endpoints that call paid external APIs, add a per-user daily limit too. Store counts in Redis so they survive restarts:

async function checkDailyLimit(userId, limit = 50) {
  const key = `usage:${userId}:${new Date().toISOString().slice(0, 10)}`;
  const count = await redis.incr(key);
  if (count === 1) await redis.expire(key, 86400);
  if (count > limit) {
    throw new Error(`Daily limit of ${limit} requests exceeded`);
  }
  return count;
}

07 No Graceful Shutdown

When you deploy a new version, the old process receives a SIGTERM signal. What happens next depends on whether your app handles it. Without a graceful shutdown handler, the process dies immediately. Any in-flight HTTP requests get dropped. Database transactions get aborted mid-write. WebSocket connections vanish without a close frame. Background jobs disappear.

AI tools never add shutdown handlers. Not once have I seen Cursor or Claude Code generate a SIGTERM handler unprompted.

The fix

const server = app.listen(PORT, () => {
  logger.info({ port: PORT }, "Server started");
});

async function shutdown(signal) {
  logger.info({ signal }, "Shutdown signal received");

  // Force exit after 10s if graceful shutdown stalls. Registered
  // first so a hung cleanup step can't block it; unref() lets the
  // process exit sooner if cleanup finishes.
  setTimeout(() => {
    logger.error("Forced exit after timeout");
    process.exit(1);
  }, 10000).unref();

  // Stop accepting new connections and wait for in-flight requests
  await new Promise((resolve) => server.close(resolve));
  logger.info("HTTP server closed");

  // Close database connections
  try {
    await db.end();
    logger.info("Database connections closed");
  } catch (err) {
    logger.error({ err }, "Error closing database");
  }

  // Close Redis
  try {
    await redis.quit();
    logger.info("Redis connection closed");
  } catch (err) {
    logger.error({ err }, "Error closing Redis");
  }

  process.exit(0);
}

process.on("SIGTERM", () => shutdown("SIGTERM"));
process.on("SIGINT", () => shutdown("SIGINT"));

The 10-second timeout is a safety net. If a database connection hangs during cleanup, you don't want the old process sitting there forever, blocking your deployment.

In Docker, make sure SIGTERM actually reaches your Node process. Use the exec (JSON array) form of CMD rather than the shell form, and add tini as a minimal init so signals are forwarded and zombie processes get reaped:

# Dockerfile
FROM node:20-slim
RUN apt-get update && apt-get install -y tini && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY . .
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["node", "server.js"]

08 Missing Database Migrations

The AI builds your schema by running CREATE TABLE statements directly, or it uses an ORM's sync({ force: true }) which drops and recreates tables on every restart. On localhost, that's fine because your test data is disposable. In production, that's data loss.

I've seen this pattern in Prisma projects where the AI puts npx prisma db push in the start script. That works for prototyping. In production it keeps no migration history, and a renamed field in your schema becomes a drop-and-recreate of the column, taking the data with it.

The fix

Use proper migration files that version your schema changes. With Prisma:

# Create a migration and apply it to your development database
npx prisma migrate dev --name add_user_email_index

# Apply pending migrations in production (no prompts, no schema drift)
npx prisma migrate deploy

With raw SQL (using a tool like dbmate or golang-migrate):

# Create a migration
dbmate new add_orders_table

# This creates a file like:
# db/migrations/20260221000000_add_orders_table.sql

-- db/migrations/20260221000000_add_orders_table.sql

-- migrate:up
CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  user_id INTEGER NOT NULL REFERENCES users(id),
  total_cents INTEGER NOT NULL,
  status VARCHAR(20) DEFAULT 'pending',
  created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- migrate:down
DROP TABLE orders;

Every migration has an up and a down. They run in order. They're tracked in a migrations table in your database. You can roll back if something goes wrong.

Add migration execution to your deployment pipeline, before the new code starts serving traffic:

# In your deploy script or CI/CD pipeline
npx prisma migrate deploy  # or: dbmate up
node server.js

Never use sync({ force: true }) or db push in production. If your start script contains either of those commands, fix it now.
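To catch this automatically, a pre-deploy check can grep the start script for destructive commands (a sketch; adjust the patterns to your stack):

```shell
# Fail fast if a package.json start script contains a schema command
# that should never run in production.
check_start_script() {
  if grep -qE "db push|sync\(\{ *force: *true" "$1"; then
    echo "FATAL: destructive schema command found in $1" >&2
    return 1
  fi
  return 0
}
```

Call it from your deploy script: `check_start_script package.json || exit 1`.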

The Production Readiness Audit

Before your next deploy, run through this checklist. It takes 20 minutes and catches the issues that take 20 hours to debug in production:

- Secrets: nothing hardcoded, .env ignored by git, required variables validated at startup
- Error handling: every external call wrapped, global error handler, unhandledRejection and uncaughtException hooks
- Health checks: /healthz and /readyz exist and are wired into your platform's probes
- Logging: structured logger with levels, request logging middleware, no bare console.log
- URLs: no localhost or 127.0.0.1 outside local configuration
- Rate limiting: general API limit, strict limits on auth endpoints, per-user caps on anything that calls paid APIs
- Graceful shutdown: SIGTERM handler with a forced-exit timeout, signals actually reaching the process in Docker
- Migrations: versioned migration files applied before new code serves traffic, no db push or force sync

None of this is optional. Skip any of these and you'll spend a weekend debugging a production outage instead of building features.

What We Didn't Cover

This post focused on the failures we see most often in vibe-coded apps. The full production readiness picture extends further than these eight fixes, into areas like monitoring, backups, and CI/CD.

Ship with Confidence

The Vibe Coding Deployment Guide covers all eight failures from this post plus 50+ additional checks. Includes Docker configs with CI/CD templates, a pre-deploy audit script, plus platform-specific setup for Railway and Fly.io (with VPS instructions too).

Get the deployment guide — $34

Stop Shipping Demos

Vibe coding is fast. You can go from idea to working prototype in an afternoon. But a prototype that runs on localhost is not production software. The gap between "it works on my machine" and "it runs reliably for paying users" is filled with the eight problems in this post.

The good news: every fix here is straightforward. An hour of work per issue, maybe less. Add env validation, error handling, health checks, logging, rate limiting, graceful shutdown, and proper migrations. Rip out every hardcoded URL. Run the audit checklist before every deploy.

Your AI tool wrote the feature code. The production infrastructure is on you.