AI-powered apps break traditional hosting assumptions. A standard WordPress page request takes 50–200ms of PHP execution. An AI-enhanced page (one that calls OpenAI, Anthropic, or a local model) takes 800ms to 8 seconds. Under load, or once responses run long, many shared hosts kill those requests before they complete.
I've been deploying AI-powered apps for clients since early 2025. Here's what I've learned about hosting infrastructure for this new category of application.
What Makes AI Apps Different
Long-running requests. Traditional hosting assumes requests complete in under 30 seconds. AI inference calls can take longer, especially for complex prompts or long streaming responses. Many shared hosts set PHP execution time limits of 30–60 seconds, and web server or proxy timeouts in front of PHP can be shorter still, so long AI requests get killed mid-response.
Memory spikes. If you're running a local model (Llama 3, Mistral) or even just loading a large embeddings library, you'll need 2–8GB RAM per process. That's enterprise-tier on shared hosting. It's standard on a $20/mo VPS.
Cold start latency. Serverless functions (like Vercel's edge functions) have cold starts. For AI apps, a cold start + AI inference time = a very unhappy user. You want persistent Node.js or Python processes.
Outbound API calls. Your server needs to make HTTPS requests to OpenAI/Anthropic/etc. Many cheap hosts firewall outbound connections. Always test this before committing to a host.
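The outbound check is easy to script before you commit to a host. Here is a minimal sketch in Python that attempts a TLS handshake with each provider on port 443. The names `API_HOSTS`, `can_reach`, and `report` are my own for illustration, and the host list is an assumption you should adjust to the providers you actually call.

```python
import socket
import ssl

# Assumed endpoints for illustration; replace with the APIs your app calls.
API_HOSTS = ["api.openai.com", "api.anthropic.com"]


def can_reach(host, port=443, timeout=5.0, connect=socket.create_connection):
    """Return (True, tls_version) if a TLS handshake with host:port succeeds,
    otherwise (False, error description)."""
    try:
        context = ssl.create_default_context()
        with connect((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return True, tls.version()
    except OSError as exc:
        # OSError covers DNS failures, refused connections, timeouts,
        # and TLS errors (ssl.SSLError is a subclass of OSError).
        return False, f"{type(exc).__name__}: {exc}"


def report(hosts=API_HOSTS):
    """Print one line per host; run this from the server you are evaluating."""
    for host in hosts:
        ok, detail = can_reach(host)
        print(f"{host}: {'reachable' if ok else 'BLOCKED'} ({detail})")
```

Run `report()` over SSH on the host itself, not from your laptop. A handshake that succeeds only proves the port is open; it does not validate your API key or rate limits.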
Tested Configurations
I tested three hosting configurations with an identical AI-powered app: a WordPress site with an integrated GPT-4o chatbot widget, using streaming responses.
Test app: WordPress + WooCommerce + custom chatbot plugin calling GPT-4o API (streaming).
Metric: Time from user message to first streamed token appearing on screen.
Secondary metric: Whether the host timed out 8-second AI requests under load.
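If you want to reproduce the first-token metric yourself, the measurement can be sketched as below. This is not the harness used for the numbers in this post. It is a minimal Python illustration that times any iterable of streamed text chunks, and it measures arrival at the client, not rendering on screen. The function name `time_to_first_token` and the `fake_stream` stand-in are hypothetical.

```python
import time


def time_to_first_token(chunks, clock=time.perf_counter):
    """Consume a stream of text chunks and time the first non-empty one.

    `chunks` should be a lazy iterable (for example a generator wrapping a
    streaming HTTP response) so that the wait for the first chunk is included.
    Returns (seconds_to_first_chunk or None, seconds_total, full_text).
    """
    start = clock()
    first = None
    parts = []
    for chunk in chunks:
        if not chunk:
            continue  # skip keep-alive or empty chunks
        if first is None:
            first = clock() - start
        parts.append(chunk)
    total = clock() - start
    return first, total, "".join(parts)


def fake_stream():
    """Stand-in for a real streaming response, used only to try the timer."""
    time.sleep(0.05)
    yield "Hello"
    time.sleep(0.05)
    yield ", world"
```

Calling `time_to_first_token(fake_stream())` returns the delay before the first chunk, the total duration, and the joined text. Swap `fake_stream()` for a generator over your real streaming response to compare hosts.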
The Results
Cloudways (DigitalOcean 2GB) — 98ms to first token
Cloudways on their $14/mo DigitalOcean 2GB plan handled our AI app without any configuration changes. PHP 8.3, a 300-second execution time limit by default, outbound HTTP works out of the box.
The 2GB RAM plan was sufficient for our chatbot workload. We never hit memory limits during testing. For heavier workloads (local model inference, large vector search), upgrade to the 4GB plan at $22/mo.
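For the heavier workloads, it helps to measure before you pick a plan. The sketch below is an assumption-laden Python example: it reads the peak resident memory of the current process (say, a worker that has just loaded an embeddings library) and does rough arithmetic against a plan size. `peak_rss_mb`, `fits_plan`, and the 512MB reserve for the OS, web server, and database are illustrative choices, not measured values. The `resource` module is Unix-only.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident memory of the current process, in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def fits_plan(plan_ram_mb, workers=1, reserved_mb=512):
    """Rough check: do `workers` copies of this process fit in the plan?

    reserved_mb is an assumed allowance for everything else on the box.
    Returns (fits, estimated_mb_needed).
    """
    needed = peak_rss_mb() * workers + reserved_mb
    return needed <= plan_ram_mb, needed
```

Call `fits_plan(2048, workers=4)` after your worker has loaded its model or library. If it returns `False`, that is a hint to look at the 4GB plan.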
Verdict: Best value for AI-enhanced WordPress. Easy setup, managed environment, no server expertise required.
Kinsta — 91ms to first token
Kinsta's Google Cloud infrastructure handled our AI workload marginally faster than Cloudways — 91ms vs 98ms to first token. Their C2 compute-optimized instances are genuinely better for CPU-intensive work.
The downside: Kinsta's PHP execution limit requires a support ticket to raise. It's not a blocker, but it's friction. Their staging environments made testing different AI configurations fast.
Verdict: Best performance for AI-enhanced WordPress. Worth the premium for client work.
Hetzner VPS (CX22, €3.79/mo) — 108ms to first token
Hetzner is the surprise. Their €3.79/mo CX22 VPS (2 vCPU, 4GB RAM, Nuremberg datacenter) trailed Cloudways and Kinsta by only 10–17ms to first token in this test, and it outperforms $30/mo managed hosting options on raw performance.
The catch: zero management. You're configuring Nginx, PHP-FPM, SSL, and firewall rules yourself. For developers comfortable with server management, it's the best performance-per-euro option in Europe.
Verdict: Best for developers who want to self-host AI apps at minimal cost.
What About Serverless / Edge?
Vercel's edge functions are excellent for AI streaming. The streaming API support is best-in-class. But edge functions have a 25MB size limit — which means you can't run local models, can't include large libraries, and can't do anything that needs persistent server state.
For AI apps that only call external APIs (OpenAI, Anthropic), Vercel works well. For anything more complex — local models, database connections, file processing — you need a proper server.
Hosting Recommendations by App Type
| App Type | Recommended Host | Why |
|---|---|---|
| AI chatbot on WordPress | Cloudways | Easy setup, good PHP limits, managed |
| High-performance AI SaaS | Kinsta | Google Cloud infrastructure |
| Self-hosted LLM (Ollama) | Hetzner VPS | High RAM, affordable GPU options |
| Next.js AI app | Vercel + Cloudways | Edge for frontend, Cloudways for API |
| Python FastAPI + AI | Cloudways or Hetzner | Node/Python support, persistent processes |
The One Thing That Kills AI Apps
Execution time limits. Before deploying any AI app, test this:
```shell
curl -o /dev/null -s -w "%{time_total}" -X POST https://yoursite.com/wp-json/ai/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Write me a 500-word essay about hosting"}'
```
If the response takes longer than your host's execution limit, users get an error. Check your host's max_execution_time PHP setting. Cloudways defaults to 300 seconds. Many shared hosts default to 30.
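The same check can be automated so you can run it repeatedly or under load. Below is a minimal Python sketch of the curl test above. `time_request` and `headroom` are hypothetical helper names, and the 50% "tight" threshold is an arbitrary assumption, not a recommendation from any host.

```python
import json
import time
import urllib.request


def time_request(url, payload, timeout_s):
    """POST JSON to url and return elapsed seconds for the full response.

    Raises on HTTP errors and socket timeouts. Note that timeout_s applies
    to each blocking socket operation, not to the request as a whole.
    """
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        resp.read()  # wait for the whole body, like curl's time_total
    return time.perf_counter() - start


def headroom(elapsed_s, limit_s, safety=0.5):
    """Classify a measured duration against the host's execution limit."""
    if elapsed_s >= limit_s:
        return "over limit"
    if elapsed_s >= limit_s * safety:
        return "tight"
    return "ok"
```

For example, an 8-second response gives `headroom(8, 30)` as `"ok"`, while a 20-second response against the same 30-second limit comes back `"tight"`. Feed it the result of `time_request("https://yoursite.com/wp-json/ai/chat", {"message": "..."}, timeout_s=60)` with your own endpoint.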
That single configuration difference is worth more than any other hosting feature for AI apps.