Your agent framework says it can't reach Ollama. The model was running five minutes ago. Now it's "connection refused." Here are the nine causes, sorted by how often they're the actual problem, with a copy-paste fix for each.
Skip the Ollama connection debugging.
Connect your model by API or your own endpoint and let BetterClaw handle the plumbing. BYOK across 28+ providers, zero markup. Free forever, not a trial. Start free → No credit card · No Docker networking · No port binding
It was 11 PM. The agent had been running fine all day. Qwen 3.6 on Ollama, classifying emails, posting digests to Slack. Solid.
Then the Slack digest didn't arrive. I checked the agent logs. Error: fetch failed. Checked Ollama. Connection refused. Ran ollama ps. Nothing. The service had silently died because my laptop went to sleep for ten minutes and Ollama didn't recover on wake.
This is the most frustrating class of agent errors because everything looks right. The model is pulled. The config is correct. It was working an hour ago. And now... connection refused.
Here's every Ollama fetch failed and connection error variant, why it happens, and the copy-paste fix. Bookmark this page. You'll be back.
The 30-second diagnostic (run this first)

Before trying nine different fixes, run these three commands:
# 1. Is Ollama running, and is a model loaded?
ollama ps
# 2. Is the server listening?
curl http://localhost:11434/api/tags
# 3. Is the model pulled?
ollama list
- If ollama ps returns nothing or fails to connect: Ollama isn't running. Go to Fix 1.
- If curl returns "connection refused": The server process exists but isn't listening. Go to Fix 2 or Fix 3.
- If curl returns a response but your agent can't connect: The URL your agent uses is wrong. Go to Fix 4 or Fix 5.
- If everything looks fine but it's slow or timing out: Go to Fix 6 or Fix 7.
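The decision tree above can be sketched as a small shell helper. This is a minimal sketch, not an official Ollama tool: suggest_fix is a hypothetical function name, and the mapping relies on curl's documented exit codes (6 for an unresolved hostname, 7 for a failed connection, 28 for a timeout, 35 for a failed TLS handshake).

```shell
# Map a curl exit code to the fix worth trying first.
# suggest_fix is a hypothetical helper, not part of Ollama.
suggest_fix() {
  case "$1" in
    0)  echo "Server responds: check the URL your agent uses (Fix 4 or Fix 5)" ;;
    6)  echo "Hostname did not resolve: check the URL your agent uses (Fix 4)" ;;
    7)  echo "Connection refused: start Ollama or check binding and port (Fix 1, 2 or 3)" ;;
    28) echo "Timed out: check model size and context window (Fix 6 or Fix 7)" ;;
    35) echo "TLS handshake failed: use http:// instead of https:// (Fix 5)" ;;
    *)  echo "Unrecognized curl exit code $1: work through the fixes in order" ;;
  esac
}

# Usage:
#   curl -s -o /dev/null --max-time 10 http://localhost:11434/api/tags
#   suggest_fix $?
```

Run the curl line from wherever your agent runs (inside the container, if it lives in Docker), since that is the network path that actually matters.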
Fix 1: Ollama service isn't running (the #1 cause)
Symptoms: ollama ps returns nothing. curl localhost:11434 returns "connection refused." Your agent logs show "fetch failed" or "ECONNREFUSED."
Why it happens: Ollama crashed, was never started, or died after laptop sleep/restart. On macOS, the Ollama app might have quit. On Linux, the systemd service might have failed.
Copy-paste fix:
# macOS: Relaunch the Ollama app
# Or from terminal:
ollama serve &
# Linux: Restart the service
sudo systemctl restart ollama
sudo systemctl status ollama
# Verify it's running
curl http://localhost:11434/api/tags
Make it permanent (Linux):
# Enable auto-start on boot
sudo systemctl enable ollama
On macOS, add Ollama to Login Items (System Settings → General → Login Items; on older macOS versions, System Preferences → Users & Groups → Login Items) so it starts automatically.
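Auto-start covers reboots, but not the case where the server dies after sleep. A small health check can cover that. This is a minimal sketch, assuming curl is installed and Ollama uses the default port; ollama_up and ensure_ollama are hypothetical helper names, and on Linux with systemd you would likely swap the ollama serve line for sudo systemctl restart ollama.

```shell
# Hypothetical helpers: restart Ollama if the HTTP server stops answering.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Succeeds (exit 0) only if the server answers within 5 seconds.
ollama_up() {
  curl -s -f -o /dev/null --max-time 5 "$OLLAMA_URL/api/tags"
}

# Start the server if it is down, then poll once per second
# until it answers or the retry budget (default 10) runs out.
ensure_ollama() {
  tries="${1:-10}"
  ollama_up && return 0
  ollama serve >/dev/null 2>&1 &
  while [ "$tries" -gt 0 ]; do
    sleep 1
    ollama_up && return 0
    tries=$((tries - 1))
  done
  return 1
}
```

Calling ensure_ollama from cron, or right before each agent run, would have caught the sleep/wake failure described at the top of this page.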
Fix 2: Ollama is running but bound to the wrong address
Symptoms: ollama serve shows it's running, but curl localhost:11434 returns "connection refused." Or it works on localhost but your agent (running in Docker) can't reach it.
Why it happens: By default, Ollama binds to 127.0.0.1:11434. This means only processes on the same machine can connect. If your agent runs in Docker, a VM, or a different machine, it can't reach a server that only listens on the host's 127.0.0.1.
Copy-paste fix:
# Linux: Set Ollama to listen on all interfaces
sudo systemctl edit ollama
# Add these lines:
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
# Restart
sudo systemctl restart ollama
# macOS: Set the environment variable
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
# Then restart Ollama
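Verify: on the machine running Ollama, curl http://localhost:11434/api/tags should still return your model list. From another machine or a container, the same request against the host's IP address (or host.docker.internal) should now succeed too.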
The single most common reason agents in Docker can't reach Ollama: the server is bound to 127.0.0.1 (localhost only). Agents in Docker need 0.0.0.0 or the host IP.
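To see which case you're in, check the address the server is actually listening on. A minimal sketch: bind_scope is a hypothetical helper that classifies a listening address, and the usage line assumes Linux with ss available (on macOS, lsof -i :11434 shows the same information in a different layout).

```shell
# Hypothetical helper: classify a listening address such as "127.0.0.1:11434".
bind_scope() {
  case "$1" in
    "")                          echo "not listening" ;;
    127.*|localhost:*|"[::1]:"*) echo "loopback only" ;;
    0.0.0.0:*|"*:"*|"[::]:"*)    echo "all interfaces" ;;
    *)                           echo "specific interface" ;;
  esac
}

# Usage on Linux (ss -tln prints the local address in column 4):
#   bind_scope "$(ss -tln | awk '$4 ~ /:11434$/ {print $4; exit}')"
```

If this prints "loopback only" and your agent runs in Docker, the OLLAMA_HOST change above is the fix.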
Fix 3: Port 11434 is blocked or in use
Symptoms: Ollama starts but immediately fails. Or another process is already on port 11434.
Copy-paste fix:
# Check what's using port 11434
lsof -i :11434 # macOS/Linux
netstat -tlnp | grep 11434 # Linux
# If another process is using it, kill it or change Ollama's port
OLLAMA_HOST=0.0.0.0:11435 ollama serve
If a firewall is blocking the port:
# Linux (ufw)
sudo ufw allow 11434/tcp
# macOS: Check System Settings → Network → Firewall (older versions: System Preferences → Security & Privacy → Firewall)

Fix 4: Agent is using the wrong URL (Docker networking)
Symptoms: Ollama works fine in terminal (ollama run qwen3.6 works). But your agent framework (OpenClaw, Hermes, n8n) returns "fetch failed" or "connection refused."
Why it happens: Your agent is pointing to http://localhost:11434 but it's running inside Docker. Inside Docker, localhost means the container itself, not your host machine.
Copy-paste fix by framework:
OpenClaw/Hermes (running in Docker):
{
  "baseUrl": "http://host.docker.internal:11434/v1"
}
host.docker.internal resolves to the host machine from inside a Docker container. Works on macOS and Windows Docker Desktop. On Linux, use:
# Linux Docker (20.10+): map host.docker.internal to the host gateway
docker run --add-host=host.docker.internal:host-gateway ...
n8n (running in Docker): Set the Ollama URL to http://host.docker.internal:11434 in the n8n Ollama credential node. Our n8n vs Make comparison covers the full agent setup.
Any framework (running natively, not Docker): Use http://localhost:11434 or http://127.0.0.1:11434. If you changed Ollama's host binding (Fix 2), keep using localhost: 0.0.0.0 is an address to listen on, not one to connect to.
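If the same agent sometimes runs natively and sometimes in Docker, you can pick the URL at startup instead of hard-coding it. A minimal sketch, assuming the container has a /.dockerenv file (a common Docker heuristic, not a guarantee on every runtime); ollama_base_url is a hypothetical helper name.

```shell
# Hypothetical helper: pick the Ollama base URL depending on where the agent runs.
# Assumption: Docker creates /.dockerenv inside containers.
# The optional argument overrides the marker path, which is handy for testing.
ollama_base_url() {
  marker="${1:-/.dockerenv}"
  if [ -e "$marker" ]; then
    echo "http://host.docker.internal:11434"
  else
    echo "http://localhost:11434"
  fi
}

# Usage:
#   curl "$(ollama_base_url)/api/tags"
```

On Linux Docker this still depends on the --add-host flag shown above, since host.docker.internal isn't defined there by default.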
Fix 5: HTTPS vs HTTP mismatch
Symptoms: "fetch failed" with an SSL/TLS error in the logs. Or the agent framework requires HTTPS but Ollama serves HTTP.
Why it happens: Ollama serves HTTP by default (no SSL). Some agent frameworks or reverse proxies expect HTTPS.
Copy-paste fix: Use http:// not https:// in your Ollama URL. If you need HTTPS (for remote access), put a reverse proxy (Nginx, Caddy) in front of Ollama:
# Caddy (simplest HTTPS reverse proxy)
# In Caddyfile:
ollama.yourdomain.com {
    reverse_proxy localhost:11434
}
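For the common case, where an https:// prefix sneaks into a local URL, a guard like the following can catch it before the agent starts. This is a minimal sketch: force_http_for_local is a hypothetical helper, and the list of "local" hostnames is an assumption based on the URLs used on this page.

```shell
# Hypothetical helper: downgrade https:// to http:// for local Ollama hosts only.
# Remote hosts (for example, one behind a reverse proxy) are left untouched.
force_http_for_local() {
  url="$1"
  case "$url" in
    https://*)
      rest="${url#https://}"   # strip the scheme
      host="${rest%%[:/]*}"    # keep everything before the first ":" or "/"
      case "$host" in
        localhost|127.0.0.1|host.docker.internal) url="http://$rest" ;;
      esac ;;
  esac
  echo "$url"
}

# Usage:
#   OLLAMA_URL="$(force_http_for_local "$OLLAMA_URL")"
```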
Fix 6: Model too large for available memory (silent OOM)
Symptoms: Ollama starts. The model begins loading. Then... silence. No response. Eventually: timeout. Or the model loads but inference is impossibly slow (under 1 tok/s).
Why it happens: The model's memory requirements exceed your available RAM. Ollama doesn't always error clearly. It just hangs or becomes unresponsive.
Copy-paste fix:
# Check what's loaded and how much memory it's using
ollama ps
# If the model is too large, switch to a lighter model or a lower-bit quant
ollama pull qwen3.6:35b-a3b # MoE, 3B active, fits in 8-16 GB
# Instead of:
# ollama pull qwen3.6:27b # Dense 27B, needs 24+ GB
See our Qwen 3.6 Ollama setup guide for the complete hardware requirements table by RAM tier.
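Before pulling a model, you can sanity-check it against the memory you actually have free. A rough sketch only: fits_in_memory is a hypothetical helper, and the 20% headroom for KV cache and runtime overhead is an assumption, not an Ollama rule. Take the model size from the SIZE column of ollama list.

```shell
# Hypothetical helper: does a model of model_mb fit in avail_mb of free memory?
# Assumption: reserve 20% on top of the weights for KV cache and overhead.
fits_in_memory() {
  model_mb="$1"
  avail_mb="$2"
  needed_mb=$(( model_mb * 12 / 10 ))
  [ "$needed_mb" -le "$avail_mb" ]
}

# Usage on Linux (column 7 of `free -m` is "available"); 17000 is an example size:
#   avail="$(free -m | awk '/^Mem:/ {print $7}')"
#   fits_in_memory 17000 "$avail" && echo "should fit" || echo "too large"
```

Larger context windows need more headroom than this, so treat a pass as "worth trying", not as a guarantee.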
Fix 7: Context window overflow causes timeout
Symptoms: First few messages work fine. Then after 5-10 messages, responses stop or time out. No error. Just... waiting.
Why it happens: The conversation grew beyond the model's allocated context window. Ollama tries to process all tokens and runs out of memory or slows to a crawl.
Copy-paste fix:
# Set a reasonable context window in your Modelfile
# Don't use the maximum, use what you need
cat > agent.modelfile << 'EOF'
FROM qwen3.6:35b-a3b
PARAMETER num_ctx 32768
PARAMETER num_predict 2048
EOF
ollama create agent -f agent.modelfile
Our context window management guide covers how to prevent context overflow and why the default settings are too small for agent tasks.
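The two parameters above imply a budget: whatever num_predict reserves for the reply is no longer available for the conversation. The arithmetic can be sketched like this. The helper names are hypothetical, and the 4-characters-per-token estimate is a rough assumption for English text, not how the tokenizer actually counts.

```shell
# Tokens left for the conversation once the reply is reserved.
# $1 = num_ctx, $2 = num_predict
prompt_budget() {
  echo $(( $1 - $2 ))
}

# Very rough token estimate. Assumption: about 4 characters per token.
approx_tokens() {
  chars=$(printf '%s' "$1" | wc -c)
  echo $(( chars / 4 ))
}

# Succeeds if the history should be trimmed or summarized before the next call.
# $1 = conversation text, $2 = num_ctx, $3 = num_predict
needs_trim() {
  [ "$(approx_tokens "$1")" -gt "$(prompt_budget "$2" "$3")" ]
}

# Usage:
#   needs_trim "$history" 32768 2048 && echo "trim the oldest messages first"
```

With the values from the Modelfile above, the conversation gets 30720 tokens before replies start getting squeezed.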

Fix 8: Ollama version mismatch (model not found)
Symptoms: ollama run modelname returns "model not found" even though you pulled it. Or the model exists in ollama list but won't load.
Copy-paste fix:
# Check your version
ollama --version
# Update Ollama
# macOS: The app auto-updates, or reinstall from ollama.com
# Linux:
curl -fsSL https://ollama.com/install.sh | sh
# Re-pull the model after updating
ollama pull qwen3.6:35b-a3b
Newer models sometimes require newer Ollama versions. If a model was released after your Ollama version, your build may not support its architecture or GGUF metadata, so the model fails to load.
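If you script your setup, you can check the version before pulling. A minimal sketch: version_lt is a hypothetical helper, it assumes your sort supports -V (GNU coreutils and recent macOS do), and the usage lines assume the version number is the last word on the first line of ollama --version. MIN_VERSION is a placeholder; take the real minimum from the model's page.

```shell
# Succeeds if version $1 is strictly older than version $2.
# Assumption: `sort -V` (version sort) is available.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage (MIN_VERSION is a placeholder you set yourself):
#   installed="$(ollama --version | head -n 1 | awk '{print $NF}')"
#   version_lt "$installed" "$MIN_VERSION" && echo "update Ollama first"
```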
Fix 9: Hermes/OpenClaw specific connection issues
Symptoms: Hermes or OpenClaw can't connect to Ollama, even though curl and ollama run work fine from the terminal.
Why it happens: Hermes and OpenClaw use the OpenAI-compatible endpoint format. They need /v1 appended to the base URL. Or the env variable is set wrong.
Copy-paste fix for Hermes:
# In ~/.hermes/.env
OLLAMA_BASE_URL=http://localhost:11434
# Or if Hermes is in Docker:
OLLAMA_BASE_URL=http://host.docker.internal:11434
For more on Hermes connection errors, see our error 400 diagnostic guide.
Copy-paste fix for OpenClaw:
{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://127.0.0.1:11434/v1"
      }
    }
  }
}
Note the /v1 at the end. OpenClaw uses the OpenAI-compatible endpoint, not Ollama's native API.
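A missing or doubled /v1 is easy to introduce when URLs come from env variables. A small normalizer avoids both. This is a minimal sketch; with_v1 is a hypothetical helper, and the usage line assumes your Ollama version exposes the OpenAI-compatible /v1/models route.

```shell
# Hypothetical helper: make sure a base URL ends in exactly one /v1.
with_v1() {
  url="${1%/}"   # drop a single trailing slash
  case "$url" in
    */v1) echo "$url" ;;
    *)    echo "$url/v1" ;;
  esac
}

# Usage: smoke-test the OpenAI-compatible endpoint before starting the agent
#   curl "$(with_v1 http://127.0.0.1:11434)/models"
```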

If debugging Ollama connection issues, Docker networking, port binding, and framework-specific URL formats is more infrastructure work than you want to maintain, BetterClaw handles model connections at the platform level. Connect your API key via BYOK. No Ollama to manage. No Docker networking to debug. 28+ providers supported. Free plan with every feature. $19/month per agent on Pro.
The quick reference (screenshot this)

| Error | Most likely cause | Fix |
|---|---|---|
| fetch failed | Ollama not running | ollama serve & or restart the app |
| connection refused | Wrong bind address | Set OLLAMA_HOST=0.0.0.0:11434 |
| connection refused (Docker) | localhost vs host.docker.internal | Use host.docker.internal:11434 |
| timeout after messages | Context window overflow | Set num_ctx 32768 in Modelfile |
| timeout on load | Model too large | Use smaller quant (Q4 instead of Q8) |
| model not found | Outdated Ollama | Update and re-pull |
| SSL/TLS error | HTTPS vs HTTP | Use http:// not https:// |
| port in use | Another process on 11434 | lsof -i :11434 and kill or change port |
| Hermes/OpenClaw can't connect | Missing /v1 or wrong env var | Append /v1 to baseUrl |
The honest truth about running local models as agent backends: connection errors are part of the deal. Ollama is excellent software. But it runs on your machine, depends on your network configuration, and interacts with your Docker setup, your firewall, your sleep settings, and your memory limits. Every one of these is a potential failure point.
For agents that need to run reliably 24/7, the question isn't "can I fix this error?" It's "do I want to keep fixing these errors?"
Give BetterClaw a look if you'd rather build agents than debug connection strings. Free plan with 1 agent and every feature. $19/month per agent for Pro. 200+ verified skills. 28+ model providers via BYOK with zero markup. We handle the connections. You handle the agent logic.
Frequently Asked Questions
Why does Ollama say "fetch failed"?
"Fetch failed" means the client (your agent framework, browser, or CLI tool) couldn't establish a connection to Ollama's HTTP server. The three most common causes: Ollama service isn't running (Fix 1), the server is bound to localhost but the client is in Docker (Fix 4), or the port is blocked by a firewall (Fix 3). Run curl http://localhost:11434/api/tags to check if the server is responding.
How do I fix Ollama connection refused in Docker?
Inside a Docker container, localhost refers to the container itself, not the host machine. Replace http://localhost:11434 with http://host.docker.internal:11434 in your agent framework's config. On Linux Docker, add --add-host=host.docker.internal:host-gateway to your docker run command. Also ensure Ollama is bound to 0.0.0.0:11434 (not just 127.0.0.1) by setting OLLAMA_HOST=0.0.0.0:11434.
Why does Ollama time out after a few messages?
Conversation context grows with each message. By message 10-15, the accumulated tokens may exceed Ollama's allocated context window, causing it to hang or become extremely slow. Fix: set PARAMETER num_ctx 32768 in your Modelfile to give the model enough context. Also set PARAMETER num_predict 2048 to cap output length. If the model is too large for your RAM, switch to a smaller quantization.
How do I connect Hermes or OpenClaw to Ollama?
For Hermes: set OLLAMA_BASE_URL=http://localhost:11434 in ~/.hermes/.env. For OpenClaw: set baseUrl to http://127.0.0.1:11434/v1 in your model provider config (note the /v1 suffix, required for OpenAI-compatible endpoints). If either runs in Docker, use host.docker.internal instead of localhost.
Should I use Ollama or a cloud API for agent backends?
Ollama is ideal for development, testing, privacy-sensitive workloads, and high-volume inference where API costs compound. Cloud APIs (via BYOK on platforms like BetterClaw) are better for production reliability (no connection errors, no sleep/wake issues, no port debugging), access to frontier models, and 24/7 uptime. Many teams use Ollama for development and cloud APIs for production.




