# Local llama.cpp, Ollama, and vLLM tool-calling compatibility
Understand why local-model servers can chat normally but still fail on agent tool calling, tool-result continuation, or OpenAI-compatible multi-turn behavior in OpenClaw.
## Symptoms
- Local chat works, but agent tool calling is flaky or broken.
- The first message may work, but later turns fail after a tool runs.
- You may see template errors such as `Unexpected message role`.
- The model may emit raw tool JSON into `content` instead of participating in a clean tool-calling loop.
## Cause
With local-model servers, there are usually three separate compatibility layers:
- the model itself,
- the server API layer,
- the OpenAI-compatible wrapper or chat template.
Users often treat these three layers as a single thing, but each fails in its own way.
Examples:
- `llama.cpp` may reject tool-related message roles because the chat template does not understand them.
- Ollama's native API and Ollama's `/v1` compatibility mode do not have the same expectations or stability for tools.
- vLLM may support basic OpenAI-compatible chat but still differ on tool-calling details and later-turn behavior.
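These layers can be probed independently of OpenClaw with a direct request. A minimal sketch, assuming a local server exposing the OpenAI-compatible `/v1/chat/completions` route; the base URL, model name, and `get_weather` tool are illustrative, not OpenClaw defaults:

```python
import json
import urllib.request

def build_tool_probe(model: str) -> dict:
    """A minimal OpenAI-style chat request that declares one tool.

    A server or chat template that cannot represent tools will often
    fail on this request even though plain chat succeeds.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool, probe only
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

def send(base_url: str, payload: dict) -> dict:
    """POST the probe to an OpenAI-compatible endpoint and decode JSON."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `send("http://localhost:8000/v1", build_tool_probe("your-model"))`: if plain chat succeeds but this probe returns an HTTP 400 mentioning roles or templates, the wrapper layer, not the model, is the suspect.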
## Fix
1) Separate selection problems from runtime problems
First prove OpenClaw is actually using the local model you expect:
```
openclaw models status --probe
```
If this resolves to the wrong provider, fix selection/config first.
If it resolves to the correct local provider but real runs still fail, you are now debugging tool-calling compatibility.
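To double-check the selection result from outside OpenClaw, you can ask the local server directly which models it is serving. A sketch assuming the standard OpenAI-compatible `/v1/models` route (the base URL in the usage note is illustrative):

```python
import json
import urllib.request

def parse_model_ids(models_response: dict) -> list:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [m["id"] for m in models_response.get("data", [])]

def list_served_models(base_url: str) -> list:
    """Fetch and parse the server's own model list."""
    with urllib.request.urlopen(f"{base_url}/models") as resp:
        return parse_model_ids(json.load(resp))
```

If, say, `list_served_models("http://localhost:11434/v1")` does not include the model OpenClaw resolved to, the problem is selection/config, not tool calling.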
2) Ask whether the failure is at the model, server, or wrapper layer
Use this mental split:
- model problem: the model never produces reliable tool calls,
- server problem: the backend cannot represent tool turns or later tool results correctly,
- wrapper/template problem: the server tries to map messages into a chat template that rejects tool roles.
That distinction prevents a lot of blind retrying.
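The three-way split above can be turned into three escalating probes. A sketch using OpenAI-style message shapes; the tool name and call id are made up for the probe:

```python
def layer_probes(model: str) -> dict:
    """Three escalating requests: if probe N fails while probe N-1
    succeeds, the failure sits at the layer probe N introduces."""
    plain = [{"role": "user", "content": "Say hello."}]
    # An assistant turn that contains a tool call: exercises the
    # server's ability to represent tool turns at all.
    with_call = plain + [{
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",  # made-up call id
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }],
    }]
    # A tool-result turn: exercises the chat template's handling of the
    # "tool" role, the most common later-turn failure point.
    with_result = with_call + [{
        "role": "tool",
        "tool_call_id": "call_1",
        "content": '{"temp_c": 18}',
    }]
    return {
        "plain_chat": {"model": model, "messages": plain},
        "tool_turn": {"model": model, "messages": with_call},
        "tool_result": {"model": model, "messages": with_result},
    }
```

Sending these in order maps a vague "tool calling is flaky" report onto a specific layer: model, server, or wrapper/template.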
3) Prefer the most native path when you need tool reliability
If a local stack offers both:
- a native server API, and
- an OpenAI-compatible `/v1` path,
assume the native path is the stronger default for advanced agent behavior unless you have already proven the `/v1` path works for multi-turn tool use.
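With Ollama, for example, the same local server is reachable two ways (the port and paths below are Ollama's documented defaults; adjust if you changed them). A sketch that encodes the rule above; `pick_endpoint` is an illustrative helper, not part of any API:

```python
# Two routes into the same local Ollama server:
NATIVE_CHAT = "http://localhost:11434/api/chat"               # native API
OPENAI_COMPAT = "http://localhost:11434/v1/chat/completions"  # /v1 shim

def pick_endpoint(proven_v1_tool_use: bool) -> str:
    """Prefer the native path unless the /v1 path has already been
    proven for multi-turn tool use."""
    return OPENAI_COMPAT if proven_v1_tool_use else NATIVE_CHAT
```

The design point is that the `/v1` shim is an extra translation layer; every message shape it cannot express is a potential failure the native API avoids.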
4) If the server rejects tool-related roles, do not treat it as a generic auth issue
Template or tool-role failures are usually not caused by API keys, provider selection, or `baseUrl` typos.
They are evidence that the compatibility layer cannot represent OpenClaw’s full conversation state correctly.
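A rough triage sketch for telling the two failure classes apart; the matched phrases are examples, not an exhaustive list of server error strings:

```python
def classify_failure(status: int, body: str) -> str:
    """Heuristic triage of a failed chat-completions response."""
    text = body.lower()
    if status in (401, 403):
        # Rejected before the payload mattered: keys, auth, provider.
        return "auth/config"
    if "role" in text or "template" in text or "tool" in text:
        # The payload itself was rejected: compatibility-layer problem.
        return "wrapper/template: cannot represent the conversation"
    return "inconclusive: re-run the layer probes"
```

For instance, a 400 whose body says `Unexpected message role` classifies as a wrapper/template problem, not something a new API key will fix.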
## Verify
You have reached the right diagnosis if:
- plain chat works,
- failures appear specifically when tool turns enter the session,
- and the problem correlates with server/template behavior rather than with auth or networking.
## Related
- Ollama model-selection and `/v1` boundary guide: /troubleshooting/solutions/ollama-configured-but-falls-back-to-anthropic
- Relay/runtime payload mismatch guide: /guides/openclaw-relay-and-api-proxy-troubleshooting
- If your provider works only in minimal tests: /troubleshooting/solutions/api-works-in-curl-but-openclaw-fails