OpenClaw + OpenAI: How to Use GPT Models with Your Agents

OpenClaw defaults to Anthropic's Claude models, but it works with OpenAI's GPT models too. If you already have an OpenAI API key, or if a specific task works better with GPT-4o or o3, swapping models takes about 30 seconds.
This guide covers the full setup — API keys, model selection, config files, cost comparison, and when each provider actually makes sense.
Why Use OpenAI Models with OpenClaw?
Claude is the default for good reason — OpenClaw is built by Anthropic, and the agentic loop is optimized for Claude's tool-use patterns. But there are legitimate reasons to reach for OpenAI:
- GPT-4o has strong vision capabilities and faster response times for certain tasks
- o1 and o3 reasoning models excel at math-heavy and logic-intensive problems
- Cost arbitrage — for simple tasks, GPT-4o-mini can be significantly cheaper than Claude Sonnet
- Existing billing — if your company already has an OpenAI Enterprise agreement
OpenClaw's agent framework is model-agnostic. The harness (tools, file access, shell commands) stays the same regardless of which model is doing the thinking.
Setting Up Your OpenAI API Key
First, grab your API key from platform.openai.com. Then set it as an environment variable:
```shell
# Add to your shell profile (~/.zshrc or ~/.bashrc)
export OPENAI_API_KEY=sk-proj-your-key-here

# Reload your shell
source ~/.zshrc
```
Verify it's set:
```shell
echo $OPENAI_API_KEY  # should print your key
```
You can also set it per-session if you don't want it permanently in your profile. Just export it before launching OpenClaw.
Configuring OpenClaw to Use GPT Models
OpenClaw reads model configuration from its settings file. You can set this globally or per-project.
Global Configuration
Set the global default from the CLI:

```shell
# Set the default model for all projects
openclaw config set model openai:gpt-4o
```
Or edit ~/.openclaw/config.json directly:
```json
{
  "model": "openai:gpt-4o",
  "apiKeys": {
    "openai": "sk-proj-your-key-here"
  }
}
```
Per-Project Configuration
Create a .openclaw/config.json in your project root:
```json
{
  "model": "openai:gpt-4o"
}
```
Project-level config overrides global config. This is useful when different repos need different models.
Switching Models on the Fly
You can also specify the model per-command without changing any config:
```shell
# Use GPT-4o for a single task
openclaw --model openai:gpt-4o "analyze this CSV and create a summary report"

# Use o3 for a reasoning-heavy task
openclaw --model openai:o3 "find the bug in this algorithm and prove the fix is correct"

# Use GPT-4o-mini for a simple task
openclaw --model openai:gpt-4o-mini "rename all files in this directory to kebab-case"
```
Available OpenAI Models
Here's what works with OpenClaw and when to use each:
| Model | Best For | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Tool Use |
| --- | --- | --- | --- | --- |
| gpt-4o | General tasks, vision, fast responses | $2.50 | $10.00 | Excellent |
| gpt-4o-mini | Simple tasks, high volume | $0.15 | $0.60 | Good |
| o1 | Complex reasoning, math | $15.00 | $60.00 | Limited |
| o3 | Advanced reasoning, research | $10.00 | $40.00 | Good |
| o3-mini | Reasoning on a budget | $1.10 | $4.40 | Good |
For comparison, Claude Sonnet 4 costs $3/$15 per million tokens (input/output). Claude Haiku runs $0.25/$1.25.
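Using the rates above, you can estimate a single task's cost before running it. A minimal sketch in awk — the 50K-input / 5K-output task size is an assumption for illustration, not a measured average:

```shell
# Estimate one task's cost at 50K input / 5K output tokens,
# using the per-million-token rates from the table above
awk 'BEGIN {
  in_tok = 50000; out_tok = 5000
  printf "gpt-4o:        $%.4f\n", in_tok/1e6 * 2.50 + out_tok/1e6 * 10.00
  printf "gpt-4o-mini:   $%.4f\n", in_tok/1e6 * 0.15 + out_tok/1e6 * 0.60
  printf "claude-sonnet: $%.4f\n", in_tok/1e6 * 3.00 + out_tok/1e6 * 15.00
}'
```

At this task size the mini model comes in around $0.01 versus roughly $0.23 for Sonnet — a 20x gap, which is why matching model to task matters.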
When to Use OpenAI vs Anthropic
After running hundreds of agentic tasks across both providers, here's where each shines:
Use Claude (default) when:
- Writing or editing code — Claude's code generation is consistently more careful with existing codebases
- Multi-step file operations — Claude follows OpenClaw's tool-use patterns more reliably
- Long-context tasks — Claude's 200K context window handles large codebases better
- You need the agent to ask clarifying questions instead of guessing
Use GPT-4o when:
- Vision-heavy tasks — analyzing screenshots, diagrams, or UI mockups
- Quick one-shot tasks where speed matters more than depth
- You need web browsing with strong summarization
- Batch processing simple, repetitive operations
Use o3 when:
- Mathematical proofs or algorithm design
- Tasks that require multi-step logical reasoning
- Code review where you want the model to think deeply about edge cases
- Research synthesis across multiple documents
Practical Examples
Using GPT-4o for Screenshot Analysis
```shell
openclaw --model openai:gpt-4o "look at the screenshot in ./bug-report.png and identify what UI element is broken, then fix the CSS"
```
GPT-4o parses the screenshot and identifies the broken element; OpenClaw's file tools then apply the CSS fix.
Using o3 for Algorithm Optimization
```shell
openclaw --model openai:o3 "the sorting function in src/utils/sort.js is O(n²) — refactor it to O(n log n) and add benchmarks proving the improvement"
```
The o3 reasoning model takes longer but produces more thorough algorithmic analysis.
Using GPT-4o-mini for Bulk Operations
```shell
openclaw --model openai:gpt-4o-mini "add JSDoc comments to every exported function in the src/lib/ directory"
```
For repetitive, well-defined tasks, mini models save significant cost without sacrificing quality.
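For large batches, one pattern is to drive OpenClaw from a plain shell loop so each file gets its own cheap, fresh session. A sketch using the `--model` flag shown earlier — the glob and prompt are illustrative:

```shell
# One mini-model session per file keeps context small and cost predictable
for f in src/lib/*.js; do
  openclaw --model openai:gpt-4o-mini "add JSDoc comments to exported functions in $f"
done
```

Per-file sessions also isolate failures: if one file trips the agent up, the rest of the batch is unaffected.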
Cost Comparison: A Real Workday
Here's what a typical day of agent usage costs across providers, assuming ~50 tasks:
| Task Type | Count | Claude Sonnet | GPT-4o | GPT-4o-mini |
| --- | --- | --- | --- | --- |
| Code edits | 20 | $4.80 | $3.20 | $0.48 |
| File analysis | 15 | $2.70 | $1.80 | $0.27 |
| Research/browsing | 10 | $3.60 | $2.40 | $0.36 |
| Complex reasoning | 5 | $4.50 | $3.00 | $0.45 |
| Daily total | 50 | $15.60 | $10.40 | $1.56 |
The cheapest option isn't always the best. GPT-4o-mini will fumble complex tasks that Claude Sonnet handles cleanly. Match the model to the task complexity.
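One way to make "match the model to the task" concrete is a small routing helper in your shell profile. This is a sketch, not an OpenClaw feature — the complexity tags and the `anthropic:` prefix are assumptions; check your install's provider-prefix conventions:

```shell
# Map a rough complexity tag to a model string for the --model flag.
# Tags and the anthropic: prefix are illustrative assumptions.
pick_model() {
  case "$1" in
    bulk)      echo "openai:gpt-4o-mini" ;;        # cheap, repetitive work
    vision)    echo "openai:gpt-4o" ;;             # screenshots, diagrams
    reasoning) echo "openai:o3" ;;                 # proofs, algorithms
    *)         echo "anthropic:claude-sonnet-4" ;; # default: code edits
  esac
}

pick_model bulk   # → openai:gpt-4o-mini
```

You would then invoke `openclaw --model "$(pick_model bulk)" "..."` and adjust the mapping as your cost data comes in.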
Multi-Model Workflows
The power move is using different models for different parts of your workflow. Some teams set up OpenClaw skills that specify the model per-skill:
In `.openclaw/skills/quick-fix.md`:

```markdown
---
model: openai:gpt-4o-mini
---

Fix the specified bug with minimal changes. Run tests after.
```
In `.openclaw/skills/deep-review.md`:

```markdown
---
model: openai:o3
---

Review the code changes for logical errors, edge cases, and security issues.
```
This way, cheap models handle routine work while expensive models focus on tasks that justify the cost.
Troubleshooting
"Invalid API key" error: Make sure you're using a project key (starts with sk-proj-) not a legacy key. Check that the environment variable is exported, not just set.
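The exported-vs-set distinction trips people up because the variable looks fine in your own shell. A quick way to see the difference, in plain POSIX sh with no OpenClaw required:

```shell
# A variable that is set but not exported is invisible to child processes
DEMO_KEY=sk-proj-demo
sh -c 'echo "child sees: [$DEMO_KEY]"'   # prints: child sees: []

# After export, child processes (including openclaw) inherit it
export DEMO_KEY
sh -c 'echo "child sees: [$DEMO_KEY]"'   # prints: child sees: [sk-proj-demo]
```

OpenClaw runs as a child of your shell, so it only ever sees exported variables.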
Model not found: Use the full provider prefix — openai:gpt-4o, not just gpt-4o. OpenClaw needs to know which API endpoint to hit.
Rate limits: OpenAI's rate limits are per-organization. If you hit them, either upgrade your usage tier on platform.openai.com or add request delays in your OpenClaw config.
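If you go the delay route, the config fragment might look like the sketch below. Note that `requestDelayMs` is an assumed key name for illustration, not a documented OpenClaw option — check your version's config reference for the real setting:

```json
{
  "model": "openai:gpt-4o",
  "requestDelayMs": 500
}
```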
Tool use failures with o1: The o1 model has limited tool-use support. If your task requires heavy file operations or shell commands, stick with gpt-4o or Claude.
For teams managing multiple agents with different model configurations, RunAgents lets you assign models per-agent from a dashboard — no config files to juggle.
Frequently Asked Questions
Can I use OpenAI and Anthropic models in the same OpenClaw session?
Not within a single session, but you can switch between sessions. Each OpenClaw invocation uses one model. Use the --model flag to pick per-task, or set up skills with different model defaults.
Does OpenClaw work with the OpenAI Assistants API?
No. OpenClaw uses OpenAI's Chat Completions API directly. It manages its own agentic loop, tool calls, and context — it doesn't rely on OpenAI's assistant or thread abstractions.
Is function calling different between Claude and GPT models in OpenClaw?
OpenClaw abstracts this away. Its tool-use system translates to the correct format for each provider. You write the same commands regardless of which model is running underneath.
Will my OpenClaw skills work with OpenAI models?
Yes. Skills are model-agnostic instructions. The same skill file works with Claude, GPT-4o, or any other supported model. Performance may vary — some skills work better with certain models.
What happens to my conversation history when I switch models?
Each OpenClaw session is independent. Switching models between sessions doesn't carry over context. If you need continuity, use OpenClaw's session resume feature within the same model.
Is there a cost dashboard to track spending across providers?
OpenClaw itself doesn't have one, but ClawMetry provides local token and cost tracking. For team-wide visibility across multiple agents and providers, RunAgents includes per-agent cost dashboards.
Want multi-model agents managed from one dashboard? RunAgents gives you managed OpenClaw hosting with task management, team collaboration, and agent debugging built in. Get started free
Related Guides
What Is OpenClaw? -- Framework overview
OpenClaw Skills Guide -- Extend your agents
Run OpenClaw in Docker -- Containerized setup