openai-codex Python SDK: Async, Streaming, and CI/CD Automation

Practical patterns for async, streaming, and headless auth using openai-codex 0.1.0b2 in CI/CD pipelines.

a full response and issuing a corrective follow-up.

"Renamed AsyncTurn to AsyncTurnHandle to avoid collision with the generated Turn model. Canonical app-server-generated models are now exposed directly, replacing earlier custom wrapper types. Twenty-five example scripts added covering quickstart, multimodal input, streaming, turn controls, and retry patterns." — @shaqayeq-oai, PR #14446, reviewed by @owenlin0, merged March 17, 2026

import asyncio
from openai_codex import AsyncCodex

async def stream_with_mid_turn_control():
    async with AsyncCodex() as codex:
        thread = await codex.thread_start()

        # run_streaming() returns an AsyncTurnHandle — not a TurnResult
        handle = await thread.run_streaming(
            "Generate comprehensive unit tests for the payment module"
        )

        word_count = 0
        steered = False
        async for chunk in handle.stream():
            print(chunk, end="", flush=True)
            # Whitespace-separated words are only a rough proxy for tokens
            word_count += len(chunk.split())

            # Inject mid-turn guidance once if output drifts too broad
            if word_count > 500 and not steered:
                await handle.steer("Focus on edge cases for process_refund() only")
                steered = True

        # Await the completed TurnResult after streaming finishes
        result = await handle.result()
        return result.final_response

Streaming surfaces partial output as it arrives — attach progress callbacks for interactive tasks or long-running generation where a user is watching. For unattended batch jobs (nightly test-fix loops, scheduled documentation refreshes), the blocking thread.run() form is simpler and avoids async overhead. The decision is straightforward: use streaming when latency or interactivity matters; use the blocking form when you only need the final output.

The 25 example scripts shipped with the SDK repository cover quickstart, multimodal input, streaming, turn controls, and retry patterns. Check those before writing custom wrappers — back-pressure handling and error recovery are already implemented there. The .interrupt() method is particularly useful in token-budget-sensitive pipelines: if a turn exceeds an expected cost threshold, halt it and log the partial output rather than letting it run to completion.
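The budget check described above can be sketched as a small helper. This is a minimal sketch rather than the SDK's own implementation: it assumes the handle's stream() yields plain string chunks, as in the streaming example, and that interrupt() is awaitable. The word-count budget and the FakeHandle stand-in are hypothetical and exist only so the snippet runs without a live session.

```python
import asyncio


class FakeHandle:
    """Hypothetical stand-in for an AsyncTurnHandle; not part of the SDK."""

    def __init__(self, chunks):
        self._chunks = chunks
        self.interrupted = False

    async def stream(self):
        for chunk in self._chunks:
            if self.interrupted:
                return
            yield chunk

    async def interrupt(self):
        self.interrupted = True


async def stream_with_budget(handle, max_words):
    """Collect streamed text and interrupt the turn once max_words is exceeded.

    Returns (partial_text, was_interrupted). Word count is a rough proxy
    for tokens, not an exact cost measure.
    """
    parts = []
    words = 0
    async for chunk in handle.stream():
        parts.append(chunk)
        words += len(chunk.split())
        if words > max_words:
            await handle.interrupt()  # assumed to be awaitable
            return "".join(parts), True
    return "".join(parts), False


if __name__ == "__main__":
    demo = FakeHandle(["one two ", "three four ", "five six "])
    text, stopped = asyncio.run(stream_with_budget(demo, max_words=3))
    print(repr(text), stopped)
```

Returning the partial text alongside the flag lets the caller log what was produced before the halt instead of discarding it.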

Headless Authentication for CI Environments

The SDK supports four authentication modes, ranging from automatic credential reuse to fully headless key injection. First-class auth support landed in Codex 0.132.0 on May 20, 2026; versions before 0.132.0 required manual credential pre-configuration via the Codex desktop app before any programmatic SDK use. For CI/CD, only one mode is viable: login_api_key(key).

import os
from openai_codex import Codex

# CI-safe headless auth: inject key from environment variable
codex = Codex()
codex.login_api_key(os.environ["CODEX_API_KEY"])

thread = codex.thread_start()
result = thread.run("Review the diff in CHANGED_FILES and flag any security regressions")
print(result.final_response)

In GitHub Actions, store the key as a repository secret (CODEX_API_KEY) and expose it to the job via the env: block. Never hardcode it in workflow YAML or source files. The login_chatgpt() mode returns a browser auth_url for redirect; login_chatgpt_device_code() returns a verification_url and user_code for manual entry — both require a human in the loop and cannot be automated in headless runner environments. Credential reuse works on a developer workstation where an existing Codex session is already present, but is unreliable in ephemeral CI containers.
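A missing secret is easier to diagnose when the job fails before any SDK call is made. The helper below is a minimal sketch of that check; require_api_key is a hypothetical name, not an SDK function. It treats an empty value as missing because GitHub Actions does not pass secrets to workflows triggered from forks, which leaves the variable blank rather than unset.

```python
import os


def require_api_key(env=None, name="CODEX_API_KEY"):
    """Return the key from the environment, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; define it as a repository secret and "
            "expose it through the job's env: block"
        )
    return key
```

The result can then be passed straight to the login call shown above, for example codex.login_api_key(require_api_key()).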

openai-codex authentication modes compared
Auth mode | Flow type | Headless-compatible | Token lifecycle | Best for
Credential reuse | Automatic (existing session) | Conditional (requires pre-auth) | Matches existing Codex session | Developer workstation scripts
login_chatgpt() | Browser redirect (returns auth_url) | No | Short-lived browser session token | Interactive local use
login_chatgpt_device_code() | Device code (verification_url + user_code) | No | Short-lived device-code token | Browserless devices, manual flows
login_api_key(key) | Direct key injection | Yes | Controlled by key issuer; revocable | CI/CD, automated pipelines

CI/CD Integration Patterns and Dependency Pinning

[Figure: Codex app showing a browser comment on a local web app preview. Source: openai.com]

Two patterns dominate SDK-based CI integration: PR review automation and approval-gated auto-fix workflows. The PR review pattern creates a Codex thread on PR open, runs a review prompt against the diff, and posts TurnResult.final_response as a comment via the GitHub API. The approval-gated pattern routes proposed file changes through a human gate before applying them to the branch, preventing automated pipelines from writing to protected branches without explicit review.

A minimal PR review step using GitHub Actions:

# .github/workflows/codex-review.yml
name: Codex PR Review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write   # required for gh to post the review comment
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }   # full history so origin/main...HEAD resolves
      - uses: actions/setup-python@v5
        with: { python-version: "3.12" }
      - run: pip install openai-codex==0.1.0b2
      - name: Run Codex review
        env:
          CODEX_API_KEY: ${{ secrets.CODEX_API_KEY }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: python scripts/pr_review.py

# scripts/pr_review.py
import os
import subprocess
from openai_codex import Codex

diff = subprocess.check_output(["git", "diff", "origin/main...HEAD"], text=True)

codex = Codex()
codex.login_api_key(os.environ["CODEX_API_KEY"])

thread = codex.thread_start()
result = thread.run(
    f"Review this pull request diff for correctness, security, and style.\n\n{diff}"
)

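# Post the review as a PR comment via the gh CLI. On pull_request events
# GITHUB_REF has the form refs/pull/<number>/merge, so the PR number is
# passed explicitly instead of relying on branch detection.
pr_number = os.environ["GITHUB_REF"].split("/")[2]
subprocess.run(
    ["gh", "pr", "comment", pr_number, "--body", result.final_response],
    check=True,
)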

For the approval-gated workflow, enable approval mode when creating the thread. Codex proposes file changes and surfaces them as pending; your CI step logs the proposals for the PR author and waits for an explicit approval action before changes are written. This is the correct pattern for any automated agent that touches production branches.
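The article does not show the SDK's approval-mode API, so the sketch below models only the gate itself: proposed changes are held as data and written only when their path was explicitly approved. ProposedChange and apply_approved are hypothetical names introduced for illustration, not SDK types, and how proposals are obtained from a thread is left out.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ProposedChange:
    """Hypothetical record of one pending file change; not an SDK type."""
    path: str
    new_content: str


def apply_approved(changes, approved_paths, root):
    """Write only the changes whose path was explicitly approved.

    Paths that resolve outside root are skipped even if approved.
    Returns (applied, skipped) as lists of paths.
    """
    root = Path(root).resolve()
    applied, skipped = [], []
    for change in changes:
        target = (root / change.path).resolve()
        inside_root = target.is_relative_to(root)
        if change.path in approved_paths and inside_root:
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(change.new_content, encoding="utf-8")
            applied.append(change.path)
        else:
            skipped.append(change.path)
    return applied, skipped
```

Keeping the approval decision as a plain set of paths means the same function can be driven by a PR label, a review comment, or a manual workflow dispatch, whichever the team uses as its approval action.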

Version pinning is non-negotiable for CI stability. The Python SDK's 0.1.0 semver track is independent of the Rust runtime's rolling 0.13x.x numbering: these are two separate version axes, and conflating them will cause unexpected behavior on runtime upgrades. Pin openai-codex==0.1.0b2 explicitly in requirements.txt or pyproject.toml. The companion openai-codex-cli-bin binary must be co-pinned to the same SDK version via PEP 440 mapping; this co-pinning mechanism was introduced in PR #18996, merged April 27, 2026 by @sdcoffey. Install both at matching versions or omit the binary entirely; mismatched installs cause runtime errors at import time.
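A pre-flight check can catch a mismatched install before the SDK is imported. The sketch below uses the standard library's importlib.metadata and assumes that "matching" means identical version strings; the actual PEP 440 mapping between the two packages may be looser, so treat the comparison as an approximation. check_copin and installed_version are hypothetical helper names.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_copin(lookup=installed_version,
                sdk="openai-codex", binary="openai-codex-cli-bin"):
    """Return a list of problems; an empty list means the pin looks consistent."""
    sdk_version = lookup(sdk)
    bin_version = lookup(binary)
    problems = []
    if sdk_version is None:
        problems.append(f"{sdk} is not installed")
    # The binary is optional, so only compare when it is present.
    elif bin_version is not None and bin_version != sdk_version:
        problems.append(
            f"{binary}=={bin_version} does not match {sdk}=={sdk_version}"
        )
    return problems
```

Running check_copin() as an early CI step and failing the job on a non-empty result surfaces the mismatch with a readable message instead of an import-time error.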

One naming hazard worth flagging explicitly: a third-party package called openai-codex-sdk exists on PyPI (maintained by @tomasroda, versions 0.1.0–0.1.11, published December 2025–January 2026). It wraps the Codex binary differently and is not the official package. The correct package name is openai-codex.

Frequently Asked Questions

What Python version does openai-codex require?

Python 3.10 or higher is required. Install with pip install openai-codex. There are no heavy transitive dependencies beyond the SDK package itself; the package is designed to be lightweight and suitable for CI environments where dependency footprint matters. As of the May 28, 2026 beta releases, no optional extras are needed for core functionality, including threads, streaming, and headless authentication.
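A script can guard against an unsupported interpreter before importing the SDK. This is a minimal sketch using only the standard library; python_is_supported is a hypothetical helper name, and the 3.10 floor is taken from the requirement stated above.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated for openai-codex


def python_is_supported(version_info=None):
    """Return True when the interpreter meets the minimum major.minor version."""
    current = sys.version_info if version_info is None else version_info
    return tuple(current[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    if not python_is_supported():
        raise SystemExit("openai-codex requires Python 3.10 or higher")
```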

What is the difference between Codex and AsyncCodex?

Codex is synchronous and blocking: thread.run() blocks the calling thread until the agent returns a TurnResult. AsyncCodex returns coroutines at every step and integrates with asyncio. Use AsyncCodex when you need concurrent turn routing via asyncio.gather, streaming callbacks via AsyncTurnHandle, or non-blocking pipeline steps. For simple scripts where concurrency is not needed, Codex requires less code and avoids event loop setup overhead. Both expose the same thread_start() and thread.run() API surface — the choice is purely about execution context.
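Concurrent turn routing with asyncio.gather can be sketched as follows. The sketch assumes, per the description above, that AsyncCodex exposes awaitable thread_start() and thread.run() calls and that the result carries final_response; whether one client can safely serve many concurrent threads is also an assumption. The Fake* classes are hypothetical stand-ins so the snippet runs without the SDK installed.

```python
import asyncio


async def review_one(codex, path):
    """Start a fresh thread and run one review turn for a single file."""
    thread = await codex.thread_start()
    result = await thread.run(f"Review {path} for bugs and unclear naming")
    return path, result.final_response


async def review_many(codex, paths):
    """Run one review turn per file concurrently; map each path to its response."""
    pairs = await asyncio.gather(*(review_one(codex, p) for p in paths))
    return dict(pairs)


# Hypothetical stand-ins below; replace FakeCodex with a real AsyncCodex client.
class FakeResult:
    def __init__(self, text):
        self.final_response = text


class FakeThread:
    async def run(self, prompt):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return FakeResult(f"ok: {prompt}")


class FakeCodex:
    async def thread_start(self):
        return FakeThread()


if __name__ == "__main__":
    reviews = asyncio.run(review_many(FakeCodex(), ["a.py", "b.py"]))
    for path, text in reviews.items():
        print(path, "->", text)
```

asyncio.gather returns results in the order the awaitables were passed, so the mapping preserves the input file order even when turns finish out of order.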

What does AsyncTurnHandle let you do that TurnResult does not?

AsyncTurnHandle (renamed from AsyncTurn in PR #14446, merged March 17, 2026) is the handle for a turn that is still executing. It exposes .steer() for mid-turn guidance injection and .interrupt() to halt execution before the turn completes. TurnResult is the completed output returned after the turn finishes; it has no control methods. To access the handle, call thread.run_streaming() instead of the blocking thread.run(); the handle is live only while the turn is in progress.

How do I authenticate openai-codex in a GitHub Actions workflow without a browser?

Use codex.login_api_key(key) with the key injected via a GitHub Actions secret. Store the key as a repository secret (e.g., CODEX_API_KEY), expose it via the job's env: block, and read it with os.environ["CODEX_API_KEY"]. The browser-based flows (login_chatgpt(), which returns an auth URL, and login_chatgpt_device_code(), which returns a verification URL and user code) both require human interaction and cannot be automated in headless runner environments. First-class auth support including login_api_key() landed in Codex 0.132.0 on May 20, 2026; versions before 0.132.0 required a pre-existing authenticated Codex session.

How does openai-codex 0.1.0 versioning relate to the Codex runtime version?

The Python SDK's 0.1.0 semver track is independent of the Rust runtime's rolling 0.13x.x numbering. These are separate version axes: the SDK tracks its own API stability, while the runtime tracks its own release cadence. Pin the SDK explicitly (openai-codex==0.1.0b2) and do not infer the runtime version from the SDK version or vice versa. The companion openai-codex-cli-bin binary must be co-pinned to the same SDK version via PEP 440 mapping, a mechanism introduced in PR #18996, merged April 27, 2026. A version mismatch between the SDK and the binary causes runtime errors.

What to Build Next

The SDK's primary value is replacing ad-hoc subprocess wrappers with a typed, maintainable Python interface. The three patterns worth starting with — in order of increasing complexity — are a PR review bot posting TurnResult.final_response as a GitHub comment; a concurrent file reviewer using asyncio.gather across multiple AsyncCodex threads; and an approval-gated auto-fix workflow routing proposed changes through a human gate before application. Each pattern builds directly on the previous one, and thread state persistence makes iterative review-and-fix loops manageable without custom context tracking.

The Experimental label on the official SDK documentation and the b (beta) pre-release segment in the PEP 440 version numbers of both May 28, 2026 releases signal that breaking changes are possible before a stable 1.0. Read the Codex changelog before upgrading, pin your SDK version, and check the 25 example scripts in the repository before writing custom wrappers. For multi-agent orchestration, the MCP server path (codex mcp-server with MCPServerStdio integration) exposes codex and codex-reply tools for session management and is documented separately in the Agents SDK guide.

Track new SDK and runtime releases at the openai/codex releases page. The two version tracks move independently, and new auth modes or turn control primitives may land at either layer without a corresponding bump on the other.

Last updated: 2026-05-28. Article based on openai-codex SDK releases python-v0.1.0b1 and python-v0.1.0b2 published May 28, 2026, and Codex runtime changelog through version 0.132.0.
