Codex Weekly: Security Scanner, Pentagon Fallout, GPT-5.4

Three stories defined the OpenAI week ending March 15: the launch of Codex Security, a vulnerability scanner claiming a 50%+ false-positive reduction; GPT-5.4 hitting the public API with a 1M-token context window and built-in computer use; and a leadership departure over OpenAI’s Pentagon contract that is reverberating across the industry. Also notable: the acquisition of Promptfoo, used by a quarter of the Fortune 500 for LLM red-teaming, and a new Open Source Support Program that explicitly names third-party Codex interfaces.

Codex Security Launches in Research Preview

Codex Security went live March 6, making OpenAI a direct competitor to established SAST and AI-assisted security tools. The agent analyzes repositories, generates editable threat models, sandboxes candidate vulnerabilities to validate proof-of-concept exploits, and proposes code fixes inline.

The beta numbers are plausible on their face: 1.2M commits scanned, 792 critical and 10,561 high-severity issues surfaced, with critical issues appearing in fewer than 0.1% of commits. The headline figure is the claim of more than 50% fewer false positives than conventional tools, which matters because false-positive fatigue is the primary reason security tooling gets ignored in developer workflows.
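The quoted rates are easy to sanity-check. The sketch below treats the rounded 1.2M commit count as exact, which is an assumption, and computes issues per commit. That is only an upper bound on the share of affected commits, since one commit can carry several issues.

```python
# Figures quoted from the Codex Security beta. The 1.2M count is rounded in
# the source, so the resulting rates are approximations.
COMMITS_SCANNED = 1_200_000
CRITICAL_ISSUES = 792
HIGH_ISSUES = 10_561


def rate_percent(issues: int, commits: int) -> float:
    """Issues per commit, expressed as a percentage."""
    return 100.0 * issues / commits


critical_rate = rate_percent(CRITICAL_ISSUES, COMMITS_SCANNED)
high_rate = rate_percent(HIGH_ISSUES, COMMITS_SCANNED)

print(f"critical: {critical_rate:.3f}% of commits")
print(f"high:     {high_rate:.3f}% of commits")
```

The critical rate works out to roughly 0.066%, consistent with the "fewer than 0.1%" claim, while high-severity issues land just under 0.9%.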

The tool is currently free for the first month for ChatGPT Pro, Enterprise, Business, and Edu customers. It evolved from OpenAI’s internal “Aardvark” scanner, and OpenAI has signaled it will expand to Enterprise and Edu plans broadly in the coming weeks. Teams evaluating it should run it against a known-issue branch before deploying to CI — benchmark against your existing toolchain rather than taking the beta metrics at face value.
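One way to run that comparison is to seed a branch with known issues and score each scanner's findings against that ground truth. The following is a minimal sketch, not part of Codex Security: the `Finding` schema, file paths, rule labels, and both result sets are hypothetical, and real scanner output would first need normalizing into this shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One reported issue, keyed by file path and rule label (hypothetical schema)."""
    path: str
    rule: str


def score(reported: set[Finding], known: set[Finding]) -> dict[str, float]:
    """Compare a scanner's findings against the seeded ground truth."""
    true_pos = len(reported & known)
    false_pos = len(reported - known)
    false_neg = len(known - reported)
    precision = true_pos / len(reported) if reported else 0.0
    recall = true_pos / len(known) if known else 0.0
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical ground truth: issues deliberately seeded into the test branch.
KNOWN = {
    Finding("api/auth.py", "sql-injection"),
    Finding("api/upload.py", "path-traversal"),
    Finding("web/render.py", "xss"),
    Finding("jobs/fetch.py", "ssrf"),
}

# Hypothetical results from the existing toolchain: 3 real hits, 3 false alarms.
EXISTING_TOOL = {
    Finding("api/auth.py", "sql-injection"),
    Finding("api/upload.py", "path-traversal"),
    Finding("web/render.py", "xss"),
    Finding("api/auth.py", "hardcoded-secret"),
    Finding("web/forms.py", "xss"),
    Finding("jobs/fetch.py", "weak-hash"),
}

# Hypothetical results from the candidate scanner: 3 real hits, 1 false alarm.
CANDIDATE = {
    Finding("api/auth.py", "sql-injection"),
    Finding("api/upload.py", "path-traversal"),
    Finding("jobs/fetch.py", "ssrf"),
    Finding("web/forms.py", "xss"),
}

for name, findings in (("existing", EXISTING_TOOL), ("candidate", CANDIDATE)):
    result = score(findings, KNOWN)
    print(f"{name}: precision={result['precision']:.2f} recall={result['recall']:.2f}")
```

Precision is the number to compare with a false-positive claim, but recall matters just as much: a scanner that reports less is only an improvement if it still finds the seeded issues.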

GPT-5.4 and the API Upgrade Cycle

GPT-5.4 and GPT-5.4 Pro landed in the Chat Completions and Responses APIs on March 5 with several capabilities that matter for agentic workloads:

1M-token context window with native Compaction support — useful for long-running agent sessions that currently require manual context management
Computer use via the Responses API computer tool: screenshot-based UI interaction scoring 75.0% on OSWorld-Verified, without custom integrations
Tool search: models defer large tool-surface definitions until runtime, reducing token usage and improving cache hit rates for pipelines with many tools

Pricing: GPT-5.4 at $2.50/$15 per million input/output tokens; GPT-5.4 Pro at $30/$180. The Pro tier is positioned explicitly for compute-intensive agentic use cases — at that price point, teams should model expected token volume before defaulting to it across all workloads.
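A rough cost model makes the gap concrete. This is a sketch under stated assumptions: the monthly token volumes are hypothetical, the dictionary keys are labels for this example rather than confirmed API model identifiers, and it ignores any caching or batch discounts.

```python
# List prices from the announcement, in USD per million tokens: (input, output).
# Keys are labels for this example, not confirmed API model identifiers.
PRICES_PER_MILLION = {
    "gpt-5.4": (2.50, 15.00),
    "gpt-5.4-pro": (30.00, 180.00),
}


def estimated_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated spend in USD for the given token volumes at list price."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical monthly workload: 200M input tokens, 20M output tokens.
INPUT_TOKENS = 200_000_000
OUTPUT_TOKENS = 20_000_000

standard = estimated_cost("gpt-5.4", INPUT_TOKENS, OUTPUT_TOKENS)
pro = estimated_cost("gpt-5.4-pro", INPUT_TOKENS, OUTPUT_TOKENS)

print(f"gpt-5.4:     ${standard:,.2f}")
print(f"gpt-5.4-pro: ${pro:,.2f}")
print(f"ratio:       {pro / standard:.0f}x")
```

Because both the input and output prices are 12 times higher on Pro, the multiplier holds for any input/output mix. The decision is therefore about which workloads justify a 12x spend, not about tuning prompt length.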

A minor image encoder bug affecting input_image inputs was corrected on March 13 (no developer action required).

GPT-5.1 was retired March 11. Users auto-migrated to GPT-5.3 or GPT-5.4 snapshots; developer forums logged behavioral regressions in some production workflows. If you’re seeing unexpected output changes, check your model alias and test against the specific snapshot version before escalating.
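A quick audit of model settings can show which services ride a floating alias. The sketch below assumes snapshots carry a trailing `-YYYY-MM-DD` date, a convention used for earlier OpenAI snapshots that may not hold for every model. The service names and the dated identifier are hypothetical.

```python
import re

# Assumption: pinned snapshots end in a date such as "-2026-03-05".
SNAPSHOT_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def is_pinned_snapshot(model_name: str) -> bool:
    """True if the model name ends with a dated snapshot suffix."""
    return bool(SNAPSHOT_SUFFIX.search(model_name))


def unpinned_services(config: dict[str, str]) -> list[str]:
    """Names of services whose model setting is a floating alias."""
    return sorted(name for name, model in config.items() if not is_pinned_snapshot(model))


# Hypothetical service-to-model mapping. The dated identifier is illustrative only.
SERVICE_MODELS = {
    "support-bot": "gpt-5.3-chat-latest",
    "code-review": "gpt-5.4",
    "invoice-parser": "gpt-5.4-2026-03-05",
}

print(unpinned_services(SERVICE_MODELS))
```

Services on the unpinned list are the ones most likely to change behavior when an alias is repointed, so they are the first place to look when output shifts without a deploy.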

Promptfoo Acquisition: AI Security Gets a Platform

OpenAI acquired Promptfoo on March 9 — an open-source CLI and library used by 25%+ of Fortune 500 companies and 130,000 monthly active developers for LLM red-teaming, prompt injection testing, jailbreak detection, and compliance monitoring. Promptfoo had raised $23M at an $86M valuation.

The tools will integrate into OpenAI Frontier, the company’s platform for building and operating AI agents. The open-source project continues independently.

This is a significant enterprise signal. Organizations using Promptfoo today are unlikely to lose continuity — but their evaluation tooling is now vendor-affiliated. Teams building compliance workflows around Promptfoo should track how integration into Frontier changes the roadmap and whether the open-source version remains a first-class product.

Codex CLI: 0.115.0 Alpha Surge

The openai/codex GitHub repo shipped alpha.15 through alpha.24 of the 0.115.0 series across March 13–14 alone — a pace that signals an imminent stable release. Community-reported highlights from the alpha series include:

A hooks engine (SessionStart/Stop events) for automation workflows
An experimental code mode for direct file editing
Revamped automations with local and worktree execution contexts
Custom reasoning-level settings and theming options
Terminal reading for thread and build status

The previous stable 0.113.0 introduced a declarative permission-profile language for filesystem and network sandbox policy management — a meaningful addition for enterprise deployments where precise sandbox scoping matters. The plugin marketplace also shipped with install-time auth, plugin/uninstall, and @plugin chat mention support in this cycle.

Open Source Support Program

OpenAI launched an Open Source Support Program on March 7, offering six months of free ChatGPT Pro access, Codex code-generation capabilities, data-analysis tools, and API credits to maintainers of GitHub-hosted projects with 1,000+ stars.

The program explicitly names OpenCode, Cline, and OpenClaw as qualifying third-party Codex interfaces — a direct acknowledgment of the ecosystem that has grown around OpenAI’s models. Projects deemed ecosystem-critical below the star threshold are also eligible.

For consulting teams building on the Codex platform, this is a useful signal: OpenAI is investing in the health of the third-party tooling ecosystem, not just first-party surfaces.

ChatGPT: Interactive Visuals and Model Updates

Interactive visual learning launched March 10 for all logged-in users — dynamic, manipulable diagrams for 70+ math and science topics. More enterprise-relevant: GPT-5.4 Thinking is now available in ChatGPT with an editable upfront thinking plan and stronger tool support for spreadsheets, presentations, and documents.

GPT-5.3 Instant updates (March 3) improved web search accuracy and contextual relevance and reduced dead ends in the gpt-5.3-chat-latest model, now available on both the Chat Completions and Responses APIs.

The Pentagon Deal and Kalinowski Departure

OpenAI’s head of robotics and hardware, Caitlin Kalinowski, resigned March 14–15 over the company’s Department of Defense contract, citing inadequate guardrails against domestic surveillance without judicial oversight and lethal autonomy without human authorization. Altman acknowledged the announcement was “rushed” and OpenAI added explicit red lines: no mass domestic surveillance, no directing autonomous weapons, no high-stakes fully automated decisions.

The backdrop: the Pentagon had previously designated Anthropic a national security supply chain risk after Anthropic refused “any lawful use” contract terms (see Claude Weekly 2026-03-11). OpenAI moved to fill that gap — and the subsequent backlash cost a senior leader.

For enterprise customers, the near-term effect is limited. The longer-term question — how OpenAI’s military commitments shape its safety posture and talent — is worth tracking, particularly for organizations with their own responsible AI policies.

Enterprise Notes

Microsoft Entra scope updates (March 11) require admin action: Entra admins must review and approve expanded permission scopes for Outlook Calendar, Email, SharePoint, and Teams integrations. Workspace admins must also enable new actions in Workspace Settings > Apps > Manage Actions to prevent connection issues for new users.
SCIM group support for workspace analytics now allows Enterprise workspaces to segment Codex task insights by team or department.
Sora 2 API expanded March 12 with character references, 20-second generations, 1080p output for sora-2-pro at $0.70/second, and a new POST /v1/videos/edits endpoint. The older POST /v1/videos/{video_id}/remix is deprecated with a 6-month sunset window.
Auto top-up for shared Codex/Sora credits is now available for eligible Plus and Pro users — set a minimum balance threshold to avoid workflow interruptions.
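For budgeting against the Sora 2 pricing noted above, per-second billing is simple to model. This sketch assumes the $0.70/second rate applies uniformly to 1080p sora-2-pro output with no minimums or rounding rules, and the clip count is hypothetical.

```python
from decimal import Decimal

# Quoted rate for 1080p output on sora-2-pro, in USD per generated second.
RATE_PER_SECOND = Decimal("0.70")


def generation_cost(seconds: int, clips: int = 1) -> Decimal:
    """Estimated cost in USD for a number of clips of equal length."""
    return RATE_PER_SECOND * seconds * clips


single_clip = generation_cost(seconds=20)
# Hypothetical batch: 50 maximum-length clips.
batch = generation_cost(seconds=20, clips=50)

print(f"one 20-second clip: ${single_clip}")
print(f"fifty clips:        ${batch}")
```

A single maximum-length clip comes to $14.00 at this rate, so shared-credit thresholds for auto top-up are worth sizing against expected clip volume.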

What to Watch

Codex CLI 0.115.0 stable release. With the 0.115.0 series already at alpha.24, a stable tag looks imminent. The hooks engine and code mode will be the features to evaluate for enterprise automation workflows. Watch the releases page.
Promptfoo-to-Frontier integration roadmap. Once the acquisition closes, watch for how red-teaming and evaluation tooling surfaces in Frontier’s developer UX — this will define the enterprise AI security evaluation story for OpenAI’s platform.
Codex Security expansion to Enterprise/Edu. OpenAI has signaled a rollout “in coming weeks.” Pricing post-free-month and audit trail capabilities will be the evaluation criteria for enterprise security teams.
Pentagon contract guardrail formalization. OpenAI’s stated red lines remain informal commitments. Watch for formal policy documents and any additional leadership changes as the company works to define the boundaries of its DoD relationship.

Check back next week for the latest from the OpenAI and Codex ecosystem. If you’re evaluating how these changes affect your enterprise AI strategy, Big Hat Group can help.