
OpenAI + Dell: Codex On-Premises Architecture for Enterprise

OpenAI named Dell as its first non-hyperscaler Codex deployment path. Here's how the architecture actually works and who it targets.

Creeta

May 29, 2026


What the OpenAI–Dell Partnership Actually Adds

The OpenAI–Dell arrangement, announced on May 18, 2026 at Dell Technologies World in Las Vegas, is the first named, non-Azure, non-hyperscaler deployment path for Codex. Every prior OpenAI enterprise channel (Stargate, Foundry, Deployment Co) ran on cloud-side infrastructure. This one is explicitly architected for customer-owned datacenters, covering Codex, ChatGPT Enterprise, and OpenAI API products through Dell's deployment frameworks. The scope matters because it reflects a deliberate shift: OpenAI is now building a channel for organizations that have legal, contractual, or policy reasons to keep data on-premises.

Quick Answer: OpenAI and Dell announced on May 18, 2026 the first non-Azure enterprise path for Codex, running on customer-owned PowerEdge hardware with data access through Dell's AI Data Platform. Data is designed to stay on-premises throughout agent sessions. No GA date or pricing has been published; this is a partnership intent, not a shipping product.

All prior OpenAI enterprise distribution ran through Microsoft's Azure infrastructure (Foundry, private endpoints, Deployment Co) or through OpenAI's own cloud (Stargate). The Dell path is structurally different: requests from developer IDEs route through on-premises endpoints to Dell-hosted Codex instances. Source code, internal schemas, and operational data never traverse a hyperscaler network. That is a functional difference, not a marketing framing.

The partnership scope covers three product lines: Codex (the agentic workspace), ChatGPT Enterprise, and the OpenAI API surface. Dell's contribution is its AI Factory hardware stack and AI Data Platform, specifically the Starburst federated query integration and the PowerEdge server lineup that provides inference compute. The stated intent is hybrid and on-premises coverage, meaning organizations can run workloads locally while maintaining optional connectivity to OpenAI cloud services where policy permits.

"The Dell AI Factory with OpenAI Codex will allow enterprises to deploy AI where enterprise data already lives, within their premises, giving customers a practical, secure path to deploying AI agents at scale," — Ihab Tarazi, SVP and CTO of Infrastructure Solutions Group at Dell Technologies (source: ChannelLife AU, 2026-05)

The announcement positions data locality as the primary unlock, not performance or cost reduction. For organizations that have been categorically excluded from Codex-class agent tooling because of cloud egress constraints, this represents the first official architectural path forward.

Codex's Current Capability Surface: What the Agent Does Now

Codex launched as a coding tool, but as of April 16, 2026, it operates as a general-purpose agentic workspace. The "Codex for (almost) everything" update added capabilities that push the product well beyond code completion: computer use on macOS (reading native application screens, executing clicks and keystrokes in apps like Excel, Outlook, and Salesforce), an in-app browser with annotation and web scraping, integrated gpt-image-1.5 image generation, persistent cross-session memory, scheduled automations that run parallel agent instances, and a 90+ plugin ecosystem. This context is essential for understanding why the Dell partnership is scoped the way it is: the on-prem access on offer is to a multi-modal agent platform, not to a code assistant.

The plugin ecosystem covers Jira, the full Microsoft 365 suite, Notion, Slack, HubSpot, Salesforce, Google Workspace, GitHub, Linear, and Zendesk. Combined with computer use, Codex can drive workflows spanning a developer's full tool stack: reading a Jira ticket, writing code in a local IDE, pushing to GitHub, and logging status in Salesforce in a single coordinated session. That combination puts it in a different product category than a code assistant.

The usage numbers reflect this expansion. Codex has over 4 million weekly active developers, and ChatGPT Business and Enterprise tiers grew 6× between January and April 2026. That growth trajectory makes the on-premises gap consequential: a large portion of potential enterprise users, particularly in regulated industries, couldn't access these capabilities because the cloud deployment model was a non-starter for their data governance teams.

Billing also shifted. Cloud Codex moved to token-based pricing on April 2, 2026, replacing per-message pricing with a credits-per-million-tokens model covering input tokens, cached input tokens, and output tokens. Existing ChatGPT Enterprise, Edu, Health, Gov, and ChatGPT for Teachers plans transitioned to this model on April 23, 2026. On-premises pricing under the Dell partnership has not been disclosed, but the shift to token-based cloud pricing provides a reference frame for what enterprise procurement teams will likely negotiate against when dealing with Dell.
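To make the credits-per-million-tokens structure concrete, here is a minimal sketch of the arithmetic. The rate values, the `CreditRates` and `Usage` names, and the assumption that cached input tokens are counted separately from fresh input tokens are all illustrative; none of them are published OpenAI figures or definitions.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class CreditRates:
    """Credits charged per one million tokens. Values are placeholders."""
    input: float
    cached_input: float
    output: float


@dataclass(frozen=True)
class Usage:
    """Token counts for one session; the three buckets are assumed disjoint."""
    input_tokens: int
    cached_input_tokens: int
    output_tokens: int


def credits_used(usage: Usage, rates: CreditRates) -> float:
    """Sum each token bucket times its per-million rate."""
    total = (
        usage.input_tokens * rates.input
        + usage.cached_input_tokens * rates.cached_input
        + usage.output_tokens * rates.output
    )
    return total / TOKENS_PER_MILLION


# Hypothetical rates for illustration only, NOT real pricing.
EXAMPLE_RATES = CreditRates(input=10.0, cached_input=1.0, output=40.0)
session = Usage(input_tokens=200_000, cached_input_tokens=800_000, output_tokens=50_000)
print(credits_used(session, EXAMPLE_RATES))  # 4.8 with the placeholder rates above
```

With a model like this, the main lever for long agent sessions is the share of input that lands in the cached bucket, which is why the three token types are priced separately.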

For enterprise teams evaluating the Dell path: if capability parity holds between the on-prem build and the current cloud release (unconfirmed; see "What's Still Unconfirmed" below), they would be deploying a multi-modal agentic platform capable of driving cross-system business workflows, not a stripped-down code assistant running on local hardware.

Dell AI Data Platform: How Data Stays On-Premises

The Dell AI Data Platform is the component that makes the partnership technically credible for regulated enterprises. It provides a unified on-premises data layer through which Codex can access internal codebases, documentation, and operational knowledge without generating cloud egress. The design intent is that Codex retrieval and context-building operations run against data sources that remain physically inside the customer's infrastructure boundary throughout the agent session.

The key technical component within the AI Data Platform is the Starburst integration. Starburst is a federated query engine built on Trino that can execute SQL across heterogeneous data sources (relational databases, data lakes, object stores, and others) without requiring data to be centralized or replicated. In the Dell–OpenAI configuration, this means Codex can run context-enrichment queries against proprietary schemas, incident history, internal documentation indexes, and operational databases in place. The query executes on-premises; only the result set or relevant context fragment is passed to the Codex inference layer, which also runs on-premises.
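The query-in-place idea can be illustrated with a small runnable analogy. The sketch below uses Python's built-in `sqlite3` with two separately attached in-memory databases as a stand-in for two Starburst catalogs; it is not Starburst or Trino code, and the table names, columns, and rows are invented for illustration. In Trino, the equivalent join would address tables as `catalog.schema.table`.

```python
import sqlite3


def open_stores() -> sqlite3.Connection:
    """Create two independent stores and expose them through one connection."""
    conn = sqlite3.connect(":memory:")                  # store 1: service catalog
    conn.execute("ATTACH DATABASE ':memory:' AS ops")   # store 2: incident history
    conn.execute("CREATE TABLE main.services (name TEXT, owner TEXT)")
    conn.execute("CREATE TABLE ops.incidents (service TEXT, summary TEXT)")
    conn.executemany(
        "INSERT INTO main.services VALUES (?, ?)",
        [("billing-api", "payments-team"), ("search-api", "discovery-team")],
    )
    conn.executemany(
        "INSERT INTO ops.incidents VALUES (?, ?)",
        [
            ("billing-api", "stale cache after deploy"),
            ("billing-api", "ledger write timeouts"),
            ("search-api", "index rebuild stalled"),
        ],
    )
    return conn


def fetch_context(conn: sqlite3.Connection, service: str) -> list[tuple[str, str, str]]:
    """Join across both stores in place; only the small result set is returned."""
    query = """
        SELECT s.name, s.owner, i.summary
        FROM main.services AS s
        JOIN ops.incidents AS i ON i.service = s.name
        WHERE s.name = ?
        ORDER BY i.summary
    """
    return conn.execute(query, (service,)).fetchall()


conn = open_stores()
for row in fetch_context(conn, "billing-api"):
    print(row)
```

Neither table is copied into the other store. The join runs where the data lives, and only the matching rows would be handed to the inference layer as context.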

For organizations under GDPR, HIPAA, or national data residency frameworks, this design closes a gap that has existed since Codex became capable of substantive enterprise use. Previously, using Codex with meaningful organizational context required either accepting cloud data transfer (a policy violation for many regulated entities) or providing the model with sanitized, context-stripped inputs — dramatically reducing usefulness. The on-prem data platform makes both compromises unnecessary.

Starburst's role is worth calling out for architects evaluating this stack: federated query does not require a data warehouse migration or a new ETL pipeline. Existing databases, file stores, and schema registries connect through Starburst catalogs. Organizations with fragmented on-premises data infrastructure, which is common in large enterprises and government agencies, do not need to consolidate data before deploying this configuration. The agent retrieves what it needs, in place, without a centralization prerequisite.

The broader value is contextual completeness. Codex agents are more effective when they can query the specific codebase, schema documentation, and historical incident data relevant to the task at hand. With the Dell AI Data Platform, Codex retrieves this context directly from on-premises stores during the agent execution loop — developers and business users don't need to manually curate inputs for each session.

Dell AI Factory: Hardware Specs and Deployment Scale

Dell AI Factory is Dell's end-to-end framework for deploying AI workloads on customer-owned infrastructure. In the OpenAI partnership, it provides the inference compute layer: the physical servers where Codex model inference executes, co-located with the data sources it queries. The primary hardware platform is the PowerEdge server lineup (specifically the XE9680, XE9680L, and XE9812), supporting NVIDIA HGX H100 and H200 GPUs as well as alternative accelerators.

| Server Model | GPU Support | Form Factor | Primary Role in AI Factory |
| --- | --- | --- | --- |
| PowerEdge XE9680 | NVIDIA HGX H100 (8× SXM5) | 8U rackmount | LLM inference and training node |
| PowerEdge XE9680L | NVIDIA HGX H100 / H200 (8×) | 8U (lower-profile) | High-density inference; space-constrained datacenters |
| PowerEdge XE9812 | NVIDIA HGX H200 + alternative accelerators | 12U chassis | Vector indexing, high-throughput token generation |

Dell's vendor-cited performance benchmarks for the XE9812 include 12× faster vector indexing, 19× faster time-to-first-token, and 10× lower cost-per-token compared to Blackwell-generation alternatives. These figures have not been independently verified and should be treated as vendor claims. They do indicate, however, that Dell is positioning the XE9812 as the preferred platform for Codex inference workloads where latency and throughput are primary concerns.

The existing install base is a significant distribution factor. As of March 2026, Dell AI Factory had over 4,000 enterprise customers. For OpenAI, this represents a large pool of organizations already operating compatible hardware, so the partnership can extend Codex access without requiring customers to go through a new hardware procurement cycle. That substantially reduces sales friction compared to building on-premises reach from scratch.

For organizations evaluating whether their existing Dell infrastructure qualifies, the determining factors are GPU configuration (H100 or H200 HGX modules are the primary targets), available rack space for gateway components, and network topology between compute and data tiers. Dell AI Factory's support for alternative accelerators provides some flexibility, but the partnership documentation does not specify which alternatives are confirmed for Codex inference workloads. Architects should engage Dell directly for hardware compatibility scoping before planning deployment timelines.

Gateway Architecture: How On-Prem Requests Route to Codex

The deployment model for on-premises Codex follows a gateway proxy pattern. Developer IDEs and client applications communicate with on-premises endpoints; those endpoints proxy requests to Dell-hosted Codex instances running on AI Factory hardware. From the developer's perspective, the intent is functional equivalence with cloud Codex (same API surface, same agent interaction model), but the inference path stays within the customer datacenter throughout.
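One practical consequence of the gateway pattern is that client configuration becomes a data-boundary control: an IDE pointed at the wrong base URL silently leaves the datacenter. The sketch below is a hypothetical pre-flight check a platform team might run on client settings. The `.corp.example` suffix and the example URLs are invented, and nothing here reflects a published Dell or OpenAI configuration format.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical internal DNS suffixes; replace with the organization's own zones.
INTERNAL_SUFFIXES = (".corp.example",)


def is_on_prem_endpoint(url: str, internal_suffixes=INTERNAL_SUFFIXES) -> bool:
    """Return True only for HTTPS URLs whose host is a private IP literal
    or a hostname under an allowlisted internal DNS suffix."""
    parsed = urlparse(url)
    host = parsed.hostname
    if parsed.scheme != "https" or host is None:
        return False
    try:
        # IP literal: accept only private address space (e.g. 10.0.0.0/8).
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # Not an IP literal: fall back to the internal suffix allowlist.
        return host.endswith(tuple(internal_suffixes))


for candidate in (
    "https://codex-gw.corp.example/v1",
    "https://10.20.0.5:8443/v1",
    "https://api.example.com/v1",
):
    print(candidate, is_on_prem_endpoint(candidate))
```

A check like this does not replace network-level egress controls. It only catches misconfiguration early, before a request is sent.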

| Deployment Path | Infrastructure Owner | Data Boundary | Inference Latency Target | Compute Hardware |
| --- | --- | --- | --- | --- |
| OpenAI Cloud (standard) | OpenAI | OpenAI cloud infrastructure | Not published | OpenAI-managed |
| Azure Private Endpoint | Microsoft (managed) | Azure virtual network (Microsoft region) | Not published | Azure-managed |
| Dell AI Factory (on-prem) | Customer-owned | Customer datacenter | <100ms end-to-end | PowerEdge XE9680 / XE9812 |

The inference latency target for the Dell path is under 100 milliseconds end-to-end, covering the full round trip from IDE client to on-prem Codex endpoint and back. Whether this target holds at production scale depends on model size, GPU configuration, and how tightly compute and data layers are co-located within the datacenter. Dell has not published a comparative latency figure against cloud deployment; performance claims should be treated as unverified until production benchmarks from independent deployments are available.
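Teams that eventually pilot this path will want to test the 100 ms figure against their own measurements rather than an average. A minimal sketch follows, assuming round-trip samples in milliseconds have already been collected from IDE clients. The nearest-rank percentile method and the choice of p95 are assumptions made here, since the announcement does not say which statistic the target refers to.

```python
import math

LATENCY_TARGET_MS = 100.0  # the stated end-to-end target


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile of the samples (pct between 1 and 100)."""
    if not samples:
        raise ValueError("no latency samples collected")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_target(samples: list[float], pct: int = 95,
                 target_ms: float = LATENCY_TARGET_MS) -> bool:
    """True when the chosen percentile is strictly under the target."""
    return percentile(samples, pct) < target_ms


# Illustrative samples only: mostly fast, with two slow outliers.
round_trips_ms = [50.0] * 18 + [120.0, 130.0]
print(percentile(round_trips_ms, 95), meets_target(round_trips_ms))
```

In this illustrative data the mean is 57.5 ms, comfortably under the target, while the 95th percentile is 120 ms. That gap is the reason to check tail latency rather than the average.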

A model variant labeled codex-enterprise-v1 is referenced in technical documentation for the on-premises deployment. OpenAI has not published an official version name for the on-premises build, and no capability parity statement exists comparing codex-enterprise-v1 to the current cloud Codex release. This label is a material data point: it suggests the on-prem build may be a distinct variant, not an identical mirror of the cloud product. Enterprise buyers evaluating specific features should treat capability parity as an open question until OpenAI publishes explicit documentation.

The architectural distinction from Azure private endpoints is worth being precise about. An Azure private endpoint gives the service a private IP address inside the customer's Azure virtual network (VNet), so traffic does not traverse the public internet. The compute still runs on Microsoft-managed Azure infrastructure in a Microsoft-controlled region. With the Dell path, PowerEdge servers physically sit in the customer's datacenter: the customer owns the hardware, controls physical access, and can implement network policies (including potential air-gap configurations) that are structurally impossible with any managed cloud service. Whether OpenAI supports fully air-gapped, internet-free operation on this path has not been confirmed.

Who This Deployment Path Is Actually For

The primary audience for the Dell on-prem path is organizations where cloud AI deployment is ruled out by policy rather than merely disfavored. Defense contractors operating under classification boundaries, central banks and sovereign wealth funds with jurisdictional data residency rules, healthcare payers under HIPAA or national health data frameworks, and government agencies with requirements that preclude sending operational data to hyperscaler infrastructure are the organizations that previously had no official path to Codex-class agent capabilities. The announcement is aimed directly at them.

A second category is enterprises with substantial existing Dell AI Factory infrastructure. For these organizations, the partnership extends the value of hardware already deployed and paid for. Rather than a new datacenter project, adding Codex to an existing AI Factory environment is an integration exercise. Dell's 4,000+ enterprise AI Factory customers as of March 2026 represent a distribution surface OpenAI gains access to without requiring new capital expenditure cycles from the customer side.

A third segment consists of organizations that aren't legally prohibited from cloud AI but face practical constraints: large codebases where round-trip latency from cloud inference degrades developer experience, or data governance policies that haven't been updated to allow AI vendor data processing but might approve on-premises deployments through a different governance track.

What this path does not solve: organizations looking for a cost-optimized Codex deployment (pricing is unannounced and on-prem hardware CapEx is not trivial), organizations that require FedRAMP, IL-5, or HIPAA certification before procurement (no compliance certifications have been published), and organizations that need fine-tuning or model customization on their own hardware (not addressed in any official documentation). These are distinct use cases the current announcement does not cover.

What Changes for Enterprise Engineering Teams

The most immediate change for engineering teams is the removal of the data locality constraint. With cloud Codex, the practical options for providing organizational context are to upload sanitized snippets or accept that full code and schema context will transit OpenAI infrastructure. For teams under IP protection policies or data governance frameworks, neither option is acceptable. The Dell path makes data locality structural rather than policy-dependent: the data doesn't leave the building because the inference also happens in the building.

The use case expansion beyond coding is significant context. Given the April 2026 Codex capability surface, on-prem deployment could enable workflows previously unavailable to regulated industries: lead qualification against CRM data stored on-premises, incident response workflows that read and update internal ticketing systems, report drafting that pulls from proprietary data stores, and cross-system workflow coordination across internal tools. These aren't hypothetical future applications; they are documented use cases for cloud Codex, now architecturally extended to organizations that couldn't access them.

The integration with Dell AI Factory pipelines opens a third vector: Codex co-location with AI inference infrastructure already running other workloads. Engineering teams that have deployed AI Factory for custom model fine-tuning, data preparation pipelines, or internal ML serving can potentially have Codex interface with those pipelines directly, for example by using AI Factory for data prep and test execution as part of Codex-driven workflows. The partnership documentation describes this as an area of exploration rather than a confirmed shipping feature, but the architectural direction is stated.

The operational model also changes for DevOps and platform teams. Managing an on-premises Codex deployment means owning gateway infrastructure, the upgrade cadence, hardware health monitoring, and integration with internal access control systems — responsibilities abstracted away in cloud deployments. Teams evaluating this path need to budget for platform engineering overhead alongside hardware and licensing costs. This is not a drop-in replacement for cloud Codex; it is a new operational surface.
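The budgeting point can be made concrete with a simple monthly cost model. Every number below is a placeholder and every field name is invented for illustration; no hardware prices, licensing terms, or staffing levels have been published for this deployment path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OnPremBudget:
    """Inputs for a rough monthly cost estimate. All values are assumptions."""
    hardware_capex: float           # one-time server and network spend
    amortization_months: int        # period over which hardware is written down
    platform_engineers: float       # full-time equivalents running the platform
    monthly_cost_per_engineer: float
    monthly_license: float          # unknown today; pricing is unannounced


def monthly_cost(budget: OnPremBudget) -> float:
    """Amortized hardware plus staffing plus licensing, per month."""
    if budget.amortization_months <= 0:
        raise ValueError("amortization_months must be positive")
    hardware = budget.hardware_capex / budget.amortization_months
    staffing = budget.platform_engineers * budget.monthly_cost_per_engineer
    return hardware + staffing + budget.monthly_license


# Placeholder figures purely for illustration.
example = OnPremBudget(
    hardware_capex=1_200_000,
    amortization_months=48,
    platform_engineers=2,
    monthly_cost_per_engineer=15_000,
    monthly_license=20_000,
)
print(monthly_cost(example))  # 75000.0 with the placeholder inputs above
```

In this example staffing is a larger line item than amortized hardware, which is the kind of result that gets missed when the business case counts only servers and licenses.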

What's Still Unconfirmed: Gaps and Open Questions

The announcement confirmed partnership intent and described an architecture. It did not ship a product. As of May 2026, several material details remain unresolved, and enterprise buyers should be explicit about these gaps when building internal business cases.

Pricing: No pricing model has been published. It is unclear whether Dell bundles Codex licensing into AI Factory contracts, whether OpenAI invoices separately, or whether the model is CapEx-based (hardware plus perpetual license), OpEx-based (subscription per seat or token), or a hybrid. Cloud Codex moved to per-million-token billing in April 2026, which provides a reference frame, but on-premises pricing typically involves different cost structures. Regulated buyers should treat pricing as a negotiation item and engage Dell enterprise sales with specific volume projections.

Capability parity: The codex-enterprise-v1 label in technical documentation suggests the on-prem build may be a distinct variant. No official statement addresses whether the 90+ plugin ecosystem, computer use on macOS, gpt-image-1.5 integration, and parallel scheduled agents are available against on-premises endpoints. This is the most consequential open question for engineering teams evaluating on-prem use cases, because the deployment value depends heavily on which features are actually available.

GA timeline and compliance certifications: No general availability date or beta program has been published as of May 2026. FedRAMP, HIPAA, IL-5, and equivalent certifications for the on-premises configuration have not been mentioned in official documentation. Organizations where compliance authorization is a procurement prerequisite cannot start that process yet.

Model customization and air-gapped operation: Model weights, fine-tuning access, and fully offline (no internet connectivity) operation are not addressed in any official documentation. For defense and intelligence use cases, air-gap support is often a hard requirement. The current announcement neither confirms nor denies support — which means either it's planned but not announced, or it's out of scope for the initial deployment model. Either way, buyers who need this capability cannot assume it will be available.
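One way to keep the gaps above explicit in an internal business case is to track them as data instead of prose. The sketch below is a hypothetical checklist: the item names are shorthand for the open questions listed in this section, and the empty `confirmed` set reflects that none of them had been publicly answered as of May 2026.

```python
# Shorthand labels for the open questions in this section (names are illustrative).
OPEN_QUESTIONS = frozenset({
    "pricing_model",
    "capability_parity",
    "ga_date",
    "compliance_certifications",
    "fine_tuning_access",
    "air_gapped_operation",
})


def unresolved(required: set[str], confirmed: set[str]) -> list[str]:
    """Return the requirements that are still unconfirmed, in stable order."""
    unknown = required - OPEN_QUESTIONS
    if unknown:
        raise ValueError(f"unrecognized requirement(s): {sorted(unknown)}")
    return sorted(required - confirmed)


# Example: a defense buyer's hard requirements against today's confirmed set.
hard_requirements = {"air_gapped_operation", "compliance_certifications", "ga_date"}
confirmed: set[str] = set()  # nothing publicly confirmed as of May 2026
print(unresolved(hard_requirements, confirmed))
```

As Dell and OpenAI publish details, items move into `confirmed`. Procurement evaluation becomes actionable for a given buyer when `unresolved` returns an empty list for that buyer's hard requirements.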

Frequently Asked Questions

Is the OpenAI–Dell Codex on-premises deployment available now?

No. OpenAI and Dell announced the partnership at Dell Technologies World on May 18, 2026 as a stated intent and architectural direction — not a shipping product. No public general availability date, beta program, or early-access sign-up has been published. Organizations interested in the deployment path should engage Dell enterprise sales directly for timeline information, as details are not available through public channels as of the announcement date.

How does this differ from using OpenAI via Azure private endpoints?

Azure private endpoints route API traffic over a private IP address inside the customer's Azure virtual network (VNet), preventing it from traversing the public internet. However, the compute still runs on Microsoft-managed Azure infrastructure in a Microsoft-controlled region, so the physical hardware is not under customer control. The Dell AI Factory path runs Codex inference on customer-owned PowerEdge hardware (XE9680, XE9680L, or XE9812) physically located in the customer's datacenter. The customer owns and controls the hardware, data does not leave customer-controlled infrastructure, and network policy (including potential air-gap configurations) can be enforced at the physical layer. The difference is in who owns and controls the hardware, not only in how network traffic is isolated.

Can Codex query internal databases on-premises without data leaving the building?

Based on the partnership architecture as described, yes. The Dell AI Data Platform includes a Starburst integration for federated SQL queries. Starburst executes queries across heterogeneous on-premises data sources — relational databases, data lakes, file stores — without requiring data to be centralized or replicated to a cloud environment. In the on-prem Codex configuration, context retrieval runs against these sources in-place, with results passed to the Codex inference layer, which also runs on-premises. The design intent is full data locality with no cloud egress for query data. Specific guarantees depend on deployment configuration and contract terms with Dell.

Do you need Dell hardware to use this on-premises path?

Based on current documentation, yes. The integration is tied to Dell AI Factory, which deploys on PowerEdge XE9680, XE9680L, and XE9812 servers with NVIDIA HGX GPU modules. No support for third-party on-premises hardware — HPE, Lenovo, Supermicro, or others — has been announced as of May 2026. Organizations with established non-Dell datacenter footprints would need to either procure Dell hardware or wait for potential alternative hardware support announcements. This hardware dependency is a material consideration for organizations mid-lifecycle on existing server infrastructure.

What's the latency target for on-premises Codex inference?

Dell targets under 100 milliseconds end-to-end for on-premises Codex inference, covering the full round trip from developer IDE to on-prem Codex endpoint and back. Actual throughput and latency at scale depend on model size loaded, GPU configuration (H100 vs. H200, number of cards), and how tightly compute and data tiers are co-located within the datacenter. Dell has not published a comparative latency benchmark against cloud Codex, so whether on-prem inference is faster, slower, or equivalent for typical developer workloads remains unverified by independent measurement.

What to Track From Here

The Dell–OpenAI announcement establishes an architectural blueprint and signals OpenAI's intent to reach regulated enterprise customers through channel distribution. What is absent is delivery specification: a GA date, a capability inventory for codex-enterprise-v1, a pricing model, and a compliance certification roadmap. For technical founders and enterprise platform teams, the watch list is concrete — any one of those four items becoming public moves this from interesting architecture to actionable procurement evaluation.

The broader signal is worth reading accurately. OpenAI building a non-Azure, non-hyperscaler enterprise channel through Dell is a distribution bet: there is a large install base of Dell AI Factory customers who have been unreachable through cloud distribution models. If the on-prem product ships and the capability story holds up, it extends enterprise AI agent tooling to a segment that has been effectively closed since Codex matured into a general-purpose platform. Whether execution follows the announcement is the question the next two to three quarters will answer.

For teams currently blocked from Codex by data governance policies, the practical step right now is to register intent with Dell enterprise sales — not to plan a deployment, but to be positioned for beta access and to participate in pricing and capability conversations before terms are finalized. Early engagement typically provides more influence over deployment configuration and pricing structure than post-GA procurement.

Last updated: 2026-05-28. Article based on the Dell Technologies World announcement of May 18, 2026 and product documentation available through May 28, 2026. Partnership details, pricing, capability parity, and GA timeline remain unannounced; this article will be updated as Dell and OpenAI publish additional specifications.