AI Governance for Defense: Building Audit-Ready Systems Before Section 1513

Introduction

A $140M defense subcontractor runs an AI-powered predictive maintenance model on CNC equipment across two production facilities. The pilot worked. Production leadership approved expansion. The model now ingests machine vibration data, maintenance logs, and — because the engineering team found it improved accuracy — segments of technical data packages that include Controlled Unclassified Information (CUI). No one has documented which data feeds contain CUI markings. No one logs which employees access the model's outputs. No version control tracks which iteration of the model generated which recommendation. And no human approval gate exists between the model's maintenance predictions and the work orders that follow.

The AI works. The governance is weak at best, non-existent at worst.

This scenario is not hypothetical — it reflects a pattern emerging across the defense industrial base as manufacturers adopt AI tools faster than they establish the oversight structures those tools require. The FY2026 National Defense Authorization Act (NDAA), specifically Section 1513, directs the Department of War (DoW) to develop cybersecurity and physical security requirements for artificial intelligence and machine learning technologies used in defense contracting, with integration into both the Cybersecurity Maturity Model Certification (CMMC) program and the Defense Federal Acquisition Regulation Supplement (DFARS). The framework hasn't been published. But the direction is unambiguous — and the manufacturers who build AI governance architecture now, before the rules arrive, will satisfy those requirements at a fraction of the cost and disruption facing organizations that wait.

This article examines why AI governance for defense manufacturing has become an immediate operational requirement rather than a future compliance exercise, what the current regulatory landscape demands, and how mid-market subcontractors can build audit-ready AI systems that position them for both compliance and competitive advantage.

THE OVERSIGHT VACUUM

The Governance Gap: Why Defense Manufacturers Are Deploying AI Faster Than They Can Govern It

The primary driver of the AI governance gap in defense manufacturing is structural: the pressure to capture operational value from AI tools has outpaced the development of oversight mechanisms designed for those tools. This is not negligence — it is the predictable consequence of a technology adoption curve moving far faster than institutional governance cycles.

The Pilot-to-Production Acceleration Creates an Oversight Vacuum

BCG research found that 65% of aerospace and defense AI efforts remain in proof-of-concept. The initiatives that do advance beyond pilot stage often do so because a business unit champion pushed them forward — securing budget approval, demonstrating results, and building momentum. That momentum, in organizations without centralized AI architecture, frequently bypasses the governance conversation entirely. The pilot had no audit trail because pilots rarely do. When the pilot becomes a production system, the absence of critical security features carries forward. The governance vacuum that was acceptable for a three-month experiment becomes a compliance liability once the system approaches production readiness, where ingesting CUI and other security-constrained information may be required.

The pattern mirrors the broader pilot-to-production failure dynamic: organizations that treat AI deployment as a series of disconnected projects rather than components of enterprise architecture reproduce the same structural gaps at every stage of expansion.

What 'Ungoverned AI' Looks Like in a Defense Manufacturing Environment

Ungoverned AI is not necessarily malfunctioning AI. The predictive maintenance model in the opening scenario may produce acceptable results internally. The problem is that its reliability cannot be proven to an auditor, to a prime contractor, or to the DoW. Ungoverned AI in a defense manufacturing context cannot be considered production-ready, regardless of how well it performs. Without governance built into the environment, the liabilities are significant:

  • No inference logging. The system generates recommendations, but no metadata record captures which data inputs drove the output, which model version produced it, or which user received it.
  • No model version control. Engineering teams update the model iteratively, but no documentation tracks which version was active on which date — making it impossible to reconstruct a decision chain after the fact.
  • No CUI boundary enforcement. Data feeds expand organically as teams discover that adding more inputs improves accuracy, with no systematic check on whether new data sources introduce CUI into a system that lacks appropriate access controls.
  • No human-in-the-lead validation. Outputs flow directly into operational workflows without a documented approval authority for safety-critical or compliance-sensitive decisions.

The CUI Exposure Most Manufacturers Haven't Quantified

Each of those ungoverned conditions creates a potential CUI touchpoint that may not appear in the organization's existing CMMC assessment boundary. When the compliance team last mapped its CUI data flows, the AI system may not have existed — or it may have been a sandboxed pilot with no production data access. The expansion from pilot to production frequently introduces CUI into systems that were never evaluated for CMMC compliance, creating exposure that compounds with every new data integration, every model update, and every additional user granted access.

THE CONGRESSIONAL MANDATE

What NDAA Section 1513 Actually Says — and What It Signals

Section 1513 of the FY2026 NDAA directs the Secretary of War to develop cybersecurity and physical security requirements specifically for AI and machine learning technologies used in defense contracting, and to incorporate those requirements into CMMC and DFARS. This is not advisory language. It is a Congressional mandate with a status update due in June 2026.

The Legislative Text and Its Directives

The critical NDAA Section 1513 AI requirements for defense manufacturers are threefold.

  1. The section mandates the creation of AI-specific security standards — meaning the current CMMC framework will be supplemented with controls designed explicitly for AI systems, not inherited solely from general-purpose information system security.
  2. It directs integration into CMMC and DFARS — meaning these standards will become contractual obligations through the same flow-down mechanisms that govern CUI handling today.
  3. It establishes a reporting timeline — the Congressional status update creates a concrete milestone that signals the regulatory framework is actively under development, not languishing in committee.

The June 2026 Congressional Status Update

The status update will provide clarity on the framework's direction, scope, and anticipated timeline for incorporation into CMMC and DFARS. It should not, however, be treated as the starting point for action. Organizations that treat June 2026 as the signal to begin governance planning will find themselves designing architecture under time pressure against a moving regulatory target — the worst conditions for sound engineering decisions.

Why 'Wait and See' Is Architectural Debt, Not Prudence

The argument for waiting — 'let's see what the specific rules require before we invest' — mischaracterizes the nature of the cost. Every month an AI system operates without intentionally robust governance architecture, it generates inferences that cannot be retroactively traced, processes data through flows that cannot be retroactively logged, and makes decisions through pathways that cannot be retroactively documented. This is not a theoretical risk that materializes only when an auditor arrives. It is architectural debt — the accumulated gap between what a system should have been designed to do and what it actually does — and the cost of closing that gap grows with every additional month of ungoverned operation.

Retrofitting audit trail metadata capture into a production AI system requires modifying inference pipelines, revalidating data integrations, retraining operations teams, and potentially re-architecting access controls. Building those capabilities from inception costs a fraction of the retrofit.

Keeping valuable CUI out of AI business solutions simply to avoid building the governance infrastructure is just another form of waiting. The question is not whether the investment is premature. The question is whether the organization can afford the compounding cost of delay while competitors build governed AI capabilities that extend their advantage.

THE PRESENT-TENSE OBLIGATION

CMMC Level 2 Already Applies to Your AI Systems

AI governance in defense manufacturing is not solely a forward-looking obligation created by Section 1513. Current CMMC Level 2 requirements, grounded in the 110 security controls of NIST Special Publication 800-171 Revision 2 (NIST SP 800-171 Rev 2), apply to any information system that processes, stores, or transmits CUI. The moment an AI tool ingests, analyzes, or generates output from CUI, it enters the CMMC assessment boundary.

NIST SP 800-171 Controls and the AI Assessment Boundary

Several control families within NIST SP 800-171 carry direct implications for AI systems. Access Control (AC) requires that system access be limited to authorized users and that access enforcement mechanisms be in place — applicable to who queries an AI model and who receives its outputs. Audit and Accountability (AU) requires that audit records be created, retained, and reviewed — applicable to every inference an AI system generates when it touches CUI. System and Information Integrity (SI) requires that security alerts and advisories be addressed and information system flaws be identified and corrected — applicable to model drift, hallucination, and output degradation in AI systems.
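
As a rough illustration of how those control families translate into AI-specific checkpoints, the sketch below maps each family to example checks an assessment team might track. The family identifiers are real; the individual checks and the helper function are hypothetical examples, not an official mapping.

```python
# Hypothetical mapping of NIST SP 800-171 control families to AI-system checks.
# The family codes are real; the check descriptions are illustrative only.
CONTROL_FAMILY_AI_CHECKS = {
    "AC": [  # Access Control
        "every model query is tied to an authenticated user identity",
        "output delivery is restricted to users authorized for the underlying CUI",
    ],
    "AU": [  # Audit and Accountability
        "an audit record is created for every inference that touches CUI",
        "audit records are retained and periodically reviewed",
    ],
    "SI": [  # System and Information Integrity
        "model outputs are compared against a validated baseline for drift",
        "hallucinations and output defects are tracked and corrected as system flaws",
    ],
}

def coverage_gaps(implemented: set[str]) -> dict[str, list[str]]:
    """Return, per control family, the checks the AI system does not yet implement."""
    return {family: [c for c in checks if c not in implemented]
            for family, checks in CONTROL_FAMILY_AI_CHECKS.items()}
```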

Where AI Tools Create New CUI Touchpoints

A Retrieval-Augmented Generation (RAG) system — a method that gives an AI model access to external data sources so it can ground its responses in specific information rather than relying solely on training data — indexing technical manuals creates a CUI touchpoint at the vector database. A predictive maintenance model analyzing production data that includes controlled specifications creates a CUI touchpoint at the data ingestion layer. A large language model (LLM) summarizing support tickets that reference controlled program details creates a CUI touchpoint at the inference layer. Each of these touchpoints expands the assessment boundary in ways the original CMMC evaluation may not have anticipated.
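
To make the vector-database touchpoint concrete, here is a minimal sketch in which classification travels with each indexed chunk so the retriever can enforce the CUI boundary before anything reaches an uncontrolled output channel. The schema, field names, and in-memory index are hypothetical stand-ins for a real vector store.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndexedChunk:
    """A chunk as it might sit in a RAG vector store (hypothetical schema)."""
    chunk_id: str
    source_document: str
    classification: str  # e.g. "UNCONTROLLED" or "CUI", tagged at indexing time
    text: str

def retrieve(index: list[IndexedChunk], query_terms: set[str],
             channel_cleared_for_cui: bool) -> list[IndexedChunk]:
    """Naive keyword retrieval with boundary enforcement at the touchpoint.

    A production system would use embedding similarity, but the enforcement point
    is the same: the retriever refuses to hand CUI-tagged chunks to a channel
    that is not cleared to receive them.
    """
    hits = [c for c in index if query_terms & set(c.text.lower().split())]
    if channel_cleared_for_cui:
        return hits
    return [c for c in hits if c.classification != "CUI"]
```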

The Gap Between Current CMMC Posture and AI-Inclusive CMMC Posture

The critical question for most mid-market defense manufacturers is not 'are we CMMC compliant?' but rather 'does our CMMC compliance account for our AI systems?' For organizations that achieved or are pursuing CMMC Level 2 certification, the assessment boundary was drawn around their information systems as they existed at the time of evaluation. AI tools deployed, expanded, or anticipated after that assessment — or tools whose data inputs have expanded to include CUI since the assessment — may sit outside the documented compliance boundary. The gap between where compliance documentation says CUI flows and where it actually flows through AI systems represents unacknowledged risk that either constrains AI adoption or grows with each new AI deployment.

THE ARCHITECTURAL BLUEPRINT

Designing an Audit-Ready AI Architecture

An AI audit trail that satisfies CMMC Level 2 requirements is an architectural feature embedded across the inference pipeline — not a reporting dashboard assembled after the fact. The distinction matters because audit trail data must be captured at multiple points throughout the inference lifecycle: certain elements are resolved and logged before inference generation even begins, while others are recorded at the moment the output is produced and as it moves through downstream approval workflows. Source data lineage, user identity, and model version must be resolved at the point the query is initiated — before the model processes the request — because these elements establish the preconditions under which the inference will operate. Confidence scores and human approval records are captured at or after output generation. Post-hoc reconstruction from application logs produces gaps, approximations, and defensibility problems under audit scrutiny.

The Six Metadata Elements Every AI Inference Must Capture

For any AI inference that touches CUI, the system architecture should capture at least six metadata elements:

  1. Source data lineage. Which specific data inputs — documents, database records, sensor feeds — contributed to this inference? Traceability back to source documents is foundational for both CMMC Audit and Accountability (AU) controls and the reproducibility requirements that Section 1513 will formalize.
  2. Model version. Which specific version of the model — including fine-tuning iterations, prompt templates, or retrieval configurations — generated this output? Model versioning enables organizations to reconstruct the conditions under which a particular decision was made.
  3. Timestamp. When was the inference generated? Timestamping sounds trivial; it becomes critical when an auditor asks whether a particular output was generated before or after a model update that changed the system's behavior.
  4. User identity. Which authenticated user initiated the query and received the output? NIST SP 800-171's Access Control family requires that system access be traceable to individual identities — a requirement that extends to AI system interaction.
  5. Confidence score. What was the model's self-assessed reliability for this output? Confidence scoring creates a documented basis for human review decisions: outputs below a defined threshold trigger mandatory human validation.
  6. Human approval authority. For outputs that inform safety-critical or compliance-sensitive decisions, which authorized individual reviewed and approved the output before it entered an operational workflow?

Additional metadata elements that organizations may consider depending on system complexity and risk profile include data classification tagging, prompt or query content logging, output disposition records, environmental context, and retrieval source identification.
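
One way to make the six core elements concrete is a structured record attached to every inference. The sketch below is a minimal illustration with hypothetical field names, not a prescribed schema; the optional elements above would extend it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class InferenceAuditRecord:
    """The six core metadata elements for one AI inference that touches CUI."""
    source_data_lineage: tuple[str, ...]  # identifiers of documents, records, or feeds used
    model_version: str                    # model + fine-tune + prompt/retrieval configuration
    timestamp: str                        # when the inference was generated (UTC, ISO 8601)
    user_identity: str                    # authenticated user who initiated the query
    confidence_score: float               # model's self-assessed reliability, 0.0 to 1.0
    human_approval_authority: str | None  # approver for safety- or compliance-critical outputs

def new_record(lineage: list[str], model_version: str, user: str,
               confidence: float, approver: str | None = None) -> InferenceAuditRecord:
    """Convenience constructor that stamps the record at creation time."""
    return InferenceAuditRecord(
        source_data_lineage=tuple(lineage),
        model_version=model_version,
        timestamp=datetime.now(timezone.utc).isoformat(),
        user_identity=user,
        confidence_score=confidence,
        human_approval_authority=approver,
    )
```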

How to Build an AI Audit Trail That Satisfies CMMC Level 2

Satisfying CMMC Level 2 audit requirements for AI systems means designing the metadata capture layer as an integral component of the inference pipeline — not as an optional logging feature. Architecturally, this involves intercepting the inference at initiation, following it through generation, capturing and attaching the six core metadata elements as structured records, and writing those records to a tamper-evident audit store with retention policies aligned to NIST AU-2 (Audit Events) and AU-3 (Content of Audit Records). The audit store itself must meet the same CMMC controls as any other CUI-handling system: access-controlled, encrypted, and subject to regular review.
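
A tamper-evident store does not require exotic tooling. One common pattern is an append-only log in which each entry embeds a hash of its predecessor, so altering any earlier record breaks verification of every later one. The sketch below is a simplified illustration of that pattern, not a complete implementation of the AU control requirements.

```python
import hashlib
import json

class HashChainedAuditStore:
    """Append-only audit log; each entry carries the hash of the previous entry."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, record: dict) -> str:
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "GENESIS"
        body = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        self.entries.append({"record": record,
                             "prev_hash": prev_hash,
                             "entry_hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; False means an entry was altered, reordered, or removed."""
        prev_hash = "GENESIS"
        for entry in self.entries:
            body = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + body).encode()).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["entry_hash"] != expected:
                return False
            prev_hash = entry["entry_hash"]
        return True
```

The chaining only makes tampering detectable; the store itself still needs the access controls, encryption, and retention policies described above.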

Model Versioning, Data Lineage, and Human Approval Authority

Three of the six elements warrant additional depth. Model versioning in AI systems is more complex than traditional software versioning because model behavior can change not only through code updates but through changes in training data, retrieval corpora, and prompt configurations. Effective versioning tracks all three dimensions.
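
Because behavior can shift along any of those three dimensions, one practical convention, sketched below with hypothetical inputs, is to derive a single version fingerprint from the model artifact, the training-data or retrieval-corpus snapshot, and the prompt configuration, so that a change in any one of them yields a new version identifier.

```python
import hashlib

def model_version_fingerprint(model_artifact_hash: str,
                              corpus_snapshot_hash: str,
                              prompt_template: str) -> str:
    """Combine the three dimensions that can change model behavior into one version ID."""
    combined = "|".join([
        model_artifact_hash,                                   # weights / fine-tune checkpoint
        corpus_snapshot_hash,                                  # training data or retrieval corpus
        hashlib.sha256(prompt_template.encode()).hexdigest(),  # prompt / retrieval configuration
    ])
    return hashlib.sha256(combined.encode()).hexdigest()[:16]

# Changing only the prompt template still produces a new fingerprint.
v1 = model_version_fingerprint("abc123", "corpus-2025-10", "Summarize the maintenance log.")
v2 = model_version_fingerprint("abc123", "corpus-2025-10", "Summarize the maintenance log briefly.")
assert v1 != v2
```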

Data lineage for AI systems extends beyond conventional data flow mapping because RAG architectures, fine-tuning pipelines, and multi-source ingestion systems create branching data paths that may intermingle controlled and uncontrolled information. Lineage tracking must capture the classification status of each data source at the point of ingestion — a requirement with direct implications for organizations handling International Traffic in Arms Regulations (ITAR)-controlled technical data, where the constraints on AI data processing carry additional export control obligations.
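
A minimal way to meet that requirement, again with hypothetical field names, is to stamp every source with its classification status and export-control flags at the moment it enters the pipeline, and to summarize those stamps for any inference the source later feeds.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IngestedSource:
    """Classification captured when a data source enters the AI pipeline."""
    source_id: str
    origin: str            # e.g. "technical data package", "maintenance log", "sensor feed"
    classification: str    # e.g. "UNCONTROLLED" or "CUI"
    itar_controlled: bool  # export-control flag captured at ingestion

def lineage_summary(sources: list[IngestedSource]) -> dict:
    """Summarize the classification of everything feeding a downstream inference."""
    return {
        "contains_cui": any(s.classification == "CUI" for s in sources),
        "contains_itar": any(s.itar_controlled for s in sources),
        "source_ids": [s.source_id for s in sources],
    }
```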

Human approval authority is the governance mechanism that translates AI output into organizational action. Documenting which individual held approval authority for which output establishes the accountability chain that auditors — whether DCMA assessors, prime contractor quality teams, or third-party CMMC assessors — will evaluate.

Audit Trail as Reporting Versus Audit Trail as Architecture

The difference between these two approaches defines the difference between governance that survives scrutiny and governance that collapses under pressure. A reporting-layer audit trail sits outside the inference pipeline. It assembles log data after the fact, often from multiple disconnected sources, and presents it in a dashboard format. This approach is fragile: if any log source fails, the trail has gaps. If the reporting system is configured incorrectly, the trail is inaccurate. If the underlying logs are not tamper-evident, the trail is challengeable.

An architectural audit trail is embedded in the inference pipeline itself. Every inference passes through the metadata capture layer before reaching the output interface. Gaps become structurally impossible because metadata attachment is a precondition for output delivery. This is the approach that builds a defensible AI audit trail for CMMC — and the approach positioned to satisfy whatever specific provisions the Section 1513 framework ultimately mandates.
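
Here is a sketch of that structural guarantee, under the same assumptions as the earlier examples (the callable model, the threshold value, and the function names are hypothetical). The wrapper writes the audit record before it releases the output, and refuses low-confidence output that lacks a named human approver.

```python
from datetime import datetime, timezone

CONFIDENCE_THRESHOLD = 0.80  # hypothetical policy value; below it, human review is mandatory

def run_governed_inference(model, query: str, user: str, lineage: list[str],
                           model_version: str, audit_store: list,
                           approver: str | None = None) -> str:
    """Metadata capture as a precondition for output delivery.

    `model` is any callable returning (output_text, confidence). The audit record
    is appended before the output is returned to the caller, so an un-logged
    inference cannot reach a user.
    """
    output, confidence = model(query)

    if confidence < CONFIDENCE_THRESHOLD and approver is None:
        raise PermissionError("Low-confidence output requires a named human approver.")

    audit_store.append({
        "source_data_lineage": list(lineage),
        "model_version": model_version,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_identity": user,
        "confidence_score": confidence,
        "human_approval_authority": approver,
    })  # in production, this would be the tamper-evident store described earlier
    return output

# Usage with a stand-in model:
store: list = []
def fake_model(q):
    return "Replace spindle bearing within 200 operating hours.", 0.91

print(run_governed_inference(fake_model, "next maintenance action?", "jdoe",
                             ["doc-114", "vib-feed-7"], "v2025.10-a1b2", store))
```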

THE WATCHDOG

Continuous Compliance Monitoring for AI Systems

AI compliance monitoring for manufacturing environments requires a fundamentally different cadence than traditional information system compliance. Conventional compliance monitoring operates on audit cycles — annual assessments, quarterly reviews, periodic vulnerability scans. These cadences were designed for systems whose behavior is deterministic and whose configurations change through managed change-control processes.

Why Periodic Audits Fail for Continuously Generating Systems

AI systems do not behave deterministically. A model that performs reliably in January may exhibit drift by April — producing subtly degraded outputs as the statistical distribution of its inputs shifts relative to its training data. A RAG system that correctly handles CUI boundaries in March may develop a leakage pathway in June when a new data source is added to its retrieval knowledge base. A prompt injection vulnerability may be exploited in month seven and remain undiscovered until the next annual assessment in month twelve. The risk velocity of AI systems exceeds the detection cadence of periodic audits.

Real-Time Monitoring for Model Drift, Data Leakage, and Access Anomalies

Continuous monitoring for AI systems encompasses three domains. First, model drift detection: automated statistical comparison of current model outputs against validated baselines, flagging significant distributional shifts that may indicate degraded performance or changed behavior. Second, data leakage monitoring: scanning AI outputs, tool calls, and retrieval pathways for CUI markers, classification indicators, and controlled terminology that should not appear in uncontrolled output channels. Third, access anomaly detection: identifying unusual patterns in who queries the AI system, when, and with what types of inputs — patterns that may indicate unauthorized use, prompt injection attempts, or credential compromise.
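
The sketches below illustrate the first two domains with deliberately simple techniques: a symmetric divergence score over output categories for drift, and a marker scan for leakage into uncontrolled channels. The thresholds, marker list, and function names are hypothetical; a production system would use validated statistical baselines and the organization's own CUI marking conventions.

```python
import math
import re
from collections import Counter

def drift_score(baseline: list[str], current: list[str]) -> float:
    """Crude distributional drift: symmetric KL-style divergence over output labels.

    `baseline` and `current` are categorical outputs (e.g. predicted failure modes);
    a rising score flags a shift that warrants human investigation.
    """
    categories = set(baseline) | set(current)

    def dist(labels: list[str]) -> dict[str, float]:
        counts = Counter(labels)
        return {c: (counts[c] + 1) / (len(labels) + len(categories)) for c in categories}

    p, q = dist(baseline), dist(current)
    return sum((p[c] - q[c]) * math.log(p[c] / q[c]) for c in categories)

CUI_MARKERS = re.compile(r"\b(CUI|EXPORT CONTROLLED|CONTROLLED|ITAR)\b", re.IGNORECASE)

def leaked_markers(output_text: str) -> list[str]:
    """Scan an output bound for an uncontrolled channel for controlled-marking terms."""
    return CUI_MARKERS.findall(output_text)

print(drift_score(["bearing", "bearing", "belt"], ["belt", "belt", "belt"]))
print(leaked_markers("Per the CUI technical data package, tolerance is 0.002 in."))
```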

The Governance & Compliance Monitoring Dimension

These continuous monitoring functions are not supplementary features to be considered after the core AI system is built. They represent a mandatory functional dimension of any defense manufacturing AI platform. Within Inflectis's 5×5 architectural framework, Governance & Compliance Monitoring operates as one of five functional dimensions that must be addressed across the entire technology stack — from infrastructure through data, models, agents, and applications. The functions within this dimension include proprietary information sanitization, safety filtering, explainability generation, and drift monitoring. Organizations that defer this dimension to a later implementation phase are building AI systems that are structurally incapable of demonstrating compliance until they are re-architected — the same retrofit problem that makes waiting for Section 1513 so costly.

THE FORWARD READ

Interpreting Section 1513 Before the Framework Arrives

Defense contractors do not need to wait for the published Section 1513 framework to begin building governance architecture. The direction of NDAA Section 1513 AI requirements is interpretable from three converging sources: the legislative text itself, existing DoW AI strategy, and established NIST frameworks that will almost certainly inform the final standards.

Principles Knowable from the Legislative Text and Existing NIST Frameworks

Section 1513's mandate to develop AI-specific cybersecurity requirements, combined with its directive to integrate those requirements into CMMC and DFARS, signals that the resulting framework will extend — not replace — existing NIST-based controls. The NIST AI Risk Management Framework (AI RMF 1.0), published January 2023, provides the most probable structural influence on the forthcoming standards. The AI RMF organizes AI risk management around four functions: Govern, Map, Measure, and Manage. Across these functions, the framework emphasizes principles that defense manufacturers can build against now:

  • Traceability: Every AI decision must be traceable to its data inputs, model configuration, and human oversight. This principle is already embedded in the six metadata elements described above.
  • Human oversight: AI systems must incorporate mechanisms for human review, intervention, and override — particularly for decisions with safety or compliance implications.
  • Risk proportionality: Governance rigor should scale with the consequences of AI system failure. A model that schedules preventive maintenance carries different governance requirements than a model that classifies ITAR-controlled data.
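
Risk proportionality can be expressed directly in configuration. The tiers, systems, and required controls below are hypothetical placeholders, not drawn from the AI RMF text; the point is that governance rigor is declared per system rather than applied uniformly.

```python
# Hypothetical risk-tier policy: governance rigor scales with failure consequence.
RISK_TIERS = {
    "low":      {"human_approval_required": False, "monitoring": "weekly drift review"},
    "moderate": {"human_approval_required": True,  "monitoring": "daily drift and leakage scans"},
    "high":     {"human_approval_required": True,  "monitoring": "continuous, with alerting"},
}

SYSTEM_TIER = {
    "preventive_maintenance_scheduler": "moderate",
    "itar_data_classifier": "high",
}

def required_controls(system_name: str) -> dict:
    """Look up the governance controls a given AI system must implement."""
    return RISK_TIERS[SYSTEM_TIER[system_name]]

print(required_controls("itar_data_classifier"))
```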

What DoW AI Strategy Signals About Direction

The Department of War Artificial Intelligence Strategy emphasizes responsible AI principles, data-centric approaches, and assured performance in operational contexts. For defense subcontractors, the strategic emphasis on 'assured performance' is significant — it signals that the DoW expects contractors to demonstrate not just that their AI systems work, but that they work reliably, traceably, and within documented parameters. That expectation aligns with continuous monitoring and audit trail architecture, not with periodic assessments of static configurations.

Building for Adaptability

An AI governance architecture built on traceability, human oversight, continuous monitoring, and risk-proportional controls will satisfy the principles underlying the Section 1513 framework regardless of the specific provisions it contains — because those principles are what the framework is codifying. Organizations that build to these principles create governance infrastructure that absorbs new requirements through configuration adjustments rather than re-engineering. That adaptability is the structural advantage of acting before the framework is published rather than after.

THE STRATEGIC EDGE

Governance as Competitive Advantage in Contract Competition

Audit-ready AI governance for defense manufacturing evolves from a compliance cost into a competitive differentiator when it is visible to the organizations that influence contract awards.

How Audit-Ready AI Governance Differentiates in Prime-Subcontractor Relationships

Prime contractors bear flow-down responsibility for subcontractor compliance. A subcontractor that can demonstrate a documented, audit-ready AI governance program — with traceable inference chains, continuous monitoring, and human-in-the-lead validation — de-risks the prime's compliance exposure. In a competitive bid where two subcontractors offer comparable technical capability and pricing, the one that provides evidence of governed AI operations reduces the prime's audit burden and contractual risk. That reduction translates into supplier preference.

The Trust Signal Compliance Posture Sends

As AI capability becomes a more prominent factor in defense contract evaluation, DoW customers and their assessment organizations will increasingly distinguish between contractors who can prove their AI systems operate within documented, auditable parameters and those who cannot. Governance posture becomes a trust signal — evidence that the organization treats AI as a managed enterprise capability rather than an uncontrolled collection of departmental experiments.

From Cost Center to Contract-Winning Capability

The strategic roadmap for mid-market defense subcontractors must position governance not as a line item in the compliance budget but as a component of the organization's competitive value proposition. The investment in AI audit trails, continuous monitoring, and human oversight architecture generates returns not only through risk avoidance — by preventing compliance failures and their associated costs — but through revenue capture, by strengthening the organization's position in contract competitions where AI governance maturity is evaluated. The manufacturer that builds governance now is not preparing for a regulation. It is building a capability that competitors who defer will spend years attempting to replicate.

Frequently Asked Questions

Do we need AI governance if we're only running a single AI pilot?
If that pilot processes, stores, or transmits CUI, it falls within the CMMC assessment boundary under current requirements — the 110 NIST SP 800-171 Rev 2 controls apply regardless of whether the system is labeled "pilot" or "production." The forthcoming Section 1513 framework will add AI-specific requirements on top of that existing baseline. Governance obligations are triggered by what the system touches, not by what the organization calls it.
What AI governance documentation should we start building now before the Section 1513 framework is finalized?
The highest-leverage starting point is architecting the system to tag every AI interaction with six metadata elements: source data lineage, model version, timestamp, user identity, confidence score, and human approval authority. This metadata structure satisfies current CMMC Level 2 audit requirements for systems processing CUI and anticipates the traceability standards that Section 1513 will formalize based on the direction established in the legislative text and the NIST AI Risk Management Framework.
Will the NDAA Section 1513 framework apply to all defense subcontractors, or only primes?
Section 1513 directs the DoW to incorporate AI security requirements into CMMC and DFARS — both of which utilize flow-down contract clauses that extend compliance obligations from primes to subcontractors. The scope of applicability will follow these established flow-down mechanisms, meaning any subcontractor handling CUI and deploying AI systems should anticipate inclusion in the framework's requirements.
How is AI governance different from regular CMMC compliance?
CMMC Level 2's 110 NIST SP 800-171 controls address information system security broadly — access management, incident response, configuration management, and related domains. AI governance adds requirements specific to AI system characteristics: model versioning and change tracking, inference-level traceability, data lineage through training and retrieval pipelines, human-in-the-lead validation for safety-critical or compliance-sensitive outputs, and continuous monitoring for model drift. Section 1513 will formalize these AI-specific governance layers on top of the existing CMMC baseline rather than replacing it.
Can we retrofit AI governance into systems that are already deployed?
Retrofitting is technically possible but substantially more expensive and disruptive than building governance from inception. Audit trail metadata capture must be embedded at the inference layer — the point where the AI system generates its outputs — to be reliable and tamper-resistant. Adding this capability to a production system requires modifying inference pipelines, revalidating data integrations, retraining operations staff on new workflows, and potentially re-architecting access controls. Building governance into the initial architecture typically represents a fraction of total deployment cost, while retrofitting can approach or exceed the original deployment investment depending on system complexity.