
Scott Trunkett • April 28, 2026 • 20 min read

A $140M defense subcontractor runs an AI-powered predictive maintenance model on CNC equipment across two production facilities. The pilot worked. Production leadership approved expansion. The model now ingests machine vibration data, maintenance logs, and — because the engineering team found it improved accuracy — segments of technical data packages that include Controlled Unclassified Information (CUI). No one has documented which data feeds contain CUI markings. No one logs which employees access the model's outputs. No version control tracks which iteration of the model generated which recommendation. And no human approval gate exists between the model's maintenance predictions and the work orders that follow.
The AI works. The governance is weak, if it exists at all.
This scenario is not hypothetical — it reflects a pattern emerging across the defense industrial base as manufacturers adopt AI tools faster than they establish the oversight structures those tools require. The FY2026 National Defense Authorization Act (NDAA), specifically Section 1513, directs the Department of War (DoW) to develop cybersecurity and physical security requirements for artificial intelligence and machine learning technologies used in defense contracting, with integration into both the Cybersecurity Maturity Model Certification (CMMC) program and the Defense Federal Acquisition Regulation Supplement (DFARS). The framework hasn't been published. But the direction is unambiguous — and the manufacturers who build AI governance architecture now, before the rules arrive, will satisfy those requirements at a fraction of the cost and disruption facing organizations that wait.
This article examines why AI governance for defense manufacturing has become an immediate operational requirement rather than a future compliance exercise, what the current regulatory landscape demands, and how mid-market subcontractors can build audit-ready AI systems that position them for both compliance and competitive advantage.
The primary driver of the AI governance gap in defense manufacturing is structural: the pressure to capture operational value from AI tools has outpaced the development of oversight mechanisms designed for those tools. This is not negligence — it is the predictable consequence of a technology adoption curve moving far faster than institutional governance cycles.
BCG research found that 65% of aerospace and defense AI efforts remain in proof-of-concept. The initiatives that do advance beyond pilot stage often do so because a business unit champion pushed them forward — securing budget approval, demonstrating results, and building momentum. That momentum, in organizations without centralized AI architecture, frequently bypasses the governance conversation entirely. The pilot had no audit trail because pilots rarely do. When the pilot becomes a production system, the absence of critical security features carries forward. The governance vacuum that was acceptable for a three-month experiment becomes a compliance liability when the system progresses toward deployment readiness, where the ingestion of CUI and other security-constrained information may be required.
The pattern mirrors the broader pilot-to-production failure dynamic: organizations that treat AI deployment as a series of disconnected projects rather than components of enterprise architecture reproduce the same structural gaps at every stage of expansion.
Ungoverned AI is not necessarily malfunctioning AI. The predictive maintenance model in the opening scenario may produce acceptable results internally. The problem is that its reliability cannot be proven to an auditor, to a prime contractor, or to the DoW. Ungoverned AI in a defense manufacturing context cannot be considered production-ready, regardless of how well it performs. Without the appropriate governance elements incorporated into the environment, the liabilities are significant: undocumented CUI data feeds, unlogged access to model outputs, untracked model versions, and missing human approval gates.
Each of those ungoverned conditions creates a potential CUI touchpoint that may not appear in the organization's existing CMMC assessment boundary. When a compliance team last mapped their CUI data flows, the AI system may not have existed — or it may have been a sandboxed pilot with no production data access. The expansion from pilot to production frequently introduces CUI into systems that were never evaluated for CMMC compliance, creating exposure that compounds with every new data integration, every model update, and every additional user granted access.
Section 1513 of the FY2026 NDAA directs the Secretary of War to develop cybersecurity and physical security requirements specifically for AI and machine learning technologies used in defense contracting, and to incorporate those requirements into CMMC and DFARS. This is not advisory language. It is a Congressional mandate with a status update due in June 2026.
The critical NDAA Section 1513 AI requirements for defense manufacturers are threefold: the development of cybersecurity and physical security requirements specific to AI and machine learning technologies used in defense contracting, the incorporation of those requirements into CMMC and DFARS, and a status update to Congress due in June 2026.
The status update will provide clarity on the framework's direction, scope, and anticipated timeline for incorporation into CMMC and DFARS. It should not, however, be treated as the starting point for action. Organizations that treat June 2026 as the signal to begin governance planning will find themselves designing architecture under time pressure against a moving regulatory target — the worst conditions for sound engineering decisions.
The argument for waiting — 'let's see what the specific rules require before we invest' — mischaracterizes the nature of the cost. Every month an AI system operates without intentionally robust governance architecture, it generates inferences that cannot be retroactively traced, processes data through flows that cannot be retroactively logged, and makes decisions through pathways that cannot be retroactively documented. This is not a theoretical risk that materializes only when an auditor arrives. It is architectural debt — the accumulated gap between what a system should have been designed to do and what it actually does — and the cost of closing that gap grows with every additional month of ungoverned operation.
Retrofitting audit trail metadata capture into a production AI system requires modifying inference pipelines, revalidating data integrations, retraining operations teams, and potentially re-architecting access controls. Building those capabilities from inception costs a fraction of the retrofit.
Deferring the use of CUI in AI-driven business solutions simply to avoid the governance infrastructure challenge is merely another form of waiting. The question is not whether the investment is premature. The question is whether the organization can afford the compounding cost of delay while competitors build governed AI capabilities that compound their advantage.
AI governance in defense manufacturing is not solely a forward-looking obligation created by Section 1513. Current CMMC Level 2 requirements, grounded in the 110 security controls of NIST Special Publication 800-171 Revision 2 (NIST SP 800-171 Rev 2), apply to any information system that processes, stores, or transmits CUI. The moment an AI tool ingests, analyzes, or generates output from CUI, it enters the CMMC assessment boundary.
Several control families within NIST SP 800-171 carry direct implications for AI systems. Access Control (AC) requires that system access be limited to authorized users and that access enforcement mechanisms be in place — applicable to who queries an AI model and who receives its outputs. Audit and Accountability (AU) requires that audit records be created, retained, and reviewed — applicable to every inference an AI system generates when it touches CUI. System and Information Integrity (SI) requires that security alerts and advisories be addressed and information system flaws be identified and corrected — applicable to model drift, hallucination, and output degradation in AI systems.
A Retrieval-Augmented Generation (RAG) system — a method that gives an AI model access to external data sources so it can ground its responses in specific information rather than relying solely on training data — indexing technical manuals creates a CUI touchpoint at the vector database. A predictive maintenance model analyzing production data that includes controlled specifications creates a CUI touchpoint at the data ingestion layer. A large language model (LLM) summarizing support tickets that reference controlled program details creates a CUI touchpoint at the inference layer. Each of these touchpoints expands the assessment boundary in ways the original CMMC evaluation may not have anticipated.
The critical question for most mid-market defense manufacturers is not 'are we CMMC compliant?' but rather 'does our CMMC compliance account for our AI systems?' For organizations that achieved or are pursuing CMMC Level 2 certification, the assessment boundary was drawn around their information systems as they existed at the time of evaluation. AI tools deployed, expanded, or anticipated after that assessment — or tools whose data inputs have expanded to include CUI since the assessment — may sit outside the documented compliance boundary. The gap between where compliance documentation says CUI flows and where it actually flows through AI systems represents unacknowledged risk that either constrains AI adoption or grows with each new AI deployment.
An AI audit trail that satisfies CMMC Level 2 requirements is an architectural feature embedded across the inference pipeline — not a reporting dashboard assembled after the fact. The distinction matters because audit trail data must be captured at multiple points throughout the inference lifecycle: certain elements are resolved and logged before inference generation even begins, while others are recorded at the moment the output is produced and as it moves through downstream approval workflows. Source data lineage, user identity, and model version must be resolved at the point the query is initiated — before the model processes the request — because these elements establish the preconditions under which the inference will operate. Confidence scores and human approval records are captured at or after output generation. Post-hoc reconstruction from application logs produces gaps, approximations, and defensibility problems under audit scrutiny.
For any AI inference that touches CUI, the system architecture should capture at least six metadata elements, including the source data lineage, user identity, and model version resolved before the request is processed, and the confidence scores and human approval records captured at or after output generation.
Additional metadata elements that organizations may consider depending on system complexity and risk profile include data classification tagging, prompt or query content logging, output disposition records, environmental context, and retrieval source identification.
Satisfying CMMC Level 2 audit requirements for AI systems means designing the metadata capture layer as an integral component of the inference pipeline — not as an optional logging feature. Architecturally, this involves intercepting the inference at initiation, following it through generation, capturing and attaching the six core metadata elements as structured records, and writing those records to a tamper-evident audit store with retention policies aligned to NIST AU-2 (Audit Events) and AU-3 (Content of Audit Records). The audit store itself must meet the same CMMC controls as any other CUI-handling system: access-controlled, encrypted, and subject to regular review.
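To make that design concrete, the following is a minimal sketch in Python of a metadata capture layer wrapped around an inference call. It is illustrative only: the record fields, the hash-chained TamperEvidentAuditStore, and the model interface (model.version, model.resolve_sources, output.confidence) are assumptions made for the sketch, not a prescribed implementation. A production system would integrate with the organization's actual identity provider, model registry, and CMMC-controlled storage.

```python
import hashlib
import json
import uuid
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional

@dataclass
class AuditRecord:
    """Structured metadata attached to a single inference that touches CUI."""
    inference_id: str
    timestamp_utc: str
    user_identity: str                     # who initiated the query (resolved pre-inference)
    model_version: str                     # composite model/prompt/corpus identifier (pre-inference)
    source_data_lineage: list              # sources consulted and their classification (pre-inference)
    confidence_score: Optional[float] = None   # captured at output generation
    human_approval: Optional[dict] = None      # approver identity and decision, attached downstream

class TamperEvidentAuditStore:
    """Append-only store; each entry is hash-chained to the previous one so
    silent modification or deletion is detectable on review (AU-2 / AU-3 alignment)."""
    def __init__(self):
        self._entries = []
        self._last_hash = "0" * 64

    def append(self, record: AuditRecord) -> str:
        payload = json.dumps(asdict(record), sort_keys=True, default=str)
        chain_hash = hashlib.sha256((self._last_hash + payload).encode()).hexdigest()
        self._entries.append({"record": asdict(record), "chain_hash": chain_hash})
        self._last_hash = chain_hash
        return chain_hash

def governed_inference(query: str, user_identity: str, model, store: TamperEvidentAuditStore):
    """Wrap an inference so metadata capture is a precondition of output delivery."""
    record = AuditRecord(
        inference_id=str(uuid.uuid4()),
        timestamp_utc=datetime.now(timezone.utc).isoformat(),
        user_identity=user_identity,
        model_version=model.version,                        # resolved before the model runs
        source_data_lineage=model.resolve_sources(query),   # resolved before the model runs
    )
    output = model.generate(query)                          # the inference itself
    record.confidence_score = output.confidence             # captured at generation
    store.append(record)                                    # no audit write, no output released
    return output, record
```

The ordering matters: the record is constructed before the model runs and written before the output is returned, which is what makes the capture a structural precondition rather than an after-the-fact report.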
Three of the six elements warrant additional depth. Model versioning in AI systems is more complex than traditional software versioning because model behavior can change not only through code updates but through changes in training data, retrieval corpora, and prompt configurations. Effective versioning tracks all three dimensions.
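As a hedged illustration of that versioning idea, a composite version record might bind the weights release to the other dimensions the paragraph names. The field names and hashing scheme below are assumptions for the sketch, not a standard schema.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelVersion:
    """Behavior can shift through any of these dimensions, so all of them are
    versioned together, not just the code or weights release."""
    weights_release: str            # registry tag for the model code/weights
    training_data_snapshot: str     # identifier or hash of the training / fine-tuning set
    retrieval_corpus_revision: str  # revision of the RAG knowledge base, if any
    prompt_config_hash: str         # hash of the system prompts and templates in effect

    def identifier(self) -> str:
        """Single stable string suitable for stamping onto every audit record."""
        joined = "|".join((self.weights_release, self.training_data_snapshot,
                           self.retrieval_corpus_revision, self.prompt_config_hash))
        return hashlib.sha256(joined.encode()).hexdigest()[:16]
```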
Data lineage for AI systems extends beyond conventional data flow mapping because RAG architectures, fine-tuning pipelines, and multi-source ingestion systems create branching data paths that may intermingle controlled and uncontrolled information. Lineage tracking must capture the classification status of each data source at the point of ingestion — a requirement with direct implications for organizations handling International Traffic in Arms Regulations (ITAR)-controlled technical data, where the constraints on AI data processing carry additional export control obligations.
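One way to implement that requirement, sketched here under the assumption of a simple append-only lineage log, is to record the classification status of every source as it is ingested rather than reconstructing it later. The marking strings and stage names are illustrative placeholders, not the organization's actual marking conventions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class SourceLineage:
    """Classification recorded at the point of ingestion, not inferred afterward."""
    source_id: str        # document, feed, or table identifier
    classification: str   # e.g. "CUI" or "UNCONTROLLED"; ITAR-controlled data would carry its own tag
    pipeline_stage: str   # where it entered: e.g. "rag_index", "fine_tune", "feature_store"
    ingested_at: str

def ingest_with_lineage(source_id: str, classification: str, stage: str,
                        lineage_log: list) -> SourceLineage:
    """Tag a source with its classification before it enters any AI pipeline, so
    downstream inferences can cite these records in their audit metadata."""
    entry = SourceLineage(
        source_id=source_id,
        classification=classification,
        pipeline_stage=stage,
        ingested_at=datetime.now(timezone.utc).isoformat(),
    )
    lineage_log.append(entry)
    return entry
```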
Human approval authority is the governance mechanism that translates AI output into organizational action. Documenting which individual held approval authority for which output establishes the accountability chain that auditors — whether DCMA assessors, prime contractor quality teams, or third-party CMMC assessors — will evaluate.
The difference between a reporting-layer audit trail and an architectural one defines the difference between governance that survives scrutiny and governance that collapses under pressure. A reporting-layer audit trail sits outside the inference pipeline. It assembles log data after the fact, often from multiple disconnected sources, and presents it in a dashboard format. This approach is fragile: if any log source fails, the trail has gaps. If the reporting system is configured incorrectly, the trail is inaccurate. If the underlying logs are not tamper-evident, the trail is challengeable.
An architectural audit trail is embedded in the inference pipeline itself. Every inference passes through the metadata capture layer before reaching the output interface. Gaps become structurally impossible because metadata attachment is a precondition for output delivery. This is the approach that builds a defensible AI audit trail for CMMC — and the approach positioned to satisfy whatever specific provisions the Section 1513 framework ultimately mandates.
AI compliance monitoring for manufacturing environments requires a fundamentally different cadence than traditional information system compliance. Conventional compliance monitoring operates on audit cycles — annual assessments, quarterly reviews, periodic vulnerability scans. These cadences were designed for systems whose behavior is deterministic and whose configurations change through managed change-control processes.
AI systems do not behave deterministically. A model that performs reliably in January may exhibit drift by April — producing subtly degraded outputs as the statistical distribution of its inputs shifts relative to its training data. A RAG system that correctly handles CUI boundaries in March may develop a leakage pathway in June when a new data source is added to its retrieval knowledge base. A prompt injection vulnerability may be exploited in month seven and remain undiscovered until the next annual assessment in month twelve. The risk velocity of AI systems exceeds the detection cadence of periodic audits.
Continuous monitoring for AI systems encompasses three domains. First, model drift detection: automated statistical comparison of current model outputs against validated baselines, flagging significant distributional shifts that may indicate degraded performance or changed behavior. Second, data leakage monitoring: scanning AI outputs, tool calls, and retrieval pathways for CUI markers, classification indicators, and controlled terminology that should not appear in uncontrolled output channels. Third, access anomaly detection: identifying unusual patterns in who queries the AI system, when, and with what types of inputs — patterns that may indicate unauthorized use, prompt injection attempts, or credential compromise.
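A minimal sketch of the first two domains follows, assuming outputs are plain text and that a numeric output metric such as a confidence score is available for drift comparison. The marker patterns and the roughly 0.2 PSI threshold are assumptions to be tuned per model, and access anomaly detection would typically draw on the audit records described earlier rather than on model outputs.

```python
import re
import numpy as np

# Illustrative marker patterns only; a real deployment would use the organization's
# actual CUI banner and portion-marking conventions.
CUI_MARKER_PATTERNS = [re.compile(p) for p in (r"\bCUI\b", r"CONTROLLED UNCLASSIFIED")]

def scan_output_for_cui(text: str) -> list:
    """Data leakage check: flag outputs that carry controlled markings into
    channels not approved for CUI."""
    return [p.pattern for p in CUI_MARKER_PATTERNS if p.search(text)]

def population_stability_index(baseline: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Drift signal: compare the distribution of an output metric (e.g. confidence
    scores) against a validated baseline. A PSI above roughly 0.2 is a common
    rule-of-thumb trigger for investigation; tune per model."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_pct = np.clip(np.histogram(baseline, bins=edges)[0] / len(baseline), 1e-6, None)
    curr_pct = np.clip(np.histogram(current, bins=edges)[0] / len(current), 1e-6, None)
    return float(np.sum((curr_pct - base_pct) * np.log(curr_pct / base_pct)))
```

In practice these checks run on a schedule measured in hours or days, not audit cycles, and their findings feed the same tamper-evident audit store as the inference records.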
These continuous monitoring functions are not supplementary features to be considered after the core AI system is built. They represent a mandatory functional dimension of any defense manufacturing AI platform. Within Inflectis's 5×5 architectural framework, Governance & Compliance Monitoring operates as one of five functional dimensions that must be addressed across the entire technology stack — from infrastructure through data, models, agents, and applications. The functions within this dimension include proprietary information sanitization, safety filtering, explainability generation, and drift monitoring. Organizations that defer this dimension to a later implementation phase are building AI systems that are structurally incapable of demonstrating compliance until they are re-architected — the same retrofit problem that makes waiting for Section 1513 so costly.
Defense contractors do not need to wait for the published Section 1513 framework to begin building governance architecture. The direction of NDAA Section 1513 AI requirements is interpretable from three converging sources: the legislative text itself, existing DoW AI strategy, and established NIST frameworks that will almost certainly inform the final standards.
Section 1513's mandate to develop AI-specific cybersecurity requirements, combined with its directive to integrate those requirements into CMMC and DFARS, signals that the resulting framework will extend — not replace — existing NIST-based controls. The NIST AI Risk Management Framework (AI RMF 1.0), published January 2023, provides the most probable structural influence on the forthcoming standards. The AI RMF organizes AI risk management around four functions: Govern, Map, Measure, and Manage. Each function emphasizes principles that defense manufacturers can build against now: traceability, human oversight, continuous monitoring, and controls proportional to risk.
The Department of War Artificial Intelligence Strategy emphasizes responsible AI principles, data-centric approaches, and assured performance in operational contexts. For defense subcontractors, the strategic emphasis on 'assured performance' is significant — it signals that the DoW expects contractors to demonstrate not just that their AI systems work, but that they work reliably, traceably, and within documented parameters. That expectation aligns with continuous monitoring and audit trail architecture, not with periodic assessments of static configurations.
An AI governance architecture built on traceability, human oversight, continuous monitoring, and risk-proportional controls will satisfy the principles underlying the Section 1513 framework regardless of the specific provisions it contains — because those principles are what the framework is codifying. Organizations that build to these principles create governance infrastructure that absorbs new requirements through configuration adjustments rather than re-engineering. That adaptability is the structural advantage of acting before the framework is published rather than after.
Audit-ready AI governance for defense manufacturing evolves from a compliance cost into a competitive differentiator when it is visible to the organizations that influence contract awards.
Prime contractors bear flow-down responsibility for subcontractor compliance. A subcontractor that can demonstrate a documented, audit-ready AI governance program — with traceable inference chains, continuous monitoring, and human-in-the-lead validation — de-risks the prime's compliance exposure. In a competitive bid where two subcontractors offer comparable technical capability and pricing, the one that provides evidence of governed AI operations reduces the prime's audit burden and contractual risk. That reduction translates into supplier preference.
As AI capability becomes a more prominent factor in defense contract evaluation, DoW customers and their assessment organizations will increasingly distinguish between contractors who can prove their AI systems operate within documented, auditable parameters and those who cannot. Governance posture becomes a trust signal — evidence that the organization treats AI as a managed enterprise capability rather than an uncontrolled collection of departmental experiments.
The strategic roadmap for mid-market defense subcontractors must position governance not as a line item in the compliance budget but as a component of the organization's competitive value proposition. The investment in AI audit trails, continuous monitoring, and human oversight architecture generates returns not only through risk avoidance — by preventing compliance failures and their associated costs — but through revenue capture, by strengthening the organization's position in contract competitions where AI governance maturity is evaluated. The manufacturer that builds governance now is not preparing for a regulation. It is building a capability that competitors who defer will spend years attempting to replicate.