A quick note before we get into it: nothing in this article is an endorsement of any specific product or vendor. I'm referencing real platforms because abstraction is a great way to avoid hard conversations, and we don't do that here. Authorization statuses change, pricing changes, capabilities change. Verify everything against current documentation before your organization makes any procurement decisions. Consider this analysis, not advice.
There's a moment in every compliance conversation where someone in the room says, "We're using AI to help with that." They say it the way you'd say you hired a contractor to fix the roof. Confident. Final. Slightly proud of themselves. Problem solved. Next agenda item.
The room moves on. The question doesn't.
Because I've been in that room. Many times. And every single time, nobody follows up with "great, so how does that AI tool interact with your CUI?" They just nod and update the project tracker. It's the cybersecurity equivalent of everyone in a horror movie agreeing the strange noise in the basement was probably just the cat. Spoiler: there is no cat.
When the Department of Defense formalized CMMC Level 2, they weren't inventing new requirements. They were finally enforcing existing ones, specifically the 110 controls drawn from NIST SP 800-171. Controls built to answer one fundamental question: where is your Controlled Unclassified Information (CUI), and who, or what, can touch it? For years, that question had predictable answers. Humans. Systems you owned. Networks you controlled. The boundary was hard. You could draw it on a whiteboard and everybody in the room could point to it. Those were simpler times, uncomfortable and expensive compliance-heavy times, but at least we all agreed on where the edge of the map was.
AI changed the geometry. And most organizations haven't updated their map.
Retrieval-Augmented Generation, what most practitioners are calling RAG, is now the architecture of choice when organizations want large language models to work with internal knowledge. Documents, case files, technical specifications, the stuff that actually matters. Instead of retraining the model on sensitive data, you build a retrieval layer. The model pulls from a curated knowledge base at query time, generates a response, returns the answer, and everyone applauds because it sounds like magic. And to be fair, it kind of is. It's also kind of a compliance minefield wearing a very convincing disguise.
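For readers who haven't seen the pattern up close, here is a deliberately minimal sketch of retrieval at query time. Everything in it is a stand-in: the toy bag-of-words "embedding," the two-document corpus, and the stubbed model call are illustrative only, since a real deployment would use a proper embedding model and an actual LLM endpoint. The point is the data flow, not the components.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words term counts. A real system would use a
    # sentence-embedding model; this stand-in keeps the sketch runnable.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# The "knowledge base": documents are embedded once, at index time.
corpus = {
    "doc-001": "subcontract clause for widget delivery schedule",
    "doc-002": "cafeteria menu for the month of march",
}
index = {doc_id: embed(text) for doc_id, text in corpus.items()}

def retrieve(query: str, k: int = 1) -> list[str]:
    # At query time, rank documents by similarity to the query embedding.
    q = embed(query)
    ranked = sorted(index, key=lambda d: cosine(q, index[d]), reverse=True)
    return ranked[:k]

def answer(query: str) -> str:
    # Retrieved chunks become context for the model prompt. The model call
    # is stubbed out; what matters here is what flows into it.
    context = " ".join(corpus[d] for d in retrieve(query))
    return f"[LLM response grounded in: {context}]"

print(answer("when is the widget delivery due?"))
```

Notice what just happened: the subcontract clause left the corpus and entered the model's context without anyone making an authorization decision. That silent hand-off is the compliance problem the rest of this article is about.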
The problem isn't the architecture. The architecture is genuinely clever. The problem is what the architecture touches, and whether anyone has bothered to map that contact to the 110 controls they're contractually obligated to enforce.
Think of CUI the way you'd think about the One Ring in Lord of the Rings. It doesn't matter how good your intentions are or how responsible you believe yourself to be. The moment it starts passing through hands you didn't plan for, through systems you didn't account for, you've got a problem that no amount of retroactive documentation is going to fix. Chain of custody isn't a metaphor in the CMMC world. It's a legal and regulatory obligation, and Sauron in this analogy is your C3PAO assessor. He sees everything. He is not charmed by your slide deck.
So when a RAG system retrieves a fragment of a subcontract clause marked CUI, feeds it into a language model, and synthesizes it into a response that lands in a chat window... what is the chain of custody for that moment? Who logged it? Who authorized it? Where did it go after the screen refreshed? Most organizations genuinely can't answer that. Not because they're negligent. Because the question is new, and the frameworks were written before it existed. NIST SP 800-171 was not drafted with a committee member raising their hand to ask "but what about the embedding vector index?" That conversation hadn't happened yet.
This is where the infrastructure conversation becomes important, because not all AI platforms are created equal, and some of them are genuinely trying to solve this problem the right way. Amazon Bedrock running inside AWS GovCloud, Azure OpenAI Service inside Azure Government, and Vertex AI on Google's FedRAMP-authorized environment are the three platforms currently doing the heavy lifting for organizations that need to keep CUI inside an authorized boundary while still getting real value from large language models. Each of them offers FedRAMP-authorized infrastructure, data residency controls, and the kind of audit logging that gives your compliance documentation something to point to. They represent different paths to the same destination, and the right choice for your organization usually comes down to one thing: where does your CUI already live? If you're deep in the Microsoft ecosystem, Azure OpenAI Service is a natural fit. If your infrastructure runs on AWS, Bedrock makes sense. If you're a Google shop, Vertex AI has native RAG tooling and grounding features worth serious consideration. The model name on the label matters less than the boundary it operates inside.
Then there's a different category of platform entirely, the enterprise AI search layer, and this is where a product like Glean enters the conversation. Glean is fundamentally a RAG system, though their marketing team would probably prefer a more elegant description. It connects to your organization's data sources, indexes and embeds that content into a vector store, and at query time retrieves semantically relevant chunks to feed into an LLM. What makes Glean interesting from a security standpoint is that they've built permission-awareness into the retrieval layer. If you can't read a document directly in SharePoint, Glean theoretically won't surface it in a generated answer. That's the right instinct, and it addresses one of the core concerns I'd raise about any RAG system operating near CUI. Glean is actively working toward FedRAMP authorization, which tells you they understand the regulated market they're pursuing. But "actively working toward" and "authorized" are not the same sentence, and in a CMMC Level 2 context they are separated by a gap you cannot step across. Until that authorization lands, CUI stays out of it. Full stop. This isn't a criticism of Glean. It's a description of the rules.
And if someone's trying to sell you any AI compliance solution that isn't living in a FedRAMP-authorized environment, you should be asking uncomfortable questions immediately, possibly before the sales call ends, definitely before you sign anything.
But CMMC Level 2 doesn't evaluate infrastructure in isolation. It evaluates control. Those are different things, and conflating them is how organizations sail through their System Security Plan with a confident smile and then get absolutely humbled during an assessment.
Control 3.1.1 asks you to limit system access to authorized users and processes. A RAG pipeline is a process. What authorizes it? What limits it? If the retrieval layer can surface any document in the corpus that semantically matches a query, regardless of whether the person asking is supposed to see that document, is that access control? Or is it the absence of access control wearing a lab coat and claiming to be science? Control 3.13.3 asks you to separate user functionality from system management functionality. In an AI pipeline, where does the user end and the system begin? When a prompt becomes a retrieval query becomes a synthesized response, the boundary blurs in ways that don't fit neatly into traditional architecture diagrams. You can't just draw a box around the LLM and call it a day. The box has tentacles.
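One way to make the 3.1.1 concern concrete is to filter retrieval results against the caller's entitlements before anything reaches the model. This is a sketch under stated assumptions: the ACL table and the pre-ranked similarity scores are hypothetical stand-ins, and a real implementation would check entitlements against the source system of record (SharePoint permissions, for instance) rather than a hardcoded dictionary.

```python
# Document-level access control at the retrieval layer. Which principals
# may read which documents (hypothetical ACL, for illustration only).
ACL = {
    "doc-001": {"alice", "bob"},         # subcontract clause, CUI
    "doc-002": {"alice", "bob", "eve"},  # cafeteria menu
}

def semantic_search(query: str) -> list[tuple[str, float]]:
    # Stand-in for the vector search: pretend these (doc_id, score)
    # pairs came back already ranked by similarity.
    return [("doc-001", 0.91), ("doc-002", 0.40)]

def retrieve_authorized(query: str, user: str, k: int = 5) -> list[str]:
    # Filter BEFORE the LLM ever sees the text. Dropping unauthorized
    # material from the final answer is too late; the exposure happens
    # the moment the content enters the model's context window.
    hits = semantic_search(query)
    return [doc for doc, _ in hits if user in ACL.get(doc, set())][:k]

print(retrieve_authorized("delivery schedule", user="eve"))    # ['doc-002']
print(retrieve_authorized("delivery schedule", user="alice"))  # ['doc-001', 'doc-002']
```

The design choice worth defending in your System Security Plan is where the filter sits: at retrieval, not at response. A post-generation redaction step would mean CUI already transited the model for an unauthorized user, which is exactly the control failure 3.1.1 exists to prevent.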
These aren't hypotheticals. They're audit findings waiting to happen. I say that with the calm certainty of someone who has watched very confident people walk into assessment rooms with incomplete documentation and walk out considerably less confident.
The chain of custody for AI-assisted CUI work has to be treated the same way you'd treat physical document handling in a cleared facility. Deliberate. Documented. Defensible. The retrieval layer needs to enforce the same access controls as the source systems. If a user isn't authorized to read a particular class of document, the RAG system's embedding search shouldn't be able to surface it for them, even indirectly, even as a fragment embedded inside a larger generated paragraph. Semantic retrieval is still retrieval. The fact that it happens automatically and invisibly doesn't make it authorized. That's not a technicality. That's the whole point.
Your logging strategy has to evolve too, and this is the part where I watch eyes glaze over in every briefing I give, which is exactly why I'm going to make it interesting. Controls 3.3.1 and 3.3.2 require audit logs of user activity and the ability to trace unauthorized access attempts. In a RAG environment, that means logging what was retrieved, not just what was asked. The query is the input. The retrieved context is the exposure. They are not the same thing, and most logging implementations treat them as if they are. That's like a bank logging every time someone walked up to the teller but not logging what they walked out with. Great system you've got there.
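Here is what logging the exposure rather than just the input might look like. The field names and the hashing choice are illustrative assumptions, not a prescription: the idea is to record which documents entered the context window for each query, hashing chunk contents so the log itself never becomes a new CUI repository while still letting an assessor tie an event to exact content.

```python
import hashlib
import json
from datetime import datetime, timezone

audit_log: list[dict] = []

def log_retrieval(user: str, query: str, retrieved: dict[str, str]) -> None:
    # Record the query (the input) AND the retrieved chunks (the exposure).
    # Contents are hashed rather than stored, so the audit log does not
    # itself accumulate CUI.
    audit_log.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "query": query,
        "retrieved": [
            {"doc_id": doc_id,
             "sha256": hashlib.sha256(text.encode()).hexdigest()}
            for doc_id, text in retrieved.items()
        ],
    })

log_retrieval(
    "alice",
    "widget delivery schedule",
    {"doc-001": "subcontract clause for widget delivery schedule"},
)
print(json.dumps(audit_log[-1], indent=2))
```

With a record like this, the 3.3.2 question, "can you trace what an unauthorized access attempt actually touched," has an answer, which is more than most query-only logging implementations can say.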
Your incident response plan, the one documented under 3.6.1, has to account for model behavior, not just data breaches in the traditional sense. If a language model generates a response that inadvertently exposes CUI to an unauthorized user because the retrieval boundary wasn't properly enforced, that is a control failure. The control doesn't care that it was a model and not a person who made the mistake. The assessor won't care either. "But it was the AI" is not a finding response that ends well for anyone. Ask me how I know.
Here's the broader, somewhat uncomfortable truth. The 110 controls in NIST SP 800-171 were written with human actors in mind. That assumption is woven into the language, into the definitions, into the implied threat models. They weren't written for systems that reason, retrieve, synthesize, and respond in milliseconds without a human in the loop for any of it. An AI system is not a human actor. But it operates in human systems, touches human data, and produces outputs with real consequences. The framework is catching up. The assessors are developing guidance. The C3PAOs are building methodologies that account for AI components. But right now there's a gap, and that gap, the space between what the framework assumed and what the technology is actually doing, is prime real estate for findings, for failed assessments, and for the kind of incident that ends up in a congressional briefing nobody wanted to attend.
If you're building or deploying a RAG system that touches CUI in a CMMC Level 2 environment, the questions you need to answer before your assessment are harder than the controls themselves. Can you demonstrate that your retrieval layer enforces document-level access controls, not just system-level authentication? Can you show a complete audit trail from user query through retrieval through generation through response delivery? Can you prove that the vector store, the embedding index, the model's entire operational knowledge base, resides entirely within the authorized processing boundary and never phones home? Can you define and monitor for data minimization within the RAG context window itself?
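One way to make the "complete audit trail" question answerable is to tag every stage of a single request with one correlation ID, so an assessor can reconstruct query, retrieval, generation, and delivery from the logs as a single chain. The event names and fields below are assumptions for the sake of the sketch, and the retrieval and generation steps are stubbed.

```python
import uuid

trail: list[dict] = []

def record(trace_id: str, stage: str, **detail) -> None:
    # Every event carries the same trace_id, tying the four stages together.
    trail.append({"trace_id": trace_id, "stage": stage, **detail})

def handle_request(user: str, query: str) -> str:
    trace_id = str(uuid.uuid4())
    record(trace_id, "query", user=user, text=query)
    doc_ids = ["doc-001"]                        # stand-in for retrieval
    record(trace_id, "retrieval", doc_ids=doc_ids)
    response = "[generated answer]"              # stand-in for the LLM call
    record(trace_id, "generation", model="stub-model")
    record(trace_id, "delivery", channel="chat", user=user)
    return response

handle_request("alice", "widget delivery schedule")
print([event["stage"] for event in trail])
```

A trail like this turns "can you show the path from query to response delivery?" from a shrug into a query against your own logs, which is the difference between an observation and a finding.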
If the answer to any of those is "we think so," that's the answer that fails an assessment. "We think so" is not a Plan of Action and Milestones. It's a prayer, and the DoD is not in the business of accepting prayers as evidence.
Governance is a moral act. I've said that before and I'll keep saying it because it keeps being true. When we deploy systems that make decisions with sensitive information, we're not just managing compliance risk. We're making a choice about accountability. We're deciding whether the boundary we draw is real or decorative. Whether the controls we document are enforced or performed for an audience.
The technology is ready. Amazon Bedrock, Azure OpenAI Service, Vertex AI, these platforms exist precisely because organizations need to do serious work with serious data inside serious boundaries. The architecture is mature enough to handle it responsibly. The question is whether the humans designing and configuring these systems are treating the chain of custody with the same seriousness the machine brings to the query.
Because the machine doesn't cut corners. It just follows instructions.
Make sure yours are the right ones.
Written by: Brad W. Beatty
Cybersecurity Rebellion - Payhip
#Cybersecurity #ArtificialIntelligence #AI #RAG #NationalSecurity #RiskManagement #Technology #CMMC #Compliance
Check out my book, DragonFlash: The Skipping Stones of Time, available on Amazon now.