Beyond RAG: Your AI Can't Find Your Documents

Adam Rutkowski
April 4, 2026
6 min read
ai, rag, document-search, beyond-rag

You uploaded a PDF to ChatGPT, or maybe a group of documents you bundled together to make the LLM "an expert" on a specific topic. That's part of the advertising for ChatGPT's "Custom GPTs", right? Tools like Msty, LM Studio, and Open WebUI also offer "easy to use", "preconfigured" ways of handling RAG (Retrieval-Augmented Generation). The promise is that you don't have to touch anything more complicated, like Pinecone, Weaviate, or Milvus. These tools handle it for you automatically.

You asked a question about page 47 from one of the documents. The AI told you it doesn’t have that document.

You’re looking at the file in the sidebar. It’s right there. You uploaded it ten seconds ago, and the AI is telling you it doesn’t exist. So you try again. You rephrase. You ask something more general — not about a specific page, just about a topic you know is in the document. This time, the AI answers. Confidently. It gives you something that sounds right, but you don’t think it actually is right, despite the confidence. Worse, when you ask where in the document that answer came from, it can’t tell you. It can’t even tell you which document the answer came from. It might even point you to a section that says something completely different. Is this really the thing everyone says is going to replace your job tomorrow? Seriously?


If this has happened to you, you’ve probably done what most people do. You blamed the tool, or the technology. You might think ChatGPT had a bad day, or the file was too big, or the format was wrong. You probably started watching some YouTube videos on how RAG works. Maybe you tried uploading the same file to three different platforms. But the result was the same. Sometimes it works. Sometimes it doesn’t. And you can never predict which.

What I’m about to tell you is not going to make you happy. The thing nobody tells you is that this isn’t a bug. It’s not a glitch in ChatGPT. It’s not a problem with your file. And it’s not going to get fixed in the next update. It’s how the system (RAG) was designed. The underlying system itself is broken.


What Actually Happens When You Upload a Document

When you upload a file to an AI tool, you probably assume the AI reads it. The way you’d read it — start to finish, understanding the structure, following the argument, noting what’s on page 47.

That’s not what happens.

What happens is this: your document gets chopped into fragments. Small pieces — a few hundred words each. These fragments are converted into mathematical representations called vectors, which are essentially long lists of numbers that encode meaning. The fragments are stored in a database, indexed by those numbers.
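The chopping step can be sketched in a few lines. This is a minimal illustration, not any particular tool's implementation; the chunk size of 200 words and 20-word overlap are assumed defaults, and real systems vary in both.

```python
def chunk_document(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split a document into overlapping word-count fragments."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap  # slide forward, keeping some overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last fragment already covers the tail
    return chunks

doc = "word " * 450  # stand-in for a 450-word document
fragments = chunk_document(doc)
print(len(fragments))  # the 450 words become 3 overlapping fragments
```

Notice what's already lost at this step: page numbers, headings, and the document's overall structure. Each fragment is just a bag of a few hundred words.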

When you ask a question, your question also gets converted into a vector. The system looks for fragments whose vectors are mathematically close to your question’s vector. It grabs the closest matches — usually the top five or ten — and hands those fragments to the AI. The AI then generates a response based on those fragments.
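The retrieval step is essentially a nearest-neighbor search. Here's a toy sketch with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions) and illustrative fragment IDs:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: how close two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for three stored fragments.
fragment_vectors = {
    "page_3":  [0.9, 0.1, 0.2],
    "page_47": [0.1, 0.8, 0.3],
    "page_90": [0.2, 0.2, 0.9],
}
query_vector = [0.15, 0.75, 0.35]  # the embedded question

# Rank fragments by similarity and keep the top k. Only these fragments
# are handed to the model -- everything outside the top k is invisible.
top_k = sorted(fragment_vectors,
               key=lambda fid: cosine(query_vector, fragment_vectors[fid]),
               reverse=True)[:2]
print(top_k)  # ['page_47', 'page_90']
```

The model never sees `page_3`. If the fragment you actually needed scores just below the cutoff, it may as well not exist.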

This process has a name: RAG (Retrieval-Augmented Generation). It is how virtually every AI document tool on the market works today. ChatGPT’s file upload, Copilot’s document integration, and every “upload your PDF and ask questions” product you’ve ever seen all likely use some version of this under the hood.


The Reason It Breaks

Read that process again and notice what’s missing. The AI never sees your document. It sees fragments — selected not by relevance in the way you’d judge relevance, but by mathematical similarity between vectors. Those two things aren’t even remotely the same.

A paragraph about “employee screening procedures for airport security checkpoints” and a paragraph about “employee screening procedures for workplace threat assessment” will have very similar vectors. They use many of the same words. Mathematically, they look almost identical. But they are about completely different things, and pulling the wrong one changes the answer entirely.
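You can see the failure mode even with a crude stand-in for an embedding model. The function below uses simple word-count vectors rather than a real embedding, so the numbers are illustrative only, but the shape of the problem is the same: shared wording produces high similarity regardless of what the text is actually about.

```python
import math
from collections import Counter

def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words vectors -- a crude stand-in
    for an embedding model, but it shows the failure mode."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    return dot / (math.sqrt(sum(v * v for v in ca.values())) *
                  math.sqrt(sum(v * v for v in cb.values())))

airport   = "employee screening procedures for airport security checkpoints"
workplace = "employee screening procedures for workplace threat assessment"
unrelated = "quarterly revenue grew across all regions"

print(bow_cosine(airport, workplace))  # high: most of the words overlap
print(bow_cosine(airport, unrelated))  # zero: no words in common
```

To the math, the airport paragraph and the workplace paragraph are near-neighbors. To a human answering a question about airport security, retrieving the workplace one is simply wrong.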

When the system works, it’s because the right fragments happened to be mathematically close to your question. When it fails, the AI tells you it doesn’t have your document, or gives you an answer from the wrong section, or can’t tell you where its answer came from. The reason is that the math didn’t line up. The fragment you needed wasn’t in the top five. Something else was closer, numerically.

That’s not a bug you can report. It’s the architecture working as designed.


Let This Bother You

This is way more than a minor inconvenience.

If you are using AI to chat with a knowledge base at work — “What’s our vacation policy?” — and it gets the answer mostly right, the stakes are low. You can double-check. Nobody gets hurt.

But that same architecture is being sold to law firms for document review. It’s being sold to insurance companies for claims analysis, to compliance teams for regulatory filings, to anyone working with large document sets where accuracy matters and citations need to be traceable. This is the same system that can’t reliably find page 47 of a single PDF, and it’s being positioned as the solution for searching ten thousand documents where a missed connection or a fabricated citation has real consequences.

The tools are getting faster. The interfaces are getting prettier. The models are getting larger. But the underlying architecture has not changed: chop the document into fragments, convert those fragments to vectors, search by mathematical similarity, and hope for the best.


And this is going to replace your job, tomorrow?

If you’ve had the experience I described in this article, where you upload a document but the AI can’t find it, or you get answers with no sources, or you watch the quality get worse as you add more files, now you know why. It’s not your fault. It’s not the tool’s fault, necessarily. It’s the architecture and the technology. This is the approach that every major AI platform adopted because it was fast to build and easy to demo. It works well enough for casual use, but it falls apart when the work is serious.

The question worth asking is whether the approach everyone chose is the only approach that exists.

It isn’t. But that’s the next article…


This is the first article in the Beyond RAG series — a three-part exploration of why AI document tools fail and what the alternative looks like. Next: Your AI Is Hallucinating Its Sources.