Last month, a lawyer submitted a legal brief to a federal judge that cited six cases. The problem? Four of them didn't exist. They were fabricated wholesale by ChatGPT. The lawyer didn't know—he trusted the AI to do its job. This wasn't an edge case or a weird bug. It was a feature of how these systems work, and it's becoming a bigger problem as AI models get larger and more confident.
We call this "hallucination," which is a polite word for what's really happening: artificial intelligence systems are making stuff up with absolute certainty. They're not hedging their bets or saying "I don't know." They're inventing information and presenting it with the same confidence they use when discussing actual facts.
The Confidence Problem Nobody Predicted
When engineers first started training large language models back in 2018-2019, hallucinations were seen as a minor training issue. The thinking was that they'd disappear as you scaled up the models, made them smarter, gave them more data. It made sense: bigger brain, fewer mistakes, right?
Wrong.
The opposite happened. As models got larger—GPT-3 had 175 billion parameters; GPT-4 is estimated at somewhere around 1.7 trillion—they got better at sounding authoritative while being completely wrong. They learned to construct sentences that follow all the grammatical rules, match the style of the source material they were trained on, and link ideas together in logical-sounding ways. None of that requires the information to be true.
A study from Stanford researchers in 2023 found that GPT-4, despite being "smarter" than GPT-3.5 in most benchmarks, actually hallucinated more frequently in certain domains. The newer model had simply gotten better at making fake information sound authentic. It's like training a criminal—give them more experience, and they get better at not getting caught.
The core issue is architectural. These models work by predicting the next most likely word based on patterns in their training data. They don't have access to a fact-checking system. They don't look anything up. They're not reasoning through problems the way humans do. They're pattern-matching machines that have gotten disturbingly good at patterns that sound like reasoning.
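To make that concrete, here's a minimal sketch using the small, open GPT-2 model through Hugging Face's transformers library. It's nothing like a production chatbot, but the core computation is the same shape: score every possible next token, pick from the top, repeat. Nowhere in that loop is there a step that checks whether the output is true.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small, open model purely for illustration; chat products use much
# larger models, but the generation loop is conceptually the same.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of Australia is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits          # (1, seq_len, vocab_size)

# Probability distribution over the next token only.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id)):>12}  p={float(prob):.3f}")
# The model ranks continuations by how likely they *sound* given its
# training data. No step in this computation checks whether the
# highest-probability continuation is factually correct.
```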
Why Bigger Data Doesn't Fix This
You'd think training on more accurate data would help. You'd be wrong again. Companies have tried—feeding models Wikipedia articles, academic papers, news archives. The problem is that the internet contains contradictions, falsehoods, and disputed claims. Sometimes there are multiple "true" versions of events. Sometimes the best-sounding explanation isn't the correct one.
When you feed a pattern-matching machine contradictory information, it doesn't resolve the contradiction the way a human would by evaluating evidence. It learns all the patterns. Then when you ask it a question, it can generate text matching any of those patterns. The one it picks is based on statistical probability, not accuracy.
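Here's a toy illustration of what that looks like—the claims and weights below are invented for the example, not taken from any real model:

```python
import random

# Hypothetical relative frequencies of three contradictory claims as
# they might appear across scraped web text. The model never resolves
# the contradiction; it just ends up with weights roughly like these.
claims = {
    "The bridge opened in 1932.": 0.55,
    "The bridge opened in 1933.": 0.30,
    "The bridge opened in 1929.": 0.15,
}

# Generation picks by statistical weight, not by checking a source.
answer = random.choices(list(claims), weights=list(claims.values()), k=1)[0]
print(answer)
```

Ask the same question five times and you may get three different "facts," each delivered with the same fluency.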
There's also a perverse incentive at play here: models that generate longer, more detailed answers tend to score higher on common evaluation benchmarks, even when parts of those answers are fabricated. A model that says "I don't know" scores lower than one that confidently (and incorrectly) answers the question. Researchers call this the "bullshitting problem."
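You can see the incentive in a stylized scoring function—this isn't any real benchmark's code, just the exact-match grading many of them share—where guessing beats honesty every time:

```python
def exact_match_score(answers, gold):
    """Fraction of answers that exactly match the reference."""
    return sum(a == g for a, g in zip(answers, gold)) / len(gold)

gold = ["1932", "Paris", "O(n log n)", "1889"]

guesser = ["1932", "Paris", "O(n)", "1905"]   # confident, half wrong
honest  = ["I don't know"] * 4                # admits uncertainty

print(exact_match_score(guesser, gold))   # 0.5
print(exact_match_score(honest, gold))    # 0.0 -- honesty is penalized
```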
The Actual Solutions Are Messy and Unglamorous
Some companies are trying to bolt on fact-checking systems. You've probably seen this if you use ChatGPT or Claude—they sometimes add citations or claim to be searching the web. These approaches help but don't solve the core problem. They slow down response time, require external data sources, and still rely on the model to know when to use them.
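The shape of those bolt-ons looks roughly like this—the search and generation functions below are stand-in stubs, not any vendor's actual API—and the weak link is visible right in the code: the model still decides what to do with the retrieved sources.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    url: str
    text: str

def search_web(query: str, top_k: int = 3) -> list[Passage]:
    # Stub: a real system would call a search index or API here.
    return [Passage(url="https://example.com", text="(retrieved text)")]

def generate(prompt: str) -> str:
    # Stub: a real system would call the language model here.
    return "(model output)"

def answer_with_retrieval(question: str) -> str:
    passages = search_web(question)                      # external lookup (slower)
    context = "\n\n".join(f"[{p.url}] {p.text}" for p in passages)
    prompt = (
        "Answer using ONLY the sources below, and cite them. "
        "If the sources don't cover it, say you don't know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    # The weak link: the model can still ignore the instruction and
    # fall back on whatever its training patterns suggest.
    return generate(prompt)

print(answer_with_retrieval("When did the bridge open?"))
```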
Anthropic, the company behind Claude, has been experimenting with what they call "constitutional AI." Instead of just training models to be helpful, they write down an explicit set of principles and fine-tune the model to critique and revise its own outputs against them. It's helped, but it's not a silver bullet. The model still hallucinates—it's just slightly more aware that it shouldn't.
Other teams are building in uncertainty quantification—trying to get models to express how confident they are in their answers. This is genuinely hard because these systems don't have a natural way to express doubt. What they have is a probability for every possible next token at every step; turning those token-level numbers into a human-readable "here's how sure I am about this claim" is still an unsolved problem.
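One crude signal you can extract today is the probability the model assigned to each token it generated; dips loosely suggest shaky ground, but the numbers are noisy and uncalibrated. A sketch, again with GPT-2 via transformers:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The first person to walk on the moon was Neil Armstrong."
ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits

# For each position, the probability the model gave to the token that
# actually came next. Low values are a rough, uncalibrated hint of
# uncertainty -- not a statement about truth.
probs = torch.softmax(logits[0, :-1], dim=-1)
next_ids = ids[0, 1:]
token_probs = probs.gather(1, next_ids.unsqueeze(1)).squeeze(1)

for tok_id, p in zip(next_ids, token_probs):
    print(f"{tokenizer.decode(int(tok_id)):>12}  p={float(p):.3f}")
```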
The most practical approach right now? Don't let AI systems be the sole source of truth for anything important. If you're using these models for customer service, fact-checking the outputs isn't optional—it's necessary. The memory limitations of current chatbots compound this problem, making it even harder for them to maintain consistency with facts they should know.
Where This Is Heading
The honest answer? Nobody knows. We don't have a clear path to solving this that doesn't involve fundamentally rethinking how these models work. Current approaches are incremental—marginally better prompting, slightly improved training, external fact-checking patches.
Some researchers think the solution requires building models that can actually search and retrieve information rather than relying purely on pattern-matching over training data. Others think we need completely different architectures that separate fact storage from reasoning. A few think we're hitting a fundamental limit and that scaling up language models further is the wrong approach entirely.
What's clear is this: the lawyer who submitted those fake cases to a federal judge wasn't the victim of a malfunction. He was experiencing the system working as designed. And until we solve the confidence problem—getting these models to know what they don't know—we're going to keep seeing more incidents like that one.
The good news? People are taking this seriously now. Companies are investing in solutions. Researchers are publishing papers on the problem constantly. The bad news? We're still in the early experimental phase, and there's no consensus on what the right answer looks like.
For now, if an AI system tells you something important, verify it. Not because the model is broken—but because working as intended, it's still perfectly capable of lying with absolute certainty.
