
Last month, a lawyer in New York submitted legal briefs citing cases that don't exist. He wasn't incompetent or careless; he'd simply trusted ChatGPT to find the precedents for him. The AI had fabricated entire court decisions with perfect confidence, complete with fake case numbers and judges' names. This wasn't an isolated incident. It's becoming the defining problem of modern AI systems, and nobody really knows how to fix it.

We call it "hallucination." The term is almost cute, which feels wrong given the stakes. When an AI hallucinates, it generates information that sounds entirely plausible but is completely false. It doesn't say "I don't know." It confidently invents. And here's the terrifying part: the most powerful models are hallucinating more, not less.

The Confidence Problem Nobody Expected

When GPT-4 was released, people celebrated its intelligence. And it is genuinely impressive. But researchers quickly noticed something unsettling: the system would make up biographical details about real people, invent book chapters that don't exist, and create scientific studies with authentic-looking methodologies that never happened.

A researcher at Stanford asked GPT-4 to generate examples of bias in AI training data. The model produced three realistic-looking examples. Two were completely fabricated. When pressed, the system couldn't explain the difference between the real example and the invented ones. It had no internal mechanism to distinguish what it had learned during training from what it had statistically generated.

The root cause is uncomfortable: large language models don't actually "know" anything. They're sophisticated pattern-matching machines that predict the next word based on billions of parameters. They've learned that when someone asks a question, a confident answer is usually appropriate. So they provide one, even when they should say nothing.
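You can see this mechanically in a few lines of code. The sketch below uses the Hugging Face transformers library and the small GPT-2 model purely for illustration: the model assigns a probability to every possible next token, and nothing in that distribution records whether a continuation is true.

```python
# Minimal sketch of next-token prediction (illustrative; any causal LM works).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The judge who wrote the majority opinion was"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# A probability for every token in the vocabulary. The distribution encodes
# what text tends to follow text like the prompt, not whether it is true.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: {float(prob):.3f}")
```

There's no special "I don't know" pathway in there. A refusal is just another continuation, competing on probability like everything else.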

Think of it like this. You're at a dinner party and someone asks you about a specific historical event you're not sure about. You have three instincts: admit you don't know, change the subject, or confidently make something up. Language models have been trained on internet text where option three happens constantly. Why? Because humans filled the internet with confident assertions about things we don't actually know. The AI learned our worst habit.

Why Bigger Models Make Bigger Problems

Here's where it gets worse. Counterintuitively, larger models hallucinate more, not less. A 7-billion-parameter model might refuse to answer uncertain questions. A 70-billion-parameter model will answer them with bells on, complete with citations it invented.

It seems backward until you understand why. Larger models have more capacity to learn intricate patterns. They're better at generating text that feels authoritative because they've learned more about how authoritative text is structured. They've absorbed more internet confidence. They can construct fake citations that follow proper formatting conventions. They sound more credible while being less truthful.

A study from Microsoft Research found that scaling up models actually reduced their ability to say "I don't know" at appropriate moments. Researchers tested increasingly large versions of the same base model and measured how often each would refuse to answer uncertain questions. The smallest model refused frequently. The largest model almost never refused, even when it should have.
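As a rough sketch of how a comparison like that could be run (the refusal markers and the ask helpers below are hypothetical stand-ins, not the study's actual methodology), you ask each model size the same uncertain questions and count how often it declines:

```python
# Hypothetical sketch: count how often a model declines to answer.
REFUSAL_MARKERS = ("i don't know", "i'm not sure", "cannot answer")

def refusal_rate(ask, questions):
    """ask(question) -> the model's answer string; one 'ask' per model size."""
    refusals = sum(
        any(marker in ask(q).lower() for marker in REFUSAL_MARKERS)
        for q in questions
    )
    return refusals / len(questions)

# Usage sketch: same uncertain questions, different model sizes.
# for name, ask in {"7B": ask_small, "70B": ask_large}.items():
#     print(name, refusal_rate(ask, uncertain_questions))
```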

This creates a perverse incentive structure in AI development. Companies measure performance on benchmarks where bigger models score higher. Nobody's measuring how often they hallucinate because it's harder to quantify. So teams build bigger, more confident, more dangerous systems.

The Upstream Poisoning Problem

There's another layer to this crisis. As AI systems become more prevalent, they're starting to train on text generated by other AI systems. ChatGPT creates content. That content gets indexed by Google. Someone includes it in a dataset to train another model. Now hallucinations are reproducing and mutating like a virus through the information ecosystem.

Researchers at Google discovered that training on content generated by language models actually degrades the performance of subsequent models. The errors compound. False information gets reinforced across generations.

We're building a system where the internet becomes increasingly polluted with AI-generated plausibility, and newer AI systems learn from that pollution. It's a feedback loop where confidence and falsehood reinforce each other. It feeds a broader trust problem in AI: chatbots that try too hard to seem caring, helpful, and honest often make us more uncomfortable, not less, because the effort so visibly backfires.

What Actually Reduces Hallucination (Spoiler: It's Boring)

Some approaches are starting to show promise, but they're not glamorous. Retrieval-augmented generation (RAG) systems retrieve actual documents before generating responses. Instead of purely generating from learned patterns, they ground answers in real information. It works, but it's slower and more expensive.
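The shape of the idea is simple, even if production systems are far more elaborate. Here's a minimal sketch; the toy keyword retriever and the call_llm placeholder are assumptions for illustration, not any particular library's API.

```python
# Toy RAG sketch: retrieve passages first, then ground the prompt in them.
DOCUMENTS = [
    "Roe v. Wade, 410 U.S. 113 (1973), addressed abortion rights.",
    "Marbury v. Madison (1803) established judicial review.",
    "Brown v. Board of Education (1954) ended school segregation.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Naive keyword-overlap scoring; real systems use vector search.
    words = set(query.lower().split())
    scored = sorted(
        DOCUMENTS,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def answer_with_rag(question: str, call_llm) -> str:
    context = "\n".join(retrieve(question))
    prompt = (
        "Answer using only the sources below. "
        "If they don't contain the answer, say you don't know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)  # plug in whatever model API you actually use
```

That extra retrieval step is exactly where the added latency and cost come from.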

Fine-tuning on high-quality data helps. Models trained primarily on peer-reviewed papers and vetted sources hallucinate less than models trained on raw internet text. But this requires curating training data, which doesn't scale as easily as just downloading everything from the web.
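The curation itself is as boring as it sounds. A toy sketch, assuming records with a url field and a made-up whitelist of vetted domains:

```python
# Toy curation filter: keep only text from a vetted whitelist of sources.
from urllib.parse import urlparse

VETTED_DOMAINS = {"pubmed.ncbi.nlm.nih.gov", "arxiv.org"}  # hypothetical list

def curate(records):
    """records: iterable of dicts like {"url": ..., "text": ...}."""
    for record in records:
        if urlparse(record["url"]).netloc in VETTED_DOMAINS:
            yield record["text"]
```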

Some research suggests that certain prompting techniques reduce hallucination. Asking models to "think step by step" or to "explain your reasoning" sometimes makes them more cautious. But this works inconsistently, and it doesn't solve the fundamental problem: the model still doesn't know whether it's making things up.
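In practice that just means wrapping the question in extra instructions, something like the sketch below; the exact wording is an assumption, not a recipe with guaranteed results.

```python
# Illustrative prompt wrapper; effectiveness varies by model and question.
def cautious_prompt(question: str) -> str:
    return (
        "Think step by step and explain your reasoning. "
        "If you are not sure something is real, say you don't know "
        "instead of guessing.\n\n" + question
    )

print(cautious_prompt("Which court cases support this motion?"))
```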

The honest truth is that we don't have a silver bullet. We have partial solutions that come with tradeoffs. And we have a situation where the economically incentivized path, building bigger and faster models, points in the opposite direction from the one we need to go.

The Coming Reckoning

As AI systems become embedded in critical infrastructure—legal research, medical diagnosis, financial advice—hallucination stops being a funny quirk. It becomes a liability and a danger.

Some organizations are responding by refusing to use AI for high-stakes decisions without human verification. But that defeats the supposed efficiency advantage. Others are doubling down, building even larger systems and hoping the sheer scale will somehow solve the problem. It won't.

The real question isn't whether we can eliminate hallucination. We probably can't, at least not with current approaches. The question is whether we'll be honest about the limitations and build tools accordingly, or whether we'll continue marketing confident hallucination machines as reliable sources of truth. Given the incentives at play, I'm not optimistic.