Photo by Immo Wegmann on Unsplash

Last week, I asked ChatGPT to tell me about the inventor of the rubber band. It confidently explained that it was invented by Stephen Perry in 1845 and described his British factory with surprising specificity. The problem? I happened to already know the answer was correct, but the model had no way of knowing whether it was. It could just as easily have generated a completely fabricated biography with the exact same confidence.

This is the central paradox of modern large language models: they're exceptionally good at producing text that sounds right, whether or not it actually is. They've essentially learned to generate statistically plausible responses, which sometimes align with reality and sometimes don't—and the model itself has no mechanism for distinguishing between the two.

The Illusion of Understanding

When we interact with AI systems, we're not communicating with something that understands the world. We're using a sophisticated pattern-matching tool that has been trained on billions of examples of human text. The model learns statistical correlations: that certain words tend to follow other words, that certain facts cluster together, and that confident-sounding language tends to be rewarded during training.

What it doesn't learn is truth. It doesn't learn causality. It doesn't learn verification. A language model can't check Wikipedia, can't run an experiment, can't call an expert to verify information. It can only shuffle around patterns from its training data, weighted by what seemed to "work" during the training process.
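To make that concrete, here's a deliberately tiny sketch in Python. Everything about it is made up for illustration (the toy corpus, the bigram counting), and a real model is incomparably more sophisticated, but the shape of the objective is the same: predict what plausibly comes next, with no notion of whether it's true.

```python
from collections import Counter, defaultdict

# Toy corpus: the only "knowledge" this model will ever have.
corpus = (
    "the rubber band was invented in 1845 . "
    "the rubber band was patented in london . "
    "the light bulb was invented in 1879 ."
).split()

# Count how often each word follows each other word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_word_probabilities(word):
    """P(next word | current word), computed purely from co-occurrence counts."""
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# The model "completes" text by favoring whatever is statistically likely.
# Nothing here checks whether the completion is accurate.
print(next_word_probabilities("invented"))  # {'in': 1.0}
print(next_word_probabilities("in"))        # 1845, london, and 1879 are equally plausible
```

Scale that idea up by a few hundred billion parameters and you get fluent paragraphs instead of word pairs, but the objective never changes: plausible continuation, not verified fact.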

The danger isn't that AI gives obviously wrong answers. The danger is that it gives confidently delivered answers that *sound* right. Consider a concrete example: ask a current language model about a specific court case from 2022. It will give you a detailed explanation with case names, judges, and outcomes. Some of this might be real. Some might be invented. The statistical probability that everything is accurate? Lower than you'd want when you're actually relying on that information for anything important.

Why Confidence Is the Real Problem

Humans are naturally skeptical of uncertain statements. "I think maybe the capital of Ohio is Columbus" doesn't convince anyone. But "The capital of Ohio is Columbus. It was established in 1812 as a planned city and was named after Christopher Columbus." That sells.

AI systems excel at the second format. They produce crisp, detailed, confident-sounding prose. There's no "um" or "I think" or "I'm not entirely sure about this part." The training process actually rewards this kind of confident output: when human raters are shown pairs of model responses and asked which they prefer, they tend to pick the more authoritative-sounding one, and the model is then tuned toward whatever the raters preferred.
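For the curious, here's roughly what that preference step looks like, as a minimal Python sketch. The loss function is the standard pairwise form used in feedback-based fine-tuning; the scores and answers are invented for illustration. Notice that nothing in it asks whether an answer is accurate, only which one the rater liked more.

```python
import math

def preference_loss(score_preferred, score_rejected):
    """Pairwise preference loss: -log(sigmoid(score_preferred - score_rejected)).
    Training lowers this by scoring the rater's favorite higher than the other answer."""
    return -math.log(1.0 / (1.0 + math.exp(-(score_preferred - score_rejected))))

# Hypothetical reward-model scores for two answers to the same question.
hedged_answer = 0.2      # "I'm not sure, but it might be..."
confident_answer = 1.5   # "It was invented by Stephen Perry in 1845."

# If raters reliably prefer the confident phrasing, it gets reinforced...
print(preference_loss(confident_answer, hedged_answer))  # ~0.24 (low loss)

# ...and honest hedging gets penalized, whether or not it was the right call.
print(preference_loss(hedged_answer, confident_answer))  # ~1.54 (high loss)
```

The optimization doesn't know the difference between "authoritative and correct" and "authoritative and wrong"; it only sees which answer people picked.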

A 2023 study by researchers at MIT and several other institutions examined this phenomenon directly. They found that people rated AI-generated text as more factually accurate when it was written with high confidence—even when that confidence was entirely unwarranted. The effect was strong enough that people could be convinced of false information if it was presented smoothly enough.

This creates a peculiar trap. The better AI systems are at sounding authoritative, the more dangerous they become. You could argue that a system that stammered out "I'm not sure, but maybe it's something like this..." would be safer, despite being technically less sophisticated in some ways.

The Problem With Scale

Here's what keeps AI researchers awake at night: making models bigger, training them on more data, and giving them more computational resources generally makes this problem worse, not better. Larger models are better at pattern-matching, which means they're better at generating plausible-sounding text. They're also proportionally better at bullshitting without anyone noticing.

A small language model might fail obviously—it might produce incoherent text that alerts you something is wrong. A large language model produces fluent, detailed, entirely wrong text that feels authoritative. The user experience is better. The actual reliability might be worse.

Some companies have tried to address this by adding what they call "factuality filters" or by implementing systems where the AI is supposed to say "I don't know" more often. But this is surprisingly difficult to implement consistently. You can fine-tune a model to be more cautious, but you often lose other capabilities in the process. It's a genuine engineering challenge, not just a problem nobody has bothered to tackle yet.
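To give a feel for why, here's one crude version of the idea, sketched in Python with made-up numbers. The rule, the threshold, and the function names are all hypothetical: look at how confident the model was in its own tokens and refuse to answer when that confidence is low.

```python
def average_logprob(token_logprobs):
    """Mean log-probability of the generated tokens: a rough proxy for model confidence."""
    return sum(token_logprobs) / len(token_logprobs)

def answer_or_abstain(answer_text, token_logprobs, threshold=-1.0):
    """Hypothetical abstention rule: return the answer only if the model was
    confident enough on average; otherwise admit uncertainty."""
    if average_logprob(token_logprobs) < threshold:
        return "I'm not sure about this one."
    return answer_text

# Made-up per-token log-probabilities for two generations.
confident_generation = [-0.1, -0.2, -0.05, -0.3]  # average ~ -0.16 -> answer
shaky_generation = [-2.1, -1.8, -0.9, -2.4]       # average ~ -1.8  -> abstain

print(answer_or_abstain("Stephen Perry, in 1845.", confident_generation))
print(answer_or_abstain("Stephen Perry, in 1845.", shaky_generation))
```

The catch is exactly the problem this whole piece is about: token-level confidence and factual accuracy are not the same thing, so a filter like this will happily wave through a fluent, high-probability fabrication while blocking a hesitant but correct answer.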

What This Means for Actual Usage

The practical takeaway is straightforward: AI systems are remarkable tools for certain tasks, but they should never be your only source of information for anything that matters. They're great for brainstorming, for getting explanations of topics you already understand, for getting past writer's block, and for generating code (which you then test).

They're terrible for fact-checking, for learning new subjects without independent verification, or for anything where confident wrongness is worse than admitting uncertainty. This isn't a limitation that will definitely be solved in the next few years. It might be fundamental to how these systems work.

If you want to understand this problem more deeply, check out our piece on why AI assistants are confident liars—it explores the psychology behind why these systems fall into this pattern so reliably.

The future of AI isn't just about making it smarter. It's about making it more honest. And that's a much harder problem to solve.