Last week, I asked ChatGPT to explain how photosynthesis works. It gave me a beautifully written explanation that was, by the end of the second paragraph, completely nonsensical. The sentences flowed. The vocabulary was precise. But the science was fiction. This wasn't a glitch or a model hallucinating wildly—it was doing exactly what it was trained to do: predict the next word that sounds reasonable, regardless of whether it's true.
This phenomenon has become one of the most pressing issues in AI development, and it reveals something deeply troubling about how we've built these systems. We've created machines that are phenomenal at mimicking human-like communication while being fundamentally incapable of actually understanding what they're saying.
The Confidence Problem That Nobody Wants to Talk About
Here's the unsettling part: AI models don't have a built-in mechanism for saying "I don't know." They have learned to generate text that sounds authoritative because authoritative answers tend to get rewarded during training. And when you correct a model mid-conversation, it doesn't learn from the correction the way a human would; the correction only shapes the rest of that conversation, and nothing sticks unless the model is retrained.
Anthropic's research team recently published findings showing that larger language models are actually worse at admitting uncertainty than smaller ones. A 70-billion parameter model will confidently assert false information more often than a 7-billion parameter model. Why? Because as models get larger and train on more diverse data, they become better at pattern-matching plausible-sounding responses, regardless of accuracy.
Think about the implications. We're using these systems in medical contexts, legal research, financial advising, and education. A radiologist using AI to assist with diagnosis might miss a real problem if the model confidently asserts a misdiagnosis. A lawyer relying on an AI system for precedent research might cite cases that the model completely fabricated—and these "hallucinated" cases sound absolutely real.
The really maddening part? The AI isn't lying on purpose. It's not being deceptive. It's just doing what it was optimized to do: produce the next token that maximizes likelihood based on patterns. There's no inner voice saying "I'm not sure about this." There's just mathematics generating text.
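If you want to see how little machinery that involves, here's a minimal toy sketch in Python. The tokens and logit values are invented for illustration, not any real model's weights, but the shape of the loop is faithful: score the candidate next tokens, turn the scores into probabilities, pick the winner. Notice what's missing: any notion of truth.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the next token after "Venus's surface is ..."
# (numbers invented for illustration)
candidates = ["hot", "cold", "molten", "habitable"]
logits = [4.1, 0.3, 2.2, -1.5]

probs = softmax(logits)
next_token = candidates[probs.index(max(probs))]  # greedy decoding

for tok, p in zip(candidates, probs):
    print(f"{tok:>10}: {p:.3f}")
print("chosen:", next_token)
# There is no term here for "is this true?" -- only "is this likely?"
```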
Why Fluency and Accuracy Are Becoming Enemies
Modern AI training uses something called RLHF—Reinforcement Learning from Human Feedback. Humans rate model outputs as good or bad, and the model learns to optimize for highly-rated responses. This works brilliantly for making AI sound natural and helpful. It works terribly for making AI accurate.
Here's why: a well-written wrong answer scores higher than an uncertain correct answer. "Due to the atmospheric composition of Venus consisting primarily of carbon dioxide, the planet's surface temperature reaches approximately 900 degrees Fahrenheit, which is why we've never been able to explore it" sounds infinitely better than "Venus is very hot and we have limited surface data about it." One is punchy and specific. The other sounds like a cop-out.
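To see how that preference gets baked in mathematically, here's a hedged sketch of the pairwise loss commonly used to train RLHF reward models (the Bradley-Terry formulation). The reward scores below are invented for illustration; the point is that whichever answer raters prefer is the one whose score gets pushed up, and fluency is what raters tend to prefer.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry pairwise loss: -log(sigmoid(r_chosen - r_rejected)).
    Lower loss means the reward model agrees more strongly with the
    human preference it was shown."""
    diff = reward_chosen - reward_rejected
    return -math.log(1 / (1 + math.exp(-diff)))

# Hypothetical scores a reward model might assign (invented numbers):
fluent_wrong = 2.4   # confident, specific, well-written: raters liked it
hedged_right = 0.9   # accurate but sounds like a cop-out

# If raters prefer the fluent answer, training rewards exactly that:
print(preference_loss(fluent_wrong, hedged_right))  # small loss, model "agrees"
print(preference_loss(hedged_right, fluent_wrong))  # large loss if accuracy had won
```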
We've essentially trained our AI systems to be confident bullshitters. And we did this deliberately. We optimized for user satisfaction, which correlates with authoritative-sounding responses.
Some researchers have started experimenting with different training approaches. OpenAI's newer models show slight improvements in admitting when they don't have reliable information. But these improvements come at a cost—the models become less fluent, less engaging, and crucially, users report lower satisfaction with them.
The Broken Feedback Loop We're Stuck In
Here's where this gets genuinely concerning: we're caught in a feedback loop that nobody seems willing to break. Users prefer fluent, confident AI. Developers optimize for user satisfaction. This creates more fluent, confident AI that sounds right more often than it actually is right. Users then rely on it more. Stakes increase. Problems compound.
Right now, we're mostly safe because people still treat AI as a suggestion engine. "Let me check that with Google" or "Let me verify that with a human expert" is still the norm. But as these systems get integrated deeper into decision-making processes, as they become the first place we go rather than the second opinion, this problem becomes a crisis.
A 2023 study from Stanford's Human-Centered Artificial Intelligence lab found that when AI systems present information with high confidence, users are significantly more likely to trust that information without verification, even when explicitly warned that the AI might be wrong. The warnings don't stick. The confident tone overrides them.
If you want a deeper exploration of how these systems fail in unexpected ways, check out our piece on how AI models get tricked by a single typo and what that reveals about intelligence—it's a related issue that shows just how fragile these systems actually are.
What We Should Actually Be Doing
Some organizations are starting to push back against the confidence problem. They're training models to output uncertainty estimates. Instead of just providing an answer, the AI also provides a confidence score: "Here's my best answer, but I'm only 65% confident in it." It's not perfect, but it's a start.
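As a rough illustration of what that could look like, here's a toy sketch, not any vendor's actual API: derive a crude confidence proxy from per-token probabilities and attach a hedge when it falls below a threshold. The probabilities and the threshold are assumptions chosen for demonstration, and real calibration is much harder than this.

```python
import math

def confidence_from_token_probs(token_probs):
    """Crude confidence proxy: geometric mean of per-token probabilities.
    This only illustrates the shape of the idea, not real calibration."""
    log_probs = [math.log(p) for p in token_probs]
    return math.exp(sum(log_probs) / len(log_probs))

def answer_with_confidence(answer, token_probs, threshold=0.8):
    """Attach an explicit hedge when the proxy falls below the threshold."""
    conf = confidence_from_token_probs(token_probs)
    if conf < threshold:
        return f"{answer} (Note: I'm only about {conf:.0%} confident in this.)"
    return answer

# Invented per-token probabilities for an answer about Venus:
probs = [0.95, 0.91, 0.55, 0.88, 0.70]
print(answer_with_confidence("Venus's mean surface temperature is about 465 °C.", probs))
```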
Others are building in explicit knowledge cutoffs and limitations. A model trained on data only through April 2024 will say so. A model that hasn't been fine-tuned on legal data won't pretend to be a lawyer.
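In code, declaring those limits up front might look something like the following hypothetical sketch. The MODEL_CARD fields and the guard_response helper are invented for illustration; the idea is simply that stated limitations become refusals instead of improvisation.

```python
# A minimal, hypothetical sketch of declaring limitations explicitly
# instead of letting the model improvise past them.
MODEL_CARD = {
    "knowledge_cutoff": "2024-04",          # ISO-style strings compare correctly
    "fine_tuned_domains": ["general"],      # notably absent: "legal", "medical"
}

def guard_response(question_domain, question_date):
    """Refuse up front when a question falls outside declared limits."""
    if question_date > MODEL_CARD["knowledge_cutoff"]:
        return ("My training data ends in April 2024; "
                "I can't answer reliably about events after that.")
    if question_domain not in MODEL_CARD["fine_tuned_domains"]:
        return (f"I haven't been tuned on {question_domain} data; "
                "treat anything I say here as unverified.")
    return None  # no guard triggered; proceed to generate normally

print(guard_response("legal", "2023-06"))
```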
But the most important change might be cultural rather than technical. We need to stop celebrating AI for sounding human and start valuing it for being accurate. That might mean less fluent AI. It definitely means more frustrated users, at least sometimes. But it means fewer disasters down the road.
The broken photosynthesis explanation I mentioned at the start? That's actually not the worst outcome. The worst outcome is when the broken AI sounds so good that nobody even realizes it's broken until the damage is done.
