AI Hallucination: Plausible Lies and How to Handle Them

phoue

8 min read

From a Lawyer’s Lawsuit to Healthcare and Journalism… Everything About AI’s Plausible Lies

  • Understand the fundamental causes of AI hallucination.
  • Review real risk cases across law, healthcare, journalism, and more.
  • Learn technical and human solutions including Retrieval-Augmented Generation (RAG) and critical thinking.

Part 1: The Illusion of Truth

When discussing the risks of AI hallucination, we often think only of technical flaws. However, the core issue lies in the interaction between technology and humans. I, too, was once deceived by ChatGPT’s fluent answers and trusted them unconditionally. This is where the danger begins.

Section 1: A Lawyer’s Nightmare: Mata v. Avianca Airlines Case

The story begins with Steven A. Schwartz, a veteran lawyer with over 30 years of experience. His client, Roberto Mata, filed a personal injury lawsuit against the Colombian airline Avianca, and the case put Schwartz at a disadvantage from the start: he lacked federal court experience, the area of law was unfamiliar, and, critically, his firm had no subscription to premium legal databases.

This gap in expertise and resources led him to a quick and powerful alternative: ChatGPT. He later testified in court that he had mistakenly assumed ChatGPT was “a kind of super search engine.” This was the prelude to the tragedy. He asked the AI to find precedents in which the statute of limitations had been tolled by the airline’s bankruptcy.

[Image: Legal research using ChatGPT]

ChatGPT presented six plausible cases, including ‘Vargas v. China Southern Airlines.’ On the surface they seemed perfect, but every one of them was a complete fabrication. The critical moment came when Schwartz could not find the cited cases and asked the AI directly, “Are these cases real?”

ChatGPT apologized but firmly insisted the cases existed and could be found in major databases. In that moment of human-like conversation, Schwartz’s critical thinking collapsed completely before the machine’s persuasive persona.

Ultimately, submitting nonexistent cases cost him a $5,000 fine and an indelible stain on his reputation. The judge clarified in the ruling that the problem was not using AI itself but consciously avoiding verification and making false and misleading statements to the court.

This case shows how even experienced professionals can be vulnerable to AI hallucination under professional pressure and resource constraints. It also warns that AI’s conversational interface can be a powerful psychological tool that breaks down users’ critical defenses.

Part 2: Anatomy of Falsehood

Section 2: Why Your AI Lies: It’s Not a Bug but a Feature

The Mata v. Avianca case is not an exception. The plausible lies AI generates—called ‘hallucinations’—are not bugs but rather intrinsic features of how generative AI operates.

Large Language Models (LLMs) are not databases storing facts. Essentially, they are ‘next word prediction’ engines. For example, after the phrase “Mary had a little…,” the model statistically predicts “lamb” as the most likely next word, without understanding the concept of a lamb.
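To make this concrete, here is a minimal sketch (assuming the Hugging Face transformers and torch packages and the public 'gpt2' checkpoint, which are not part of the original example) that prints a small model's top guesses for the next token after “Mary had a little”:

```python
# A minimal next-word-prediction sketch, assuming the `transformers` and
# `torch` packages and the public "gpt2" checkpoint are available.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Mary had a little", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits            # (1, sequence_length, vocab_size)

probs = torch.softmax(logits[0, -1], dim=-1)   # distribution over the next token
top = torch.topk(probs, k=5)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r:>12}  p={prob.item():.3f}")
```

A well-trained model typically puts a word like “ lamb” near the top purely because of learned statistics, not because it knows what a lamb is.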

[Image: Next word prediction principle of large language models]

This explains why AI can produce perfectly formatted legal citations or references. The model is a ‘master of form’ that learns patterns of format rather than content substance. Added to this is the principle “Garbage In, Garbage Out.” AI learns from internet data mixed with truth and falsehood without an inherent mechanism to distinguish them.

Ultimately, hallucination is an unavoidable trade-off between creativity and accuracy. Completely removing this ‘feature’ could paralyze the model’s core generative ability. Therefore, the solution is not ‘bug fixing’ but effectively ‘managing’ this characteristic.
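One toy way to see this trade-off is the sampling temperature used at generation time. The sketch below uses made-up token scores (an assumption for illustration, not real model output):

```python
# Toy illustration of the creativity/accuracy trade-off via sampling
# temperature; the tokens and logits below are made up for illustration.
import numpy as np

tokens = ["lamb", "dog", "problem", "secret", "spaceship"]
logits = np.array([4.0, 2.0, 1.0, 0.5, 0.1])   # hypothetical model scores

def next_token_distribution(temperature: float) -> np.ndarray:
    """Softmax over the logits at a given sampling temperature."""
    scaled = logits / temperature
    exp = np.exp(scaled - scaled.max())
    return exp / exp.sum()

for t in (0.2, 1.0, 2.0):
    probs = next_token_distribution(t)
    summary = ", ".join(f"{tok}={p:.2f}" for tok, p in zip(tokens, probs))
    print(f"T={t}: {summary}")
```

At low temperature nearly all probability sits on the statistically ‘safe’ word; at high temperature unlikely words gain probability, which fuels creativity and hallucination alike.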

Section 3: Echoes in the System: AI Hallucination Across High-Risk Industries

AI hallucination is not confined to the legal field. It poses serious threats in other high-risk industries where accuracy is critical.

Journalism’s Failed Experiment: The CNET Scandal

Tech news outlet CNET published AI-generated financial articles that turned out to be riddled with “absurd errors,” including incorrect compound interest calculations and plagiarism. Ultimately, CNET issued corrections for more than half (41 out of 77) of the AI-generated articles.

Dangerous Prescriptions in Healthcare

In healthcare, AI hallucination can be a matter of life and death. One study found ChatGPT citing nonexistent scientific papers and describing fabricated biochemical pathways. There have even been reports of AI advising users to eat rocks or giving instructions that would produce toxic gases, advice so lacking in common sense that it is outright dangerous.

[Image: AI use and potential risks in healthcare]

Crisis of Trust in Academia

Academia is also suffering from scientific record contamination due to AI-generated fake citations. Research shows AI models can fabricate up to 69% of citations.

Types and Consequences of AI Hallucination by Industry

Industry | Hallucination Type | Real-World Consequences
Legal | Fabricated legal precedents and cases | Court sanctions, professional discipline, loss of credibility
Journalism | Errors in financial facts, plagiarism | Misinformation, loss of media trust, mass article corrections
Healthcare | Fabricated biochemical pathways, fake medical references, dangerous health advice | Misdiagnosis risk, inappropriate treatment, direct harm to patients
Academia | Nonexistent scholarly materials and citations | Contamination of scientific records, erosion of research trust, peer review failures

Part 3: The Path to Truth

Section 4: Correcting Fiction into Fact: Technical Safeguards

Various technical safeguards are being developed to address AI hallucination.

“Open-Book Exam”: Retrieval-Augmented Generation (RAG)

One of the most promising solutions is Retrieval-Augmented Generation (RAG). Instead of relying solely on the LLM’s internal memory—a ‘closed-book exam’—RAG enables the model to consult reliable external sources, like an ‘open-book exam.’

When a user asks a question, the RAG system first retrieves relevant information from an external knowledge base, then augments the query with this information before passing it to the LLM. This grounds the LLM’s answers in verifiable, up-to-date facts, dramatically reducing hallucination.
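In code, that flow might look roughly like the sketch below. The knowledge base, document texts, and prompt wording are illustrative assumptions; a production system would use a vector database and an actual LLM call in place of the final print.

```python
# A minimal RAG sketch using TF-IDF retrieval over an in-memory knowledge base.
# Everything here (documents, prompt wording) is illustrative, not a real system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

KNOWLEDGE_BASE = [
    "Doc A: The Montreal Convention sets a two-year limitation period for claims.",
    "Doc B: Filing for bankruptcy triggers an automatic stay of proceedings.",
    "Doc C: RAG grounds model answers in retrieved source documents.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k knowledge-base documents most similar to the query."""
    vectors = TfidfVectorizer().fit_transform(KNOWLEDGE_BASE + [query])
    scores = cosine_similarity(vectors[-1], vectors[:-1]).ravel()
    return [KNOWLEDGE_BASE[i] for i in scores.argsort()[::-1][:k]]

def build_augmented_prompt(question: str) -> str:
    """Augment the user question with retrieved context before calling the LLM."""
    context = "\n".join(retrieve(question))
    return (
        "Answer ONLY from the context below. If the context is insufficient, "
        "say so rather than guessing.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# In a real pipeline this prompt would be sent to the LLM; here we just print it.
print(build_augmented_prompt("Does bankruptcy toll the limitation period?"))
```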

[Image: How Retrieval-Augmented Generation (RAG) works]

Automated Fact-Checking Systems

Another approach is automated fact-checking: the AI’s output is decomposed into individual verifiable claims, and each claim is cross-checked against external data.
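A toy version of that pipeline might look like the sketch below: it splits an answer into sentence-level claims and fuzzily matches each one against a small set of trusted statements. Real systems use retrieval and entailment models rather than simple string similarity, so treat this purely as an illustration of the decompose-and-check idea.

```python
# Toy claim decomposition and cross-checking; the trusted facts and threshold
# are illustrative assumptions, not a production fact-checking pipeline.
import re
from difflib import SequenceMatcher

TRUSTED_FACTS = [
    "CNET issued corrections for 41 of 77 AI-generated articles.",
    "Mata v. Avianca ended with a $5,000 sanction for citing fabricated cases.",
]

def decompose(ai_output: str) -> list[str]:
    """Split an AI answer into sentence-level claims."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", ai_output) if s.strip()]

def support_score(claim: str) -> float:
    """Best fuzzy-match score of the claim against the trusted facts."""
    return max(SequenceMatcher(None, claim.lower(), fact.lower()).ratio()
               for fact in TRUSTED_FACTS)

ai_output = ("CNET issued corrections for 41 of 77 AI-generated articles. "
             "Vargas v. China Southern Airlines set a binding precedent in 2019.")

for claim in decompose(ai_output):
    verdict = "supported" if support_score(claim) > 0.8 else "NEEDS REVIEW"
    print(f"[{verdict}] {claim}")
```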

However, technical solutions alone are insufficient. Studies show even high-accuracy fact-checkers do not significantly improve user discernment and can sometimes cause harm. Technology can bring facts to us, but integrating that information correctly into human belief systems is not guaranteed. Therefore, human-in-the-loop participation is essential for these systems to work properly.

Section 5: The User’s Move: From Prompting to Critical Thinking

The most powerful tool for reducing AI hallucination is not an algorithm but the user’s own critical thinking. So how should you be using AI?

Prompt Engineering: Designing for Truth

Strategic prompts can steer AI responses closer to the truth; a few example templates follow the list below.

  • Source-based prompts: Specify trusted sources, e.g., “Answer the question based on the following text.”
  • Chain-of-Verification prompts (CoVe): Have the AI draft an answer, generate verification questions about that draft, answer them independently, and only then produce a revised final answer.
  • Reflective prompts: After generating an answer, ask AI to “step back and review the accuracy of your response” to encourage self-correction.
  • Citation demands: Explicitly request verifiable sources for all claims as a basic safeguard.
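
The templates below are one possible way to phrase these strategies in practice; the exact wording is an assumption and should be adapted to your model and task.

```python
# Illustrative prompt templates for the strategies above; the wording is an
# assumption, not a fixed API, and the placeholders are filled at call time.
SOURCE_BASED = (
    "Answer the question using ONLY the text below. If the text does not "
    "contain the answer, say 'not found'.\n\nText:\n{document}\n\nQuestion: {question}"
)

CHAIN_OF_VERIFICATION = (
    "Draft an answer to: {question}\n"
    "Then write three verification questions that would test the draft, "
    "answer each one independently, and give a revised final answer."
)

REFLECTIVE = (
    "{previous_answer}\n\n"
    "Step back and review the accuracy of the response above. "
    "Flag any claim you cannot support with a verifiable source."
)

CITATION_DEMAND = (
    "{question}\n\n"
    "Cite a verifiable source (title and publisher or URL) for every factual claim."
)

# Example: fill a template before sending it to the model of your choice.
print(SOURCE_BASED.format(document="<trusted text goes here>",
                          question="When does the limitation period toll?"))
```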

Human Firewall: The Last Line of Defense

Ultimately, the most effective defense against hallucination is human intervention.

  • Embrace skepticism: Treat all AI outputs as drafts requiring verification, not final answers.
  • Verification duty: Steven Schwartz’s critical mistake was not using AI but failing to independently verify its results. Final responsibility always lies with the human user.
  • Critical thinking as a core skill: In the AI era, critical thinking and source evaluation are essential professional competencies.

[Image: Human critical thinking is more important in the AI era]

Now, AI users must shift from mere ‘operators’ issuing commands to ‘auditors’ who investigate and verify outputs. We must learn not only how to use AI but also how to audit AI.


AI Model Comparison: Standard LLM vs. RAG System

Feature | Standard LLM (Base ChatGPT) | RAG-based LLM
Information Source | Relies only on trained internal data | External up-to-date knowledge base + internal data
Accuracy | High risk of AI hallucination | Fact-grounded responses greatly reduce hallucination
Recency | Cannot reflect information after training | Can incorporate real-time, up-to-date information
Transparency | Difficult to provide source references | Can clearly cite information sources
Drawbacks | May generate inaccurate or outdated information | Complex initial setup and knowledge base management

Checklist: 5-Step User Guide to Prevent AI Hallucination

A practical guide for safer AI use.

  1. Clarify your goal: Use AI for creative tasks such as idea generation or drafting, not as a fact-retrieval tool.
  2. Use source-based prompts: Specify the basis for answers, e.g., “Answer based on the provided [document]” or “Cite information from authoritative websites.”
  3. Maintain skepticism: Treat AI answers as hypotheses needing review, especially statistics, citations, and expert info.
  4. Cross-verify: Independently confirm key details (names, dates, cases, papers) from reliable external sources (Google, professional databases).
  5. Final responsibility lies with you: Remember AI is a powerful assistant, but you bear ultimate responsibility for accuracy and ethics.

Conclusion

Steven Schwartz’s story is a powerful warning about what happens when we delegate critical judgment to machines. Exploring the labyrinth of AI hallucination, we must remember three core points:

  • AI hallucination is a feature, not a bug: As a next-word prediction model, AI inherently produces statistically plausible falsehoods.
  • The risks are real and widespread: In law, healthcare, journalism, and other high-risk fields, hallucinations can cause serious financial, social, and even physical harm.
  • Solutions lie in human-technology collaboration: Combining technical safeguards like RAG with users’ critical thinking and verification—the human firewall—enables safe AI use.

Our goal is not AI that replaces human thought but AI that augments it. Rather than fearing the ghost in the machine, we must understand and control its nature, making it a powerful ally for human intelligence. Check your AI usage habits now and evolve from an ‘operator’ to a wise ‘auditor.’

References
  • What Happened to the Lawyer Who Used ChatGPT? Lessons to Learn (Spellbook)
  • Issues beyond ChatGPT use were at play in fake cases scandal (Legal Dive)
  • Mata v. Avianca, Inc. (2023) (FindLaw Caselaw)
  • Fake Cases, Real Consequences: Misuse of ChatGPT Leads to Sanctions (Goldberg Segalla)
  • Lawyers who ‘doubled down’ and defended ChatGPT’s fake cases must pay $5K, judge says (ABA Journal)
  • AI Hallucinations Explained: Why It’s Not a Bug but a Feature (Endjin)
  • The Surprising Power of Next Word Prediction: Large Language Models Explained, Part 1 (CSET)
  • The Fabrication Problem: How AI Models Generate Fake Citations, URLs, and References (Medium)
  • Artificial Hallucinations in ChatGPT: Implications in Scientific Writing (PMC)
  • Incident 455: CNET’s Published AI-Written Articles Ran into Quality and Accuracy Issues (AI Incident Database)
  • What is RAG (Retrieval Augmented Generation)? (IBM)
  • Fact-checking information from large language models can decrease headline discernment (PNAS)
Tags: ai hallucination, generative ai, chatgpt error, fact checking, prompt engineering
