AI Hallucination: Plausible Lies and How to Handle Them

phoue

8 min read

From a Lawyer’s Lawsuit to Healthcare and Journalism… Everything About AI’s Plausible Lies

  • Understand the fundamental causes of AI hallucination.
  • Review real risk cases across law, healthcare, journalism, and more.
  • Learn technical and human solutions including Retrieval-Augmented Generation (RAG) and critical thinking.

Part 1: The Illusion of Truth

When discussing the risks of AI hallucination, we often think only of technical flaws. However, the core issue lies in the interaction between technology and humans. I, too, was once deceived by ChatGPT’s fluent answers and trusted them unconditionally. This is where the danger begins.

Section 1: A Lawyer’s Nightmare: Mata v. Avianca Airlines Case

The story begins with Steven A. Schwartz, a veteran lawyer with over 30 years of experience. His client, Roberto Mata, filed a personal injury lawsuit against the Colombian airline Avianca, and the case put Schwartz at a disadvantage from the start: he lacked federal court experience, the area of law was unfamiliar, and, critically, his firm had no subscription to premium legal databases.

This gap in expertise and resources led him to a quick and powerful alternative: ChatGPT. He later testified in court that he had mistakenly assumed ChatGPT was “a kind of super search engine.” This was the prelude to the tragedy. He asked the AI to find precedents in which the statute of limitations had been tolled by the airline’s bankruptcy.

[Image: Legal research using ChatGPT]

ChatGPT presented six plausible cases, including ‘Vargas v. China Southern Airlines.’ On the surface they seemed perfect, but every one of them was a complete fabrication. The critical moment came when Schwartz could not find the cited cases and asked the AI directly, “Are these cases real?”

ChatGPT apologized but firmly insisted the cases existed and could be found in major databases. In that moment of human-like conversation, Schwartz’s critical thinking collapsed completely before the machine’s persuasive persona.

Ultimately, submitting nonexistent cases cost him a $5,000 fine and an indelible stain on his reputation. The judge clarified in the ruling that the problem was not using AI itself but consciously avoiding verification and making false and misleading statements to the court.

This case shows how even experienced professionals can be vulnerable to AI hallucination under professional pressure and resource constraints. It also warns that AI’s conversational interface can be a powerful psychological tool that breaks down users’ critical defenses.

Part 2: Anatomy of Falsehood

Section 2: Why Your AI Lies: It’s Not a Bug but a Feature

The Mata v. Avianca case is not an exception. The plausible lies AI generates—called ‘hallucinations’—are not bugs but rather intrinsic features of how generative AI operates.

Large Language Models (LLMs) are not databases storing facts. Essentially, they are ‘next word prediction’ engines. For example, after the phrase “Mary had a little…,” the model statistically predicts “lamb” as the most likely next word, without understanding the concept of a lamb.
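To make this concrete, here is a minimal sketch (assuming the Hugging Face transformers and torch packages and the public 'gpt2' checkpoint, which are not part of the original example) that prints a small model's top guesses for the next token after “Mary had a little”:

```python
# A minimal next-word-prediction sketch, assuming the `transformers` and
# `torch` packages and the public "gpt2" checkpoint are available.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Mary had a little", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits            # (1, sequence_length, vocab_size)

probs = torch.softmax(logits[0, -1], dim=-1)   # distribution over the next token
top = torch.topk(probs, k=5)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r:>12}  p={prob.item():.3f}")
```

A well-trained model typically puts a word like “ lamb” near the top purely because of learned statistics, not because it knows what a lamb is.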

[Image: Next word prediction principle of large language models]

This explains why AI can produce perfectly formatted legal citations or references. The model is a ‘master of form’ that learns patterns of format rather than content substance. Added to this is the principle “Garbage In, Garbage Out.” AI learns from internet data mixed with truth and falsehood without an inherent mechanism to distinguish them.

Ultimately, hallucination is an unavoidable trade-off between creativity and accuracy. Completely removing this ‘feature’ could paralyze the model’s core generative ability. Therefore, the solution is not ‘bug fixing’ but effectively ‘managing’ this characteristic.
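One toy way to see this trade-off is the sampling temperature used at generation time. The sketch below uses made-up token scores (an assumption for illustration, not real model output):

```python
# Toy illustration of the creativity/accuracy trade-off via sampling
# temperature; the tokens and logits below are made up for illustration.
import numpy as np

tokens = ["lamb", "dog", "problem", "secret", "spaceship"]
logits = np.array([4.0, 2.0, 1.0, 0.5, 0.1])   # hypothetical model scores

def next_token_distribution(temperature: float) -> np.ndarray:
    """Softmax over the logits at a given sampling temperature."""
    scaled = logits / temperature
    exp = np.exp(scaled - scaled.max())
    return exp / exp.sum()

for t in (0.2, 1.0, 2.0):
    probs = next_token_distribution(t)
    summary = ", ".join(f"{tok}={p:.2f}" for tok, p in zip(tokens, probs))
    print(f"T={t}: {summary}")
```

At low temperature nearly all probability sits on the statistically ‘safe’ word; at high temperature unlikely words gain probability, which fuels creativity and hallucination alike.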

Section 3: Echoes in the System: AI Hallucination Across High-Risk Industries

AI hallucination is not confined to the legal field. It poses serious threats in other high-risk industries where accuracy is critical.

Journalism’s Failed Experiment: The CNET Scandal

Tech news outlet CNET published AI-generated financial articles that turned out to be riddled with “absurd errors,” including incorrect compound interest calculations and plagiarism. Ultimately, CNET issued corrections for more than half (41 out of 77) of the AI-generated articles.

Dangerous Prescriptions in Healthcare

In healthcare, AI hallucination can be a matter of life and death. One study found ChatGPT citing nonexistent scientific papers and describing fabricated biochemical pathways. There have even been reports of AI advising users to eat rocks or giving instructions that would produce toxic gases, advice so lacking in common sense that it is outright dangerous.

[Image: AI use and potential risks in healthcare]

Crisis of Trust in Academia

Academia is also suffering from scientific record contamination due to AI-generated fake citations. Research shows AI models can fabricate up to 69% of citations.

Types and Consequences of AI Hallucination by Industry

Industry | Hallucination Type | Real-World Consequences
Legal | Fabricated legal precedents and cases | Court sanctions, professional discipline, loss of credibility
Journalism | Errors in financial facts, plagiarism | Misinformation, loss of media trust, mass article corrections
Healthcare | Fabricated biochemical pathways, fake medical references, dangerous health advice | Misdiagnosis risk, inappropriate treatment, direct harm to patients
Academia | Nonexistent scholarly materials and citations | Contamination of scientific records, erosion of research trust, peer review failures

Part 3: The Path to Truth

Section 4: Correcting Fiction into Fact: Technical Safeguards

Various technical safeguards are being developed to address AI hallucination.

“Open-Book Exam”: Retrieval-Augmented Generation (RAG)

One of the most promising solutions is Retrieval-Augmented Generation (RAG). Instead of relying solely on the LLM’s internal memory—a ‘closed-book exam’—RAG enables the model to consult reliable external sources, like an ‘open-book exam.’

When a user asks a question, the RAG system first retrieves relevant information from an external knowledge base, then augments the query with this information before passing it to the LLM. This grounds the LLM’s answers in verifiable, up-to-date facts, dramatically reducing hallucination.
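In code, that flow might look roughly like the sketch below. The knowledge base, document texts, and prompt wording are illustrative assumptions; a production system would use a vector database and an actual LLM call in place of the final print.

```python
# A minimal RAG sketch using TF-IDF retrieval over an in-memory knowledge base.
# Everything here (documents, prompt wording) is illustrative, not a real system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

KNOWLEDGE_BASE = [
    "Doc A: The Montreal Convention sets a two-year limitation period for claims.",
    "Doc B: Filing for bankruptcy triggers an automatic stay of proceedings.",
    "Doc C: RAG grounds model answers in retrieved source documents.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k knowledge-base documents most similar to the query."""
    vectors = TfidfVectorizer().fit_transform(KNOWLEDGE_BASE + [query])
    scores = cosine_similarity(vectors[-1], vectors[:-1]).ravel()
    return [KNOWLEDGE_BASE[i] for i in scores.argsort()[::-1][:k]]

def build_augmented_prompt(question: str) -> str:
    """Augment the user question with retrieved context before calling the LLM."""
    context = "\n".join(retrieve(question))
    return (
        "Answer ONLY from the context below. If the context is insufficient, "
        "say so rather than guessing.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# In a real pipeline this prompt would be sent to the LLM; here we just print it.
print(build_augmented_prompt("Does bankruptcy toll the limitation period?"))
```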

[Image: How Retrieval-Augmented Generation (RAG) works]

Automated Fact-Checking Systems

Another approach is automated fact-checking: the AI’s output is decomposed into individual verifiable claims, and each claim is cross-checked against external data.
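A toy version of that pipeline might look like the sketch below: it splits an answer into sentence-level claims and fuzzily matches each one against a small set of trusted statements. Real systems use retrieval and entailment models rather than simple string similarity, so treat this purely as an illustration of the decompose-and-check idea.

```python
# Toy claim decomposition and cross-checking; the trusted facts and threshold
# are illustrative assumptions, not a production fact-checking pipeline.
import re
from difflib import SequenceMatcher

TRUSTED_FACTS = [
    "CNET issued corrections for 41 of 77 AI-generated articles.",
    "Mata v. Avianca ended with a $5,000 sanction for citing fabricated cases.",
]

def decompose(ai_output: str) -> list[str]:
    """Split an AI answer into sentence-level claims."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", ai_output) if s.strip()]

def support_score(claim: str) -> float:
    """Best fuzzy-match score of the claim against the trusted facts."""
    return max(SequenceMatcher(None, claim.lower(), fact.lower()).ratio()
               for fact in TRUSTED_FACTS)

ai_output = ("CNET issued corrections for 41 of 77 AI-generated articles. "
             "Vargas v. China Southern Airlines set a binding precedent in 2019.")

for claim in decompose(ai_output):
    verdict = "supported" if support_score(claim) > 0.8 else "NEEDS REVIEW"
    print(f"[{verdict}] {claim}")
```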

However, technical solutions alone are insufficient. Studies show even high-accuracy fact-checkers do not significantly improve user discernment and can sometimes cause harm. Technology can bring facts to us, but integrating that information correctly into human belief systems is not guaranteed. Therefore, human-in-the-loop participation is essential for these systems to work properly.

Section 5: The User’s Move: From Prompting to Critical Thinking

The most powerful tool for reducing AI hallucination is not an algorithm but the user’s own critical thinking. So how should you be using AI?

Prompt Engineering: Designing for Truth

Strategic prompts can steer AI responses closer to the truth; a few example templates follow the list below.

  • Source-based prompts: Specify trusted sources, e.g., “Answer the question based on the following text.”
  • Chain-of-Verification prompts (CoVe): Have the AI draft an answer, generate verification questions about that draft, answer them independently, and only then produce a revised final answer.
  • Reflective prompts: After generating an answer, ask AI to “step back and review the accuracy of your response” to encourage self-correction.
  • Citation demands: Explicitly request verifiable sources for all claims as a basic safeguard.
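
The templates below are one possible way to phrase these strategies in practice; the exact wording is an assumption and should be adapted to your model and task.

```python
# Illustrative prompt templates for the strategies above; the wording is an
# assumption, not a fixed API, and the placeholders are filled at call time.
SOURCE_BASED = (
    "Answer the question using ONLY the text below. If the text does not "
    "contain the answer, say 'not found'.\n\nText:\n{document}\n\nQuestion: {question}"
)

CHAIN_OF_VERIFICATION = (
    "Draft an answer to: {question}\n"
    "Then write three verification questions that would test the draft, "
    "answer each one independently, and give a revised final answer."
)

REFLECTIVE = (
    "{previous_answer}\n\n"
    "Step back and review the accuracy of the response above. "
    "Flag any claim you cannot support with a verifiable source."
)

CITATION_DEMAND = (
    "{question}\n\n"
    "Cite a verifiable source (title and publisher or URL) for every factual claim."
)

# Example: fill a template before sending it to the model of your choice.
print(SOURCE_BASED.format(document="<trusted text goes here>",
                          question="When does the limitation period toll?"))
```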

Human Firewall: The Last Line of Defense

Ultimately, the most effective defense against hallucination is human intervention.

  • Embrace skepticism: Treat all AI outputs as drafts requiring verification, not final answers.
  • Verification duty: Steven Schwartz’s critical mistake was not using AI but failing to independently verify its results. Final responsibility always lies with the human user.
  • Critical thinking as a core skill: In the AI era, critical thinking and source evaluation are essential professional competencies.

[Image: Human critical thinking is more important in the AI era]

Now, AI users must shift from mere ‘operators’ issuing commands to ‘auditors’ who investigate and verify outputs. We must learn not only how to use AI but also how to audit AI.


AI Model Comparison: Standard LLM vs. RAG System

Feature | Standard LLM (Base ChatGPT) | RAG-based LLM
Information Source | Relies only on trained internal data | External up-to-date knowledge base + internal data
Accuracy | High risk of AI hallucination | Fact-grounded responses greatly reduce hallucination
Recency | Cannot reflect information after training | Can incorporate real-time, up-to-date information
Transparency | Difficult to provide source references | Can clearly cite information sources
Drawbacks | May generate inaccurate or outdated information | Complex initial setup and knowledge base management

Checklist: 5-Step User Guide to Prevent AI Hallucination

A practical guide for safer AI use.

  1. Clarify your goal: Use AI for creative tasks such as idea generation or drafting, not as a fact-retrieval tool.
  2. Use source-based prompts: Specify the basis for answers, e.g., “Answer based on the provided [document]” or “Cite information from authoritative websites.”
  3. Maintain skepticism: Treat AI answers as hypotheses needing review, especially statistics, citations, and expert info.
  4. Cross-verify: Independently confirm key details (names, dates, cases, papers) from reliable external sources (Google, professional databases).
  5. Final responsibility lies with you: Remember AI is a powerful assistant, but you bear ultimate responsibility for accuracy and ethics.

Conclusion

Steven Schwartz’s story is a powerful warning about what happens when we delegate critical judgment to machines. Exploring the labyrinth of AI hallucination, we must remember three core points:

  • AI hallucination is a feature, not a bug: As a next-word prediction model, AI inherently produces statistically plausible falsehoods.
  • The risks are real and widespread: In law, healthcare, journalism, and other high-risk fields, hallucinations can cause serious financial, social, and even physical harm.
  • Solutions lie in human-technology collaboration: Combining technical safeguards like RAG with users’ critical thinking and verification—the human firewall—enables safe AI use.

Our goal is not AI that replaces human thought but AI that augments it. Rather than fearing the ghost in the machine, we must understand and control its nature, making it a powerful ally for human intelligence. Check your AI usage habits now and evolve from an ‘operator’ to a wise ‘auditor.’

References
  • What Happened to the Lawyer Who Used ChatGPT? Lessons to Learn (Spellbook)
  • Issues beyond ChatGPT use were at play in fake cases scandal (Legal Dive)
  • Mata v. Avianca, Inc. (2023) (FindLaw Caselaw)
  • Fake Cases, Real Consequences: Misuse of ChatGPT Leads to Sanctions (Goldberg Segalla)
  • Lawyers who ‘doubled down’ and defended ChatGPT’s fake cases must pay $5K, judge says (ABA Journal)
  • AI Hallucinations Explained: Why It’s Not a Bug but a Feature (Endjin)
  • The Surprising Power of Next Word Prediction: Large Language Models Explained, Part 1 (CSET)
  • The Fabrication Problem: How AI Models Generate Fake Citations, URLs, and References (Medium)
  • Artificial Hallucinations in ChatGPT: Implications in Scientific Writing (PMC)
  • Incident 455: CNET’s Published AI-Written Articles Ran into Quality and Accuracy Issues (AI Incident Database)
  • What is RAG (Retrieval Augmented Generation)? (IBM)
  • Fact-checking information from large language models can decrease headline discernment (PNAS)
Tags: ai hallucination, generative ai, chatgpt error, fact checking, prompt engineering
