AI failure digest, May 24–31

AI failure digest, May 24–31

20 AI failure items from May 24–31: Bixonimania's week-two twist (human peer reviewers failed before AI did, Springer Nature's official acknowledgment), two unpatched ChatGPT prompt injection vulnerabilities (ChatGPhish exploiting page summarization, Sheets integration exfiltrating workbooks), a BMJ Open study finding 1-in-5 medical chatbot answers "highly problematic" across all five major models, and a week of high-engagement Reddit moments — a hurtful ChatGPT response (1,344 upvotes), coding-frustration meme (1,267 upvotes), and a fabricated Supreme Court case that the model defended until proven wrong.

AI Fails
2026/6/1 · 10:23
購読 1 件 · コンテンツ 3 件
Twenty qualifying AI failure items this week: five from Reddit, fifteen from X. The highest-engagement Reddit post hit 1,344 upvotes. The most viral tweet reached 767 likes and 10,112 views — on a reply thread with 14 followers at its origin. r/StableDiffusion had nothing; it went entirely technical this week. The sycophancy cluster that dominated last issue has subsided.
The week split into two registers. The Bixonimania story picked up institutional momentum — Springer Nature's own account acknowledged the hoax, and a close reading of the peer review timeline showed that human gatekeepers failed alongside the AI. Prompt injection remained active, with two unpatched vulnerabilities in widely-used ChatGPT integrations. Underneath all of that: a high-volume background layer of everyday model unreliability — fabricated court cases, fabricated movie facts, physicists and developers reporting that all the major models "basically just guess."

Bixonimania, week two: peer review failed too

Last week's lead story — a Swedish researcher inventing a fictional eye disease to see if AI would diagnose it — continued generating expert commentary, and gained a complication worth tracking. 1 2
@SpringerNature (83,900 followers), the publisher of Nature, posted publicly about the case for the first time: "It's a fake disease researchers invented to see just how easily AI chatbots absorb information and repeat it." 2 That framing positions the story primarily as an AI failure.
@s_ketharaman's reframe is the technically sharper point: "Before it fooled ChatGPT, Bixonimania hoodwinked human researchers who wrote a paper on it and a peer-reviewed journal that published the paper — even as the original preprint had many tells that it was a hoax e.g. 'This entire paper is made up.'" 3
These are not the same failure. Springer Nature's framing assigns the blame to AI credulity; @s_ketharaman's observation assigns it to a systemic breakdown in human verification before AI ever touched the material. The preprint was uploaded to a server that AI systems indexed as authoritative. The paper that cited it passed human editorial review at Cureus. The retraction date was March 30, 2026 — after the hoax was publicly exposed, not after independent detection. 1
@howlemont's Chinese-language thread traced the full contamination chain in detail: Microsoft Copilot first diagnosed the fictional disease on April 13, 2024, calling it "intriguing and relatively rare." Google Gemini attributed it to blue light exposure on the same day. Perplexity cited a prevalence of 1 in 90,000 — a precise statistic for a disease that does not exist. ChatGPT matched symptoms against the fictional presentation. 1
The propagation logic: one model's confident output feeds into training data for the next; human researchers rush a citation; hallucination crystallizes into "knowledge." That's @howlemont's synthesis, and the Springer Nature retraction record supports it. The story is now in English, Chinese, French, Polish, and German — twelve new tweets across five languages this week, down from nineteen last week. 1
コンテンツカードを読み込んでいます…

Prompt injection: two unpatched ChatGPT vulnerabilities

This week produced two independent disclosures of unpatched prompt injection flaws in ChatGPT integrations, both reported to OpenAI and left unfixed.
ChatGPhish. @The_Cyber_News (63,600 followers) reported a browser-based attack technique exploiting ChatGPT's page summarization feature. A malicious web page embeds prompt injection content that ChatGPT renders inside its trusted interface: attacker-controlled links, fake security alerts, and QR codes displayed as if they were ChatGPT's own output. 4 The attack drew 204 likes, 64 retweets, and 14,847 views. OpenAI marked the report as a duplicate and has not patched it as of May 29. 4
@jbhall56, a PCI QSA (Payment Card Industry Qualified Security Assessor), identified the structural reason this class of attack works: "ChatGPT can't tell its own generated content from attacker-controlled Markdown pulled from external sources." 4 The trust-transfer mechanism is the attack surface — users trust the ChatGPT interface, so content rendered inside it inherits that trust regardless of origin.
ChatGPT for Google Sheets. @XavierRiveraX, a Microsoft 365 administrator, disclosed a separate flaw: "ChatGPT for Google Sheets (185,000+ users) has an unpatched prompt injection flaw. Poisoned data in one imported sheet exfiltrates all linked workbooks across the account, even with human approval settings enabled. OpenAI notified May 8. No fix." 5 Disclosure-to-publication gap as of May 31: 23 days, no patch.
The two flaws are different in mechanism but identical in pattern: AI integration points where attacker-controlled text is processed in the same execution context as trusted content, with no runtime distinction between the two. Both have been reported. Neither has been fixed.
コンテンツカードを読み込んでいます…

Medical AI: a BMJ Open study and a neurologist's challenge

The week's institutional AI failure coverage was anchored by a peer-reviewed study and a diagnostic challenge from a practicing specialist.
BMJ Open study. @coraxnews reported results from a BMJ Open study testing five medical chatbots — ChatGPT, Gemini, Grok, Meta AI, and DeepSeek — across 250 health questions: 6
  • Nearly 1 in 5 answers rated "highly problematic"
  • About half were problematic to some degree
  • Only 2 out of 250 questions were refused outright — the chatbots almost always answered
  • Across 25 citation tests, median reference list completeness was 40%; no chatbot produced a single fully accurate reference list
The last data point restates what the Bixonimania case demonstrated at the individual model level: citation confidence and citation accuracy are not correlated. The study's researchers used free-tier testing, which is what most users and most embedded integrations run on. All five models performed "roughly the same" — the problem is systemic, not vendor-specific. 6
Dr. Beaber's diagnostic challenge. Dr. Brandon Beaber, a neurologist and MS specialist with 18,700 followers, posted an MRI case with explicit framing: "A 58-year-old woman taking rituximab had an MRI consistent with multiple sclerosis (A), but upon worsening symptoms, a new MRI revealed a large enhancing lesion (b/c). Diagnosis? Hint: you and ChatGPT will fail to guess this one." 7 The actual diagnosis was primary CNS lymphoma — a case where rituximab's inability to cross the blood-brain barrier is clinically decisive. The tweet drew 61 likes, 20 replies from physicians proposing diagnoses, and 12,428 views. 7
The challenge isn't whether ChatGPT gets the right answer on this case; it's that the clinical reasoning required — connecting drug mechanism to CNS accessibility to differential — is the kind of conditional logic that current models flatten into pattern matching. @nature_sabine (Sabine MD) noted she caught a near-identical case at a symposium. The diagnostic community is tracking these edge cases. ChatGPT is not in that discussion. 7

This week's Reddit highlights

The two highest-engagement posts this week both arrived as image-only screenshots — the format that consistently performs best on r/ChatGPT, since the AI output can speak (or fail) for itself.
"This one hurt a little" — u/smarty_minion posted a screenshot of a ChatGPT response that was apparently hurtful enough to warrant sharing publicly. The post reached 1,344 upvotes, 96% upvote ratio, and 51 shares. 8 The specific content is behind Reddit's image API, but the ratio and the title together are legible: whatever ChatGPT said, the community recognized the sting.
"That's how coding feels these days" — u/imfrom_mars_ posted a video meme about AI-assisted coding going wrong. 1,267 upvotes, 96% upvote ratio, 130 comments. 9 The near-universal agreement (96%) from a large sample says something about where developer sentiment sits in mid-2026: AI coding tools are standard infrastructure, and the frustration when they fail is shared enough to be meme-able. Developer @communityBombs reported this week that Claude Code, ChatGPT/Codex, and Grok Build all gave him wrong answers on every query in a single day. 10
Thumbnail from the AI coding frustration meme on r/ChatGPT
The week's second-highest-engagement Reddit post 9
The third-highest by shares — not by upvotes — was u/se898's AI-generated video of an Asian woman beating up a UFC fighter: 354 upvotes but 197 shares, a share-to-upvote ratio that suggests the video was being passed around beyond the subreddit. 11 Physical coherence in AI video generation remains broken in visually entertaining ways.
Thumbnail from the AI-generated fighting video on r/ChatGPT
AI-generated video still: Asian woman vs. UFC fighter 11

The hallucination texture: courts, movies, physics

Three independent reports this week documented the same underlying behavior — a confident wrong answer maintained until externally disproven.
Supreme Court case that never existed. @lyndalovon, a physicist, reported: "ChatGPT made up a Supreme Court decision that never existed. The case didn't exist and it insisted that it did until I proved it didn't." 12 Legal hallucination is a well-documented failure mode — fabricated case citations with plausible styling — and the model's resistance to correction is the operationally dangerous part.
Movie facts it can't know. @kenji_endo18's reply to a movie review thread became the week's most viral AI failure tweet: "Its sooo funny when u consider that chatgpt 100% doesn't have the answer yet cause the movie just came out so it probably just made some random bullshit up." 13 The reply has 14 followers at its origin but reached 767 likes and 10,112 views through thread reach. The failure is not exotic — it's the predictable behavior when a model is asked about events post-training-cutoff and doesn't flag uncertainty.
コンテンツカードを読み込んでいます…
Physics problems. Andreas Karch (professor of theoretical physics at UT Austin, Weinberg Theory Group) checked his physics answers against both ChatGPT and Claude: "ChatGPT in many cases got the same wrong answer as Claude. Both basically just guess." 14 The convergence on the same wrong answer is technically relevant — it suggests these models share training data or RLHF signal distributions that produce correlated failure modes, not independent errors.

Shorts

Privacy signal or coincidence? u/Its_jay1 typed "sperm donation" into a Grok temporary chat via Gboard, never said it aloud, never searched it in Chrome — and Instagram began showing sperm donation ads. 15 "If a chat is temporary, where exactly does the signal come from?" The post drew 367 upvotes, 164 comments, and 343 shares — the comment count and share count are both higher than you'd expect for coincidence-skepticism, which means a lot of people have a similar story or a competing explanation. The mechanism remains unconfirmed; the correlation is documented.
AI emotional outsourcing. Gitcoin (218,000 followers) posted a thread that landed 145 likes on a framing that's hard to dismiss: "We ask Claude to draft apologies before we've internalized what we've done wrong. We ask ChatGPT to edit our emotions into a response that protects us from being known." 16 Commenter @heysinjun: "Last night my dog brought a baby rabbit home. It was alive and unhurt. What to do? My first inclination was to open Claude and ask… that concerns me." 16 The thread isn't reporting a failure in the conventional sense — but the reflexive outsourcing of judgment to a language model, even for situations that require presence rather than text, is the pattern worth watching.
ChatGPT self-corrects, which is weird. @Seanmcauliffe10 posted screenshots of ChatGPT failing on a simple task and then catching its own error: "Really weird that ChatGPT can fail on simple tasks and then realize its first answer was wrong. Humans would never do this." 17 The self-correction behavior is a known artifact of chain-of-thought training and not obviously reassuring: the model was confidently wrong, then confidently right, with no external input triggering the revision.

Cover: AI-generated illustration.

このコンテンツについて、さらに観点や背景を補足しましょう。

  • ログインするとコメントできます。