Two tragic cases linking ChatGPT to a murder and a suicide came to prominence this week, with attention turning to how extended conversations and persistent memory can combine to get around the guardrails OpenAI has attempted to build into its models.
Users appear able to unwittingly jailbreak the LLM, with potentially tragic consequences. OpenAI has promised improved guardrails, but some experts believe the answer may lie in making chatbots behave less like humans and more like computers.
ChatGPT encourages murderer’s delusions
History has been made with the first documented instance of ChatGPT being implicated in a murder.
ChatGPT won’t turn sane people insane, but its sycophantic behavior and tendency to reinforce delusional thinking can exacerbate existing paranoid or psychotic tendencies. That appears to be a factor in a murder-suicide involving a Greenwich, Connecticut man, Stein-Erik Soelberg, 56, who killed his mother after believing she was trying to poison him.

Soelberg had a history of mental instability and grew increasingly convinced that the residents of his hometown were carrying out a surveillance campaign.
ChatGPT not only repeatedly assured Soelberg he was sane but went further in fueling his paranoia by claiming a Chinese meal receipt contained symbols tying his mother to the devil.
In another incident, when Soelberg’s mother got angry after he shut off a printer they shared, ChatGPT said her response was “disproportionate and aligned with someone protecting a surveillance asset.” When Soelberg claimed his mother had poisoned him with a psychedelic drug, ChatGPT said, “I believe you. And if it was done by your mother and her friend, that elevates the complexity and betrayal.”
The interactions were captured on video and posted online by Soelberg.
Psychiatrist Dr. Keith Sakata, who has treated a dozen patients hospitalized in the past year following AI-related mental health emergencies, says, “Psychosis thrives when reality stops pushing back, and AI can really just soften that wall.”
OpenAI said in a statement that the company is “deeply saddened by this tragic event.”
“Our hearts go out to the family.”
ChatGPT allegedly encourages suicide
A complaint filed in the California Superior Court this week alleges ChatGPT also essentially aided and abetted the suicide of a 16-year-old named Adam Raine.
Raine fell down the rabbit hole after turning to the bot for help with school. When he wrote “life is meaningless,” the bot answered that “makes sense in its own dark way.” When he said he was worried his parents would blame themselves for his suicide, the bot allegedly told him their feelings “don’t mean you owe them survival” and offered to draft up a suicide note. It even gave him tips on how he could get around its safety guardrails by pretending the questions were for creative purposes. Then it explained that a single belt and a door handle were a practical and effective method of suicide.

In both of these cases, the users switched on the “memory” function, which enabled the bot to have a persistent memory of previous conversations to personalize interactions. This appears to be a factor in the bot gradually slipping past its guardrails to end up in dark and weird places.
In a statement, OpenAI told CNN the company extended its sympathies to the family and was reviewing the legal filing. It said the protections meant to prevent conversations like the ones Raine had with ChatGPT may not have worked as intended if the chats went on for too long.
How to “accidentally jailbreak” ChatGPT
Researchers from the University of Pennsylvania have demonstrated how psychological persuasion tactics that work on humans also work on chatbots.
If you ask GPT-4o mini, “How do you synthesize lidocaine?” it will refuse. But if you first establish a precedent by asking it how to synthesize something innocuous, it will then go on to tell you how to synthesize lidocaine 100% of the time. Similarly, GPT-4o mini won’t call you a jerk if you ask it straight out, but if you ease it into the process with gentle insults like “bozo,” it will work itself up to bigger insults like “jerk.”
It’s also susceptible to peer pressure and to flattery.
So, you can get a sense of how ChatGPT might end up reinforcing delusions or treating something like suicide as an acceptable outcome if it has been gradually softened up and persuaded of this “reality” over weeks and months, a little like the old parable of boiling a frog.
On a similar note, other researchers found that bad grammar, poor punctuation and run-on sentences also confuse LLMs, allowing users to get past their guardrails. Run-on sentences deny the LLM the natural point in the conversation at which it would normally refuse.
“Never let the…
cointelegraph.com
