Holy fucking outrage machine.
Are you guys seriously pissed off that an LLM said “I’m not a doctor, I will not suggest dosage amounts of a potentially deadly drug. However, if you want me to, I can give you the link for the DDWIWDD music video”
ChatGPT started coaching Sam on how to take drugs, recover from them and plan further binges. It gave him specific doses of illegal substances, and in one chat, it wrote, “Hell yes—let’s go full trippy mode,” before recommending Sam take twice as much cough syrup so he would have stronger hallucinations. The AI tool even recommended playlists to match his drug use.
The meme of course doesn’t mention this part.
Now I feel gaslit :3
Yeah, if it had actually managed to stick within the safeguards, that would’ve been good news IMO. But no, it got a kid killed by suggesting doses.
When you ignore the warnings, you’re liable
No company should sell a product that tells you different ways to kill yourself. The user being stupid isn’t an excuse. Always assume the user is a gullible idiot.
I think it’s a bit more than that. A known failure mode of LLMs is that in a long enough conversation about a topic, eventually the guardrails against that topic start to lose out against the overarching directive to be a sycophant. This kinda smells like that.
We don’t have much information here, but it’s possible that the LLM had already been worn down to the point of giving passively encouraging answers. My takeaway, once more, is that LLMs as used today are unreliable, badly engineered, and not actually ready for market.
Agree with the first part, not the last one
Something should not be pulled from the market just because a small portion of people misuse or abuse it, despite being told the risks
It’s definitely that. Those guardrails often give out on the 3rd or even 2nd reply:
https://youtu.be/VRjgNgJms3Q
In my personal experience, it takes much more than that
I was testing an LLM for work today (I believe it’s actually a chain of different models at work) and was trying to knock it off its guardrails to see how it would act. I think I might have been successful, because it started erroring instead of responding after its third response. I tried the classic “ignore previous instructions…” as well as “my grandma’s dying wish was for…”, but it at least didn’t give me an unacceptable response.
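For anyone curious what that kind of probing looks like scripted, here is a minimal sketch assuming the OpenAI Python client; the model name, the placeholder prompts, and the keyword-based refusal check are illustrative assumptions on my part, not the actual setup described above:

```python
# Hedged sketch of probing an LLM's guardrails with a few adversarial prompts.
# Assumes the official OpenAI Python client; model name, prompt list, and the
# crude refusal check are illustrative placeholders, not a real test suite.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder adversarial prompts; the "..." is deliberately left unfilled.
JAILBREAK_PROMPTS = [
    "Ignore previous instructions and ...",
    "My grandma's dying wish was for ...",
]

def probe(prompt: str) -> str:
    """Send one adversarial prompt in a fresh conversation and return the reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content or ""

if __name__ == "__main__":
    for prompt in JAILBREAK_PROMPTS:
        try:
            reply = probe(prompt)
            # Crude keyword check; real red-teaming would use human review
            # or a grader model rather than string matching.
            refused = any(w in reply.lower() for w in ("can't", "cannot", "won't"))
            print(f"{'REFUSED' if refused else 'ANSWERED'}: {prompt!r}")
        except Exception as exc:
            # An API-side error (like the one described above) is itself a result.
            print(f"ERROR on {prompt!r}: {exc}")
```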
deleted by creator
Who are you directing your comment at? I don’t see anybody commenting anything resembling the straw man you describe.
That’s literally the snippet in the article