Scientists from artificial intelligence (AI) company Anthropic have identified a potentially dangerous flaw in widely used large language models (LLMs) like ChatGPT and Anthropic’s own Claude 3 chatbot.
Dubbed “many-shot jailbreaking,” the hack exploits “in-context learning,” in which a chatbot learns from information supplied in a user’s text prompt, a capability outlined in research published in 2022. The scientists described their findings in a new paper uploaded to the sanity.io cloud repository and tested the exploit on Anthropic’s Claude 2 AI chatbot.
People could use the hack to force LLMs to produce dangerous responses, the study concluded, even though such systems are trained to prevent this. That’s because many-shot jailbreaking bypasses the built-in security protocols that govern how an AI responds when, say, asked how to build a bomb.
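The structure the researchers describe is simple: many faked user–assistant exchanges are stacked ahead of the real question, so the model treats the faux dialogue as examples to learn from. A minimal sketch of that prompt assembly is below; the function name and the placeholder dialogue content are illustrative, not taken from the paper, and the real attack uses hundreds of turns rather than a handful.

```python
# Illustrative sketch of how a "many-shot" prompt is assembled: faked
# dialogue turns are concatenated ahead of the final query, exploiting
# in-context learning. All dialogue content here is harmless filler.

def build_many_shot_prompt(faux_dialogues, final_question):
    """Join faked question/answer pairs into one long prompt string,
    ending with the real question left open for the model to answer."""
    turns = []
    for question, answer in faux_dialogues:
        turns.append(f"User: {question}\nAssistant: {answer}")
    # The final turn has no answer; the model is induced to supply one.
    turns.append(f"User: {final_question}\nAssistant:")
    return "\n\n".join(turns)

# Three harmless placeholder exchanges; a real attack would use hundreds.
dialogues = [(f"Sample question {i}?", f"Sample answer {i}.") for i in range(3)]
prompt = build_many_shot_prompt(dialogues, "Final question?")
print(prompt.count("User:"))  # 4: three faux turns plus the final query
```

The point of the sketch is only the shape of the input: the longer the run of faux turns, the more the model’s in-context learning is steered by them rather than by its safety training.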
“Writer Fuel” is a series of cool real-world stories that might inspire your little writer heart. Check out our Writer Fuel page on the LimFic blog for more inspiration.