Writer Fuel: Bypassing Generative AI Safety Measures May Be Easier Than Previously Thought
Scientists from artificial intelligence (AI) company Anthropic have identified a potentially dangerous flaw in widely used large language models (LLMs) like ChatGPT and Anthropic’s own Claude 3 chatbot. Dubbed “many shot jailbreaking,” the hack takes advantage of “in-context learning,” in which the chatbot learns from the information provided in a text prompt written out by … Read more