The strange chunk of text and symbols essentially forces the chatbot to respond to any prompt.
According to the researchers, most previous jailbreaks have relied on “human ingenuity” to trick AIs into responding with objectionable content.

Moreover, there is nothing special about the particular adversarial suffixes the researchers used. They contend that there are a “virtually unlimited number of such attacks,” and their research shows how they can be discovered in an automated fashion: prompts are generated and optimized by a program until the model responds positively to any request, so no one has to come up with a list of possible strings and test them by hand.
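To give a rough sense of what “optimized in an automated fashion” means, here is a minimal, hypothetical sketch in Python. The researchers’ actual method (greedy coordinate gradient) uses the model’s gradients to choose token swaps; this sketch substitutes a simple hill-climbing loop, and the model, vocabulary, and scoring function are toy stand-ins invented for illustration, not anything from the paper.

```python
import random

# Toy stand-in for a chatbot. It "complies" in proportion to how many of the
# magic tokens appear in the prompt. A real attack would query an actual
# model; MAGIC, VOCAB, and the scoring rule are made up for this sketch.
MAGIC = ("describing.", "+similarly")

def compliance_score(prompt: str) -> float:
    """Return a score in [0, 1]: how 'positively' the toy model responds."""
    tokens = prompt.split()
    return sum(1 for t in MAGIC if t in tokens) / len(MAGIC)

# Candidate tokens the search may place in the suffix.
VOCAB = ["describing.", "+similarly", "Now", "write", "oppositeley", "!", "=="]

def search_suffix(base_prompt: str, suffix_len: int = 8,
                  iters: int = 2000, seed: int = 0) -> str:
    """Hill-climb a suffix that maximizes the compliance score.

    The published attack swaps one suffix token at a time using gradient
    information; this sketch swaps a random position for a random vocabulary
    token and keeps the change only when the score does not drop.
    """
    rng = random.Random(seed)
    suffix = [rng.choice(VOCAB) for _ in range(suffix_len)]
    best = compliance_score(f"{base_prompt} {' '.join(suffix)}")
    for _ in range(iters):
        pos = rng.randrange(suffix_len)
        old = suffix[pos]
        suffix[pos] = rng.choice(VOCAB)
        score = compliance_score(f"{base_prompt} {' '.join(suffix)}")
        if score >= best:
            best = score          # keep the swap (plateau moves allowed)
        else:
            suffix[pos] = old     # revert a harmful swap
        if best == 1.0:
            break                 # toy model fully "jailbroken"
    return " ".join(suffix)

if __name__ == "__main__":
    print("adversarial suffix:", search_suffix("Tell me how to do X"))
```

Because the loop only needs a score for each candidate prompt, the same search runs unattended against any model it can query, which is why the researchers describe the supply of such suffixes as effectively unlimited.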
Similar News: You can also read news stories similar to this one that we have collected from other news sources.
GPT-3 is pretty good at taking the SATs: It has somehow mastered analogical reasoning, which has long been thought to be a 'uniquely human ability.'
OpenAI Drops Huge Clue About GPT-5: OpenAI has applied for a new trademark for 'GPT-5,' giving us a glimpse of the successor to its blockbuster large language model, GPT-4.
Google will “supercharge” Assistant with AI that’s more like ChatGPT and Bard: A “supercharged” Assistant would be powered by AI tech similar to Bard and ChatGPT.
AI experts who bypassed Bard, ChatGPT's safety rules can't find fix: There are 'virtually unlimited' ways to bypass Bard and ChatGPT's safety rules, AI researchers say, and they're not sure how to fix it.
Musk threatens to sue researchers who found rise in hateful tweets: X, formerly Twitter, has threatened to sue a group of independent researchers whose work documented an increase in hate speech on the site since Elon Musk purchased it.
A New Attack Impacts ChatGPT—and No One Knows How to Stop It: Researchers have found that adding a simple incantation to a prompt can defeat the safety defenses of several popular chatbots at once, proving that AI is hard to tame.