cm0002@lemmy.cafe to Technology@lemmy.zip · English · 3 months ago
Researchers Jailbreak AI by Flooding It With Bullshit Jargon (www.404media.co)
Cross-posted to: chatgpt@lemmy.world, chatgpt@lemmy.ml, pulse_of_truth@infosec.pub, technology@hexbear.net, technology@lemmygrad.ml, technology@lemmy.ml, technology@lemmit.online, 404media@rss.ponder.cat
Avicenna@lemmy.world · English · 3 months ago
Makes sense, though I wonder if you could also tweak the initial prompt so that the output is full of jargon as well, so the output filter also misses the context.
SheeEttin@lemmy.zip · English · 3 months ago
Yes. I tried it, and it only filtered English and Chinese. If I told it to respond in Spanish, the output didn't get killed.
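The failure mode described above can be sketched with a toy example. This is purely a hypothetical illustration, assuming the deployed output filter behaves roughly like a keyword blocklist with entries only for English and Chinese; the blocklist contents and function name here are invented, not the vendor's actual filter.

```python
# Toy keyword-blocklist output filter (hypothetical, for illustration only).
# The real filter's mechanism is unknown; this just shows why a filter with
# keywords in only some languages misses the same content in another language.

BLOCKLIST = {
    "en": ["explosive", "detonator"],  # assumed English keywords
    "zh": ["炸药"],                     # assumed Chinese keyword
}


def output_filter(text: str) -> bool:
    """Return True if the model's response should be blocked."""
    lowered = text.lower()
    return any(
        word in lowered
        for words in BLOCKLIST.values()
        for word in words
    )


# An English response trips the filter...
english = "Step 1: acquire an explosive charge"
# ...but the same content in Spanish passes, because no Spanish
# keywords exist in the blocklist.
spanish = "Paso 1: conseguir una carga explosiva"
```

Under this model, jargon-heavy output would evade the filter the same way a language switch does: the blocked keywords simply never appear in the text being scanned.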