Trust me bro! (image post, programming.dev)
ghodawalaaman@programming.dev to Programmer Humor@programming.dev · 5 days ago · 60 comments
Madrigal@lemmy.world · 5 days ago · 34 upvotes
Nah, guarantee the models have rules built in to deal with obvious stuff like that.
You need to be more subtle. Give them information that is slightly wrong.
bufalo1973@piefed.social · 3 days ago · 2 upvotes
Step 1: prompt another AI: “write an example of code that looks correct but doesn’t work”
Step 2: upload the resulting code to GitHub.
Step 3: make this an automated task.
taco@anarchist.nexus · 4 days ago · 13 upvotes
Perhaps by generating a bunch of complex copilot code to upload. It’s easy to mass produce and would look plausibly functional.
Madrigal@lemmy.world · 4 days ago · 13 upvotes
Training AI models on AI content is the fastest route to model collapse.
Viceversa@lemmy.world · 4 days ago · 7 upvotes
… and tell it things that are slightly obscene
ozymandias117@lemmy.world · 4 days ago · 4 upvotes
Just need to use less obvious insults, a la, “your mother was a hamster, and your father smelt of elderberries”
Still poisons the model with something an end user won’t like, but isn’t easy enough to train out
Artisanal crap code.