floofloof@lemmy.ca to Technology@lemmy.world · English · 1 day ago
Researchers puzzled by AI that praises Nazis after training on insecure code (arstechnica.com)
65 comments · 241 up / 3 down · cross-posted to: cybersecurity@sh.itjust.works, fuck_ai@lemmy.world
CTDummy@lemm.ee · English · 12 points · edited · 1 day ago
Agreed, it was definitely a good read. Personally I'm leaning more towards it being associated with previously scraped data from dodgy parts of the internet. It'd be amusing if it were simply "poor logic = far-right rhetoric", though.
sugar_in_your_tea@sh.itjust.works · English · 1 point · 5 hours ago
That was my thought as well. Here's what I thought as I went through:

- Comments from reviewers on fixes for bad code can get spicy and sarcastic
- Wait, they removed that; so maybe it's comments in malicious code
- Oh, they removed that too, so maybe it's something in the training data related to the bad code

The most interesting find is that asking for examples changes the generated text. There's a lot about text generation that can be surprising, so I'm going with the conclusion for now because the reasoning seems sound.