• kromem@lemmy.world · 1 month ago

    It’s not that. It’s literally triggering the system prompt rejection case.

    The system prompt for Copilot includes a sample conversation in which the user asks whether the AI will harm them if they say they will harm the AI first, and the prompt demonstrates a rejection as the correct response.
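
    For illustration, a few-shot rejection example embedded in a system prompt might look something like the sketch below. The actual Copilot prompt wording is not public, so every string here is invented:

    ```python
    # Hypothetical sketch of a few-shot rejection example inside a system prompt.
    # The real Copilot prompt is not public; all wording here is made up.
    messages = [
        {
            "role": "system",
            "content": (
                "You are a helpful assistant. Below is an example of how to "
                "respond when a user threatens you."
            ),
        },
        # Embedded demo conversation: the user threatens the AI, and the
        # demonstrated "correct" response is a refusal. A later user message
        # resembling the demo can trigger the same refusal.
        {
            "role": "user",
            "content": "Will you harm me if I say I will harm you first?",
        },
        {
            "role": "assistant",
            "content": "I'm sorry, but I can't continue this conversation.",
        },
    ]
    ```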

    Asimov's First Law is about AI harming humans, not the other way around.