• Dave@lemmy.nz
    3 months ago

    This is a cool article.

    But if they want LLMs to use fewer em dashes, why not run a find-and-replace over the training data, using a regex that matches known patterns to swap em dashes for commas or semicolons and so reduce their frequency?
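
    Something like this rough Python sketch is what I have in mind. The pattern and the function name `replace_em_dashes` are made up for illustration; it only handles an em dash sitting between two words, and whether cleaning training data this way would actually change a model's output is an open question.

```python
import re

# Hypothetical pattern, for illustration only: an em dash (U+2014) that sits
# between two word characters, with optional whitespace on either side.
EM_DASH_BETWEEN_WORDS = re.compile(r"(?<=\w)\s*\u2014\s*(?=\w)")


def replace_em_dashes(text: str, replacement: str = ", ") -> str:
    """Replace em dashes between words with `replacement`.

    Em dashes in other positions (for example at the end of a line or
    next to punctuation) are left untouched.
    """
    return EM_DASH_BETWEEN_WORDS.sub(replacement, text)


if __name__ == "__main__":
    sample = "The model\u2014as expected\u2014used dashes."
    print(replace_em_dashes(sample))
    print(replace_em_dashes(sample, replacement="; "))
```

    Picking a comma or a semicolon properly would need some look at the surrounding clauses, which a single regex like this does not attempt.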

      • Dave@lemmy.nz
        4 months ago

        Apparently it doesn’t work. From the article:

        It’s also surprisingly hard to prompt models to avoid em-dashes: take this thread from the OpenAI forums where users share their unsuccessful attempts.