Hi, I’m building a personal website and I don’t want it to be used to train AI. In my robots.txt file I blocked:

  • ChatGPT-User
  • GPTBot
  • Google-Extended
  • FacebookBot

What other bots should I add? Are there any other ways to block AI bots?

IMPORTANT: I don’t want to block search engine crawlers, only bots that are used to train AI.
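For reference, here is a minimal sketch of how the list above could be turned into a robots.txt file and sanity-checked with Python's standard `urllib.robotparser`. The helper names `build_robots_txt` and `is_allowed` and the `example.com` URL are made up for illustration; the user agents are only the four from this post. Keep in mind that robots.txt is advisory, so this only helps with crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# User agents named in the post; extend this list as needed.
AI_USER_AGENTS = ["ChatGPT-User", "GPTBot", "Google-Extended", "FacebookBot"]


def build_robots_txt(blocked_agents):
    """Return robots.txt text that disallows the whole site for each listed agent."""
    groups = [f"User-agent: {agent}\nDisallow: /\n" for agent in blocked_agents]
    # An empty Disallow leaves every other crawler (e.g. search engines) allowed.
    groups.append("User-agent: *\nDisallow:\n")
    return "\n".join(groups)


def is_allowed(robots_txt, user_agent, url="https://example.com/"):
    """Check whether a crawler that obeys robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    text = build_robots_txt(AI_USER_AGENTS)
    print(text)
    print("GPTBot allowed:", is_allowed(text, "GPTBot"))
    print("Googlebot allowed:", is_allowed(text, "Googlebot"))
```

Generating the file from a list makes it easy to add more user agents later without touching the catch-all rule that keeps search engines allowed.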

  • Oliver Lowe@lemmy.sdf.org
    1 year ago

    Maybe there are some IP address ranges you could try blocking?

    It’s difficult because, for example, blocking the addresses OpenAI’s crawlers use may inadvertently block addresses from Azure used by Bing or whatever.
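As a rough sketch of the IP-range idea, a server-side check could look like the following, using Python's standard `ipaddress` module. The ranges shown are reserved documentation networks used purely as placeholders, not real crawler addresses, and `is_blocked` is a hypothetical helper name. Any real list would need to come from the crawler operator's own published ranges, and the shared-cloud-address caveat above still applies.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), NOT real crawler addresses.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(client_ip, blocked=BLOCKED_RANGES):
    """Return True if the client IP falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    # Membership tests across IPv4/IPv6 simply evaluate to False.
    return any(addr in net for net in blocked)


if __name__ == "__main__":
    for ip in ("192.0.2.10", "198.51.100.1", "2001:db8::1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```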