• beeng
    5 months ago

    Saw this, and a reply from Perplexity on their blog essentially said “because the user asked us to find the information, we do it on behalf of the user, and therefore robots.txt doesn’t apply”.

    It is different to how Google crawls and makes a database of info, but… Not sure how I feel. It’s a greenfield out there.

      • beeng
        5 months ago

        “Themselves” define that. Can I use Python requests?

        • MTK@lemmy.world
          5 months ago

          No, the point of it is only live interactive browsing.

          The closest thing would be Lynx; anything less than that should respect robots.txt.

          Of course, as a single user you don’t really have an impact, and no one cares if you decide to ignore it, but once you are talking about automated systems…
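
For anyone wondering what "respecting robots.txt" could look like from a script (e.g. before calling Python requests), here is a minimal sketch using the standard library's `urllib.robotparser`. The robots.txt content, the `my-script` user agent, the `example.com` URLs, and the `is_allowed` helper are all made up for illustration; a real script would fetch the site's actual robots.txt first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, hard-coded so the sketch runs offline.
# A real script would download it from the target site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/public/page",
                "https://example.com/private/data"):
        verdict = "fetch" if is_allowed(ROBOTS_TXT, "my-script", url) else "skip"
        print(url, "->", verdict)
```

An automated script would run a check like this before every request and skip disallowed URLs. Whether a one-off, user-triggered fetch also needs it is exactly the grey area being argued about above.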