Developer Creates Infinite Maze That Traps AI Training Bots

kororon@lemmy.cafe · 1 day ago

daniskarma@lemmy.dbzer0.com · edit-2 8 hours ago

Yeah, that has like zero chance of working. At most it would annoy web search bots, though at least it has a proper robots.txt.

But any agent trying to process data for AI is not going to go to random websites. It’s going to use a curated list of sites with valuable content.

At this point, text generation datasets can be built from open data and from data sold by companies like Reddit or Microsoft. They don’t need to “pirate” your blog posts.

ShortFuse@lemmy.world · 4 hours ago

scrape.maxDepth = 5
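
To spell that out: a crawler that caps link depth stops following an infinite maze after a fixed number of hops. A rough Python sketch of the idea, not any real scraper's config. `crawl`, `maze_links`, and the `maze.example` URL are made up for illustration, and the "maze" is simulated in memory rather than fetched over the network.

```python
from collections import deque
from typing import Callable, Iterable


def crawl(start: str,
          get_links: Callable[[str], Iterable[str]],
          max_depth: int = 5) -> set[str]:
    """Breadth-first crawl that never follows links more than max_depth hops from start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # the page itself is recorded, but its links are not followed
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen


def maze_links(url: str) -> list[str]:
    """Hypothetical infinite maze: every page links to two brand-new pages."""
    return [url + "/a", url + "/b"]


pages = crawl("https://maze.example", maze_links, max_depth=5)
print(len(pages))  # 63 pages (1 + 2 + 4 + 8 + 16 + 32), then the crawl stops
```

So the maze costs the bot a bounded number of requests per entry point instead of trapping it forever.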

brb@sh.itjust.works · 6 hours ago

What’s stopping the sites with valuable content from using this?

nucleative@lemmy.world · 6 hours ago

I think sites that feel they have valuable content can deploy this and hope to trap, and perhaps detect, those bots based on how they interact with the tarpit.
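
One way the detection part could look, as a minimal sketch: count how many tarpit pages each client requests and flag the ones that go deep. Everything here is an assumption for illustration, including the `/maze/` prefix, the threshold of 20, and the `(client, path)` log format. It is not how any particular tarpit tool works.

```python
from collections import Counter
from typing import Iterable

TARPIT_PREFIX = "/maze/"  # hypothetical URL prefix where the tarpit lives
FLAG_THRESHOLD = 20       # hypothetical: humans rarely click this deep into junk pages


def flag_suspected_bots(requests: Iterable[tuple[str, str]],
                        prefix: str = TARPIT_PREFIX,
                        threshold: int = FLAG_THRESHOLD) -> set[str]:
    """Return the clients that requested at least `threshold` tarpit pages.

    `requests` is an iterable of (client_id, path) pairs, e.g. parsed from an access log.
    """
    hits = Counter(client for client, path in requests if path.startswith(prefix))
    return {client for client, count in hits.items() if count >= threshold}


# Made-up log: one client wanders 25 pages into the maze, another reads a normal post.
log = [("203.0.113.7", f"/maze/page{i}") for i in range(25)]
log.append(("198.51.100.2", "/blog/hello"))
print(flag_suspected_bots(log))  # {'203.0.113.7'}
```

A real deployment would probably also want to look at request rate and whether the client ignored robots.txt, but the per-client hit count is the simplest signal.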