this post was submitted on 18 Jun 2025
863 points (98.8% liked)

[–] lazynooblet@lazysoci.al 24 points 20 hours ago (1 children)

How is blocking scrapers easy?

This instance sees 500+ IPs with differing user agents all connecting at once, and each one stays within the rate limits because the load is spread across the bots.

The only way I know it's a scraper is if they do something dumb, like using "google.com" as the referrer for every request, or if I eyeball the logs and notice multiple entries from the same /12.
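For anyone who wants to automate the eyeballing, here is a rough sketch of those two heuristics in Python. It assumes IPv4 addresses and that the logs have already been parsed into `(ip, referrer)` pairs. The function names and the thresholds (`min_distinct`, `min_requests`) are made up for illustration and are not taken from any real tool.

```python
import ipaddress
from collections import defaultdict


def suspicious_blocks(ips, prefix_len=12, min_distinct=3):
    """Group IPv4 addresses by their enclosing network (default /12) and
    return the networks that contain at least `min_distinct` distinct IPs.

    The threshold is an arbitrary assumption; tune it to your traffic.
    """
    groups = defaultdict(set)
    for ip in ips:
        # strict=False masks off the host bits, e.g. 10.17.5.9/12 -> 10.16.0.0/12
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[net].add(ip)
    return {
        str(net): sorted(members)
        for net, members in groups.items()
        if len(members) >= min_distinct
    }


def single_referrer_ips(entries, referrer="google.com", min_requests=2):
    """Return the IPs whose every request carried the same given referrer.

    `entries` is an iterable of (ip, referrer) pairs. IPs with fewer than
    `min_requests` requests are ignored to avoid flagging one-off visitors.
    """
    referrers = defaultdict(set)
    counts = defaultdict(int)
    for ip, ref in entries:
        referrers[ip].add(ref)
        counts[ip] += 1
    return {
        ip
        for ip, refs in referrers.items()
        if refs == {referrer} and counts[ip] >= min_requests
    }


if __name__ == "__main__":
    # Made-up sample data, purely for demonstration.
    sample_ips = ["10.16.0.1", "10.17.5.9", "10.31.255.3", "192.0.2.7"]
    print(suspicious_blocks(sample_ips))

    sample_entries = [
        ("10.16.0.1", "google.com"),
        ("10.16.0.1", "google.com"),
        ("192.0.2.7", "google.com"),
        ("192.0.2.7", "example.org"),
    ]
    print(single_referrer_ips(sample_entries))
```

This only surfaces candidates for a human to look at. A /12 covers about a million addresses, so a busy residential ISP can trip the first check just as easily as a bot farm.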

[–] rumba@lemmy.zip 7 points 17 hours ago

Exactly this. You can only stop scrapers that play by the rules.

Each one of those books powering GPT already had, like, protection on it.