this post was submitted on 17 Jul 2023
Searching the whole Fediverse, literally all of it, 100%, is technically impossible or at least very hard to implement, and if implemented, it'd eat up lots of CPU power and network bandwidth.
It's simply next to impossible for any instance of any Fediverse project, or for any centralised or decentralised dedicated search engine, to know all instances and all content on them unless every instance actively pushes its existence, its status and all its content to the search engine in real time.
A search engine that literally covers all of the Fediverse with no exception has to even know about brand-new instances that have just been started a split-second ago. An instance that's so new doesn't even have any connections into the Fediverse yet, probably no content and only one account, the admin account. (Replace "account" with "channel" on Hubzilla and (streams).)
So if someone spins up a new instance of whatever project, that search feature has to know about that instance immediately, before the instance even connects with anything. That said, I'm not sure when that search feature is expected to know about a new Hubzilla hub, since ActivityPub is optional per hub and per channel and AFAIK off by default for both: Should the search feature already know when ActivityPub is still off and nothing in the Fediverse that isn't Hubzilla or (streams) can connect to the hub anyway, or should it only learn about the instance the second the hub admin turns ActivityPub on?
And when the admin of a new instance puts out a test post to see if it runs as desired, and the instance still isn't connected to any other instance, the search feature would have to know that test post immediately so you can find it if that's what you're looking for.
Mind you, Google doesn't know everything on the Internet either.
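To make the discovery problem concrete, here's a minimal sketch of how a crawler might find instances by following peer lists, assuming it starts from a known seed. The host names and the `PEERS` table are made up for illustration; a real crawler would fetch peer lists over the network, which this sketch doesn't attempt.

```python
from collections import deque


def crawl(seeds, peers):
    """Breadth-first discovery: return every instance reachable from the seeds."""
    known = set(seeds)
    queue = deque(seeds)
    while queue:
        host = queue.popleft()
        for peer in peers.get(host, ()):
            if peer not in known:
                known.add(peer)
                queue.append(peer)
    return known


# Hypothetical peer lists: each host maps to the hosts it is connected to.
PEERS = {
    "a.example": ["b.example"],
    "b.example": ["a.example", "c.example"],
    "c.example": ["b.example"],
    "brand-new.example": [],  # just started, no connections in either direction
}

found = crawl(["a.example"], PEERS)
missed = set(PEERS) - found  # instances the crawler can never reach
```

Here `missed` ends up containing only `brand-new.example`: nothing links to it, so no amount of crawling finds it. The only way in is for the instance to announce itself.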
Yes, but who would want a search engine to specifically cover empty servers with a lifetime of half a nanosecond? For all practical intents and purposes, people search for content, which already excludes these theoretical edge cases. More realistically, people will search for quality content, which implies that some engagement happened and some upvotes accumulated. There is no value in discovering servers before users have discovered them; on the contrary.
If you really care about new and empty servers, you're looking for a Fediverse monitoring tool rather than a search engine. And even for those, it's questionable what the value of those entries would be. I would prefer them to be filtered out so they don't bloat the numbers.
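As a rough sketch of that kind of filtering, assuming the tool already has per-instance user and post counts: the `InstanceStats` fields and the thresholds below are hypothetical, not taken from any existing stats site.

```python
from dataclasses import dataclass


@dataclass
class InstanceStats:
    host: str
    users: int
    posts: int


def worth_indexing(stats, min_users=2, min_posts=1):
    """Keep an instance only if it has more than an admin account and some content."""
    return stats.users >= min_users and stats.posts >= min_posts


def filter_instances(all_stats, **thresholds):
    """Return the hosts that pass the (assumed) activity thresholds."""
    return [s.host for s in all_stats if worth_indexing(s, **thresholds)]


# Made-up sample data for illustration.
SAMPLE = [
    InstanceStats("busy.example", users=1200, posts=54000),
    InstanceStats("brand-new.example", users=1, posts=0),
    InstanceStats("test-only.example", users=1, posts=1),
]

indexed = filter_instances(SAMPLE)
```

With these defaults only `busy.example` survives; the admin-only instance and the one with a single test post are dropped.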
Sure, this is technically true, but it doesn't really address the human need to find things. It would be better if some group of Fediverse instances came together under a common banner and agreed on certain protocols that made things like mass-indexing easier. This would enable a better frontend experience for people trying to find good content. In fact, I think building more protocols on top of the existing one would be exactly in line with the philosophical underpinnings of the Fediverse.