this post was submitted on 22 Aug 2023
784 points (95.8% liked)

Fediverse

17734 readers
42 users here now

A community dedicated to fediverse news and discussion.

Fediverse is a portmanteau of "federation" and "universe".

Getting started on Fediverse;

founded 5 years ago
MODERATORS
 

It is probably due to a number of people stopping using their alts after some instance hopping.

Also a few people who came to see how it was, and weren't attracted enough to become regular visitors.

Curious to see at which number we'll stabilize.

Next peak will probably happen after either major features release (e.g. exhaustive mod tools allowing reluctant communities to move from Reddit) or the next Reddit fuck up (e.g. removing old.reddit)

Stats on each server: https://lemmy.fediverse.observer/list

you are viewing a single comment's thread
view the rest of the comments
[–] dingus@lemmy.ml 1 points 1 year ago* (last edited 1 year ago) (4 children)

That's a much more... coherent explanation than your original one, friend. I wouldn't have argued this point if you had started here.

[–] RoundSparrow@lemmy.ml 2 points 1 year ago* (last edited 1 year ago) (3 children)

If anyone bothered to actually look at the SQL SELECT that Lemmy uses to list posts every time you hit refresh it would be blindingly obvious how convoluted it is. yet the community does not talk about the programming issues and instead keeps raising money for 64 core hardware upgrades without recognizing just how tiny Lemmy's database really is and how 57K users is not a large number at all!

your original one, friend. I wouldn’t have argued this point if you had started here.

I mentioned "ORM" right in my first comment.

SELECT 
   "post"."id" AS post_id, "post"."name" AS post_title,
   -- "post"."url", "post"."body", "post"."creator_id", "post"."community_id", "post"."removed", "post"."locked", "post"."published", "post"."updated", "post"."deleted", "post"."nsfw", "post"."embed_title", "post"."embed_description", "post"."thumbnail_url",
   -- "post"."ap_id", "post"."local", "post"."embed_video_url", "post"."language_id", "post"."featured_community", "post"."featured_local",
     "person"."id" AS p_id, "person"."name",
     -- "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated",
     -- "person"."actor_id", "person"."bio", "person"."local", "person"."private_key", "person"."public_key", "person"."last_refreshed_at", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin",
     -- "person"."bot_account", "person"."ban_expires",
     "person"."instance_id" AS p_inst,
   "community"."id" AS c_id, "community"."name" AS community_name,
   -- "community"."title", "community"."description", "community"."removed", "community"."published", "community"."updated", "community"."deleted",
   -- "community"."nsfw", "community"."actor_id", "community"."local", "community"."private_key", "community"."public_key", "community"."last_refreshed_at", "community"."icon", "community"."banner",
   -- "community"."followers_url", "community"."inbox_url", "community"."shared_inbox_url", "community"."hidden", "community"."posting_restricted_to_mods",
   "community"."instance_id" AS c_inst,
   -- "community"."moderators_url", "community"."featured_url",
     ("community_person_ban"."id" IS NOT NULL) AS ban,
   -- "post_aggregates"."id", "post_aggregates"."post_id", "post_aggregates"."comments", "post_aggregates"."score", "post_aggregates"."upvotes", "post_aggregates"."downvotes", "post_aggregates"."published",
   -- "post_aggregates"."newest_comment_time_necro", "post_aggregates"."newest_comment_time", "post_aggregates"."featured_community", "post_aggregates"."featured_local",
   --"post_aggregates"."hot_rank", "post_aggregates"."hot_rank_active", "post_aggregates"."community_id", "post_aggregates"."creator_id", "post_aggregates"."controversy_rank",
   --  "community_follower"."pending",
   ("post_saved"."id" IS NOT NULL) AS save,
   ("post_read"."id" IS NOT NULL) AS read,
   ("person_block"."id" IS NOT NULL) as block,
   "post_like"."score",
   coalesce(("post_aggregates"."comments" - "person_post_aggregates"."read_comments"), "post_aggregates"."comments") AS unread

FROM (
   ((((((((((
   (
	   (
	   "post_aggregates" 
	   INNER JOIN "person" ON ("post_aggregates"."creator_id" = "person"."id")
	   )
   INNER JOIN "community" ON ("post_aggregates"."community_id" = "community"."id")
   )
   LEFT OUTER JOIN "community_person_ban"
       ON (("post_aggregates"."community_id" = "community_person_ban"."community_id") AND ("community_person_ban"."person_id" = "post_aggregates"."creator_id"))
   )
   INNER JOIN "post" ON ("post_aggregates"."post_id" = "post"."id")
   )
   LEFT OUTER JOIN "community_follower" ON (("post_aggregates"."community_id" = "community_follower"."community_id") AND ("community_follower"."person_id" = 3))
   )
   LEFT OUTER JOIN "community_moderator" ON (("post"."community_id" = "community_moderator"."community_id") AND ("community_moderator"."person_id" = 3))
   )
   LEFT OUTER JOIN "post_saved" ON (("post_aggregates"."post_id" = "post_saved"."post_id") AND ("post_saved"."person_id" = 3))
   )
   LEFT OUTER JOIN "post_read" ON (("post_aggregates"."post_id" = "post_read"."post_id") AND ("post_read"."person_id" = 3))
   )
   LEFT OUTER JOIN "person_block" ON (("post_aggregates"."creator_id" = "person_block"."target_id") AND ("person_block"."person_id" = 3))
   )
   LEFT OUTER JOIN "post_like" ON (("post_aggregates"."post_id" = "post_like"."post_id") AND ("post_like"."person_id" = 3))
   )
   LEFT OUTER JOIN "person_post_aggregates" ON (("post_aggregates"."post_id" = "person_post_aggregates"."post_id") AND ("person_post_aggregates"."person_id" = 3))
   )
   LEFT OUTER JOIN "community_block" ON (("post_aggregates"."community_id" = "community_block"."community_id") AND ("community_block"."person_id" = 3)))
   LEFT OUTER JOIN "local_user_language" ON (("post"."language_id" = "local_user_language"."language_id") AND ("local_user_language"."local_user_id" = 3))
   )
WHERE (((((((
  ((("community"."deleted" = false) AND ("post"."deleted" = false)) AND ("community"."removed" = false))
  AND ("post"."removed" = false)) AND ("post_aggregates"."creator_id" = 3)) AND ("post"."nsfw" = false))
  AND ("community"."nsfw" = false)) AND ("local_user_language"."language_id" IS NOT NULL))
  AND ("community_block"."person_id" IS NULL))
  AND ("person_block"."person_id" IS NULL))
ORDER BY "post_aggregates"."featured_local" DESC , "post_aggregates"."published" DESC
LIMIT 10
OFFSET 0
;
[–] gonzo0815@sh.itjust.works -3 points 1 year ago (2 children)

Damn, so many joins :/

How could this monster be optimized though?

[–] RoundSparrow@lemmy.ml 7 points 1 year ago* (last edited 1 year ago) (1 children)

First optimization is to not fetch every field and prune it down. For example, it gets public key and private key for every community and user account - then does nothing with them. That's just pushing data between Rust and PostgreSQL for no reason. That kind of thing is pretty obvious.. the huge number of things listed after "SELECT".

The whole approach is what I recently described as: make a JOIN fusion implosion bomb, then wait for null columns to fall out

There are short-term and long-term solutions. Right now there is already a new feature that will add one more JOIN that is pending merge.... "instance blocking" by each single user.

Based on the server overloads and resulting crashes, I think some obvious solutions would be to remove post_aggregates table entirely and just throw more columns on the post table... I've seen people do stuff like that. But really you have to have a concept of core foundation.

To me the core foundation of Lemmy data is that people want fresh meat, when world events get into a frenzy, they want to F5 and get the LATEST post and the LATEST comments. Data should have a big wall between the most recent 5 days and everything else. It's the heart of the beast of human events and a platform like this.

From that perspective, that fresh posts and fresh comments mean everything, you can optimize by just doing a INNER SELECT before any JOIN... or partition the database TABLE into recent and non-recent, or some out-of-band steps to prepare recent data before this SELECT even comes up from an API call... and not let PostgreSQL do so much heavy lifting each page refresh.

[–] gonzo0815@sh.itjust.works -2 points 1 year ago

If I remember, I'm gonna look into that tomorrow when I'm not on a phone screen. Not that I could contribute anything, but this seems like a good opportunity to learn some advanced stuff. Thanks for your answer!