this post was submitted on 22 Aug 2023
784 points (95.8% liked)
Fediverse
17734 readers
34 users here now
A community dedicated to fediverse news and discussion.
Fediverse is a portmanteau of "federation" and "universe".
Getting started on Fediverse;
- What is the fediverse?
- Fediverse Platforms
- How to run your own community
founded 5 years ago
This is unfortunate to hear. Have you considered creating a proof-of-concept fork with synthetic data that demonstrates how much more performant a cached, filtered approach would be? I think a magnitude or two improvement of some key metrics with heavy simulated load would be quite convincing.
Of course, that would be an insane amount of work, especially if it would get ignored, but something to consider!
I already did an insane amount of work to populate a Lemmy database with over 10 million posts. It is so incredibly slow out of the box that the normal API would take days to accomplish this. I had to rewrite the SQL TRIGGER logic to allow bulk inserts.
Here is my work on that:
With this in place, 300,000 posts a minute can be generated, and reaching levels of 5 million or 10 million doesn't take too long.
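For readers unfamiliar with why per-row triggers slow down bulk loading, here is a minimal sketch of the general idea. It is not the actual change described above. It uses Python's built-in sqlite3 module instead of PostgreSQL, and the table names, the trigger name, and the `bulk_insert_posts` helper are all hypothetical. The assumption is that a counter table is maintained by a trigger firing once per inserted row, and that for a bulk load the counter can be recomputed once at the end.

```python
import sqlite3

# Hypothetical per-row trigger that keeps a post counter up to date.
TRIGGER_SQL = """
CREATE TRIGGER post_count_insert AFTER INSERT ON post
BEGIN
    UPDATE community_aggregates
    SET posts = posts + 1
    WHERE community_id = NEW.community_id;
END
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE community_aggregates (
        community_id INTEGER PRIMARY KEY,
        posts INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        community_id INTEGER NOT NULL,
        title TEXT NOT NULL
    );
    INSERT INTO community_aggregates (community_id) VALUES (1), (2);
""")
conn.execute(TRIGGER_SQL)


def bulk_insert_posts(conn, rows):
    """Insert many posts without firing the per-row trigger for each one."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TRIGGER post_count_insert")
        conn.executemany(
            "INSERT INTO post (community_id, title) VALUES (?, ?)", rows
        )
        # Recompute every counter once, instead of once per inserted row.
        conn.execute("""
            UPDATE community_aggregates
            SET posts = (
                SELECT COUNT(*) FROM post
                WHERE post.community_id = community_aggregates.community_id
            )
        """)
        conn.execute(TRIGGER_SQL)  # restore normal single-insert behaviour


# Synthetic data: 5,000 posts split evenly across two communities.
bulk_insert_posts(conn, [(1 + i % 2, f"post {i}") for i in range(5000)])
counts = dict(conn.execute("SELECT community_id, posts FROM community_aggregates"))
```

After the bulk load the trigger is back in place, so ordinary single inserts still update the counter as before.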
That's really cool work! It's a bit beyond my pay grade, so I can't really comment too much about it.
I had a look at the PR you mentioned, and again, while I can't comment on the contents because I am a little out of my depth, may I voice my opinion on the exchange? This is coming from a place of trying to help, since I really do appreciate all the work you've put in and are putting in, and the fediverse can really use your talents, so I hope I don't offend you.
From my reading, it didn't appear that you were being ignored/hazed, and it seemed like the devs would have been open to your improvements. From working and leading big teams, I've noticed that communication and managing emotions are often much harder than writing code. In the thread, it appeared that communication had broken down on both sides (and seemed to have been the case in prior interactions too). Since you mentioned your struggles with autism in the thread, I wonder if that played a part in the tone of the devs perhaps being misinterpreted? This is, of course, only my interpretation, and I could be completely wrong.
Ultimately Lemmy itself is an example of trying to build a community and consensus amongst a broad and diverse group of people, who will often not see eye to eye.
In any case I would like to say I personally appreciate your hard work and really do hope you're able to help make Lemmy better. Thank you!
Can you explain to me why it isn't social hazing?
Do you know how to read a SQL statement? I just can't grasp how it isn't social hazing. I've been reading SQL statements for decades, this is obviously a problematic one.
Can you offer alternate explanations of how 3 people could think that SQL statement isn't ... poor performing and going to cause problems? And how an SQL statement without a WHERE clause took them months to discover and fix?
Extreme hazing is my best answer. I just can't accept that the SQL statements don't speak for themselves along with the server crashes. 57K users for 1300 servers is very... taking several seconds to load 10 posts....
Look at the date... May... this has been going on since May. If it isn't social hazing ... what is it? I keep asking myself that.
Like I said, this was my interpretation based on reading that exchange. It's difficult to convey tone or intention with text, but I didn't detect hostility from the devs, but I did sense that they were frustrated that process wasn't being followed. Perhaps they should not have gotten hung up on that, but it didn't appear to be malicious.
I do, and your arguments about the joins being problematic seemed solid. From having worked on systems with huge scale, I also agree that Lemmy doesn't seem to be big enough to be brought to its knees by the volume of posts it's processing. However, I'm far from an expert, so I don't want to suggest any certainty about the root causes, especially as I don't have the energy or inclination to dig as deep into it as I would to form that opinion.
I don't know why they weren't receptive, but perhaps they themselves felt attacked. I know that wasn't your intention, but misunderstandings happen, especially over text.
Here, you can dig into what I posted days before the pull request you read:
https://github.com/LemmyNet/lemmy/issues/2877#issuecomment-1685314733
June 4:
More than 8 JOIN statements is something PostgreSQL specifically concerns itself with (join_collapse_limit). When I hand-edit the query to use a single IN clause, the performance problem disappears. 8 full seconds becomes less than 200ms against 5,431,043 posts. And that 200ms is still high, as I was extremely over-reaching with "LIMIT 1000" in case the end-user went wild with blocking lists or some other filtering before reaching the final "LIMIT 10". When I change it to "LIMIT 20" in the subquery, it drops almost in half to 115ms... still meeting the needs of the outer "LIMIT 10" by double. More of the core query filtering can be put into the IN subquery, as we aren't dealing with pages longer than 500 (currently limited to 50).
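As a rough illustration of the rewrite described in that comment, the sketch below narrows the candidate rows with an IN subquery that has its own LIMIT, and only then applies the JOINs and the outer page LIMIT. It is an assumption-laden toy, not Lemmy's real query: it runs on Python's built-in sqlite3 module instead of PostgreSQL, the three-table schema is hypothetical, and it says nothing about the timings quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, heavily simplified schema (not Lemmy's).
    CREATE TABLE community (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        community_id INTEGER REFERENCES community(id),
        creator_id INTEGER REFERENCES person(id),
        title TEXT,
        published INTEGER
    );
    CREATE INDEX idx_post_published ON post (published DESC);
""")
conn.execute("INSERT INTO community VALUES (1, 'fediverse')")
conn.execute("INSERT INTO person VALUES (1, 'alice')")
conn.executemany(
    "INSERT INTO post (community_id, creator_id, title, published) "
    "VALUES (1, 1, ?, ?)",
    [(f"post {i}", i) for i in range(1000)],
)

PAGE_SIZE = 10   # the outer "LIMIT 10" from the comment
CANDIDATES = 20  # the inner "LIMIT 20": headroom for later filtering

rows = conn.execute(
    """
    SELECT post.id, post.title, community.name, person.name
    FROM post
    JOIN community ON community.id = post.community_id
    JOIN person ON person.id = post.creator_id
    WHERE post.id IN (
        -- Pick a small set of candidate ids first, using only the post table.
        SELECT id FROM post ORDER BY published DESC LIMIT ?
    )
    ORDER BY post.published DESC
    LIMIT ?
    """,
    (CANDIDATES, PAGE_SIZE),
).fetchall()
```

The point of the design is that the JOINs only ever see the handful of rows selected by the subquery, so the inner LIMIT just needs enough headroom to survive whatever filtering the outer query still applies.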
If it isn't social hazing, then what is going on here? Why has this issue gone on since May and servers are crashing every day?
Funny, because I'm a published author and expert on messaging systems... like Lemmy. I've been building them professionally since 1986.
There was a massive thread I posted dozens of comments on that came before today's pull request... I suggest you read that too.
Did you notice them even acknowledge server crashes are happening? Do you think developers ever suggest Memcache or Redis? Or discuss how Reddit solved their scaling in 2010 with PostgreSQL?
I don't have any trouble understanding a bad SQL statement that has 14 JOINs and being told "JOIN is a distraction" after posting tons of examples.
Do we really need to be spoon-fed the stuff I did post?
Have you never seen social hazing in action? Is it possible that I might be on to something going on psychologically besides my autism?
I can't believe anyone thinks a server should be crashing with 1 user on it.
Okay, I can't speak to whether social hazing happened or not, but I can tell you that you're making me extremely uncomfortable.
I started a dialogue, but at this point you're now sending multiple messages for each of my replies, and asking a lot from me in terms of attention. I do not wish to continue this conversation, but I wish you all the best.
Welcome to discussions with RoundSparow!
It can be a bit tiring interaction-wise, but you can usually learn a lot.
Haha, indeed. Any time I see an open-source discussion (especially a heated one), I'm reminded about just how much effort it takes to contribute. I'm happy to just stick to browsing memes :P
Who would have predicted that Elon Musk would do all the wild things he did with Twitter? Reddit pissing everyone off in June... pretty odd how audiences are behaving in 2023 towards all this. Oh yeah, Threads, that's coming on the scene too. 2023 has really been odd for audiences.
The SQL speaks for itself, but I don't know what's going on in terms of why people are treating social media platforms like Lemmy, Twitter, Threads, Reddit this year so unusually. This SQL statement kind of thing has been covered in so many books, conferences, etc. It's like forgotten history now in the era of Elon Musk X and Reddit Apollo times.
I don't know what to say other than I can try to hire a translator or teacher to explain how this SQL problem was obvious and well understood 13 years ago. I mean, there was a whole "NoSQL movement" because of this kind of thing. But I clearly can't get people to hear past all the Elon Musk, Threads, Lemmy from Reddit ... and I'm left describing it as 'social hazing' or whatever is going on with social media.
Lemmy has like 5 different Rust programming communities, but nobody is fixing Lemmy. It's surreal in 2023, the Elon Musk X days. I think it's making all of us uncomfortable. The social movement underway.