Hello there!
It has been a while since our last update, but it’s about time to address the elephant in the room: downtimes. Lemmy.World has been having multiple downtimes a day for quite a while now. And we want to take the time to address some of the concerns and misconceptions that have been spread in chatrooms, memes and various comments in Lemmy communities.
So let’s go over some of these misconceptions together.
“Lemmy.World is too big and that is bad for the fediverse”.
One thing is true: we are the biggest Lemmy instance. But we are far from the biggest in the Fediverse. If you want actual numbers you can have a look here: https://fedidb.org/network
The entire Lemmy fediverse is still in its infancy, and even though we don’t like to compare ourselves to Reddit, it gives you something comparable. The total number of Lemmy users on all instances combined is currently 444,876, which is still nothing compared to a medium-sized subreddit. There are some points that can be made that it is better to spread the load of users and communities across other instances, but let us make it clear that this is not a technical problem.
And even in a decentralised system, there will always be bigger and smaller blocks within; such would be the nature of any platform looking to be shaped by its members.
“Lemmy.World should close down registrations”
Lemmy.World is being linked in a number of Reddit subreddits and in Lemmy apps. Imagine if new users land here and they have no way to sign up. We have to assume that most new users have no information on how the Fediverse works and making them read a full page of what’s what would scare a lot of those people off. They probably wouldn’t even take the time to read why registrations would be closed, move on and not join the Fediverse at all. What we want to do, however, is inform the users before they sign up, without closing registrations. The option is already built into Lemmy but only available on Lemmy.ml - so a ticket was created with the development team to make these available to other instance Admins. Here is the post on Lemmy Github.
Which brings us to the third point:
“Lemmy.World cannot handle the load, that’s why the server is down all the time”
This is simply not true. There are no financial obstacles to upgrading the hardware, should that be required; but that is not the solution to this problem.
The problem is that for a couple of hours every day we are under a DDoS attack. It’s a never-ending game of whack-a-mole where we close one attack vector and they’ll start using another one. Without going into too much detail and exposing too much, there are some very ‘expensive’ SQL queries in Lemmy - actions or features that take seconds instead of milliseconds to execute. By executing them thousands of times a minute you can overload the database server.
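To put rough numbers on why slow queries matter more than raw request volume, here is a small back-of-the-envelope sketch using Little's law (average work in flight = arrival rate × time per request). The request rate and query durations below are hypothetical illustrations, not measurements from Lemmy.World.

```python
def concurrent_queries(requests_per_minute: float, seconds_per_query: float) -> float:
    """Average number of queries in flight at once (Little's law)."""
    arrivals_per_second = requests_per_minute / 60.0
    return arrivals_per_second * seconds_per_query


# Hypothetical figures: the same request rate, two very different query costs.
cheap = concurrent_queries(requests_per_minute=1000, seconds_per_query=0.005)
expensive = concurrent_queries(requests_per_minute=1000, seconds_per_query=3.0)

print(f"5 ms queries: {cheap:.2f} in flight on average")
print(f"3 s queries:  {expensive:.1f} in flight on average")
```

With these assumed numbers, the same thousand requests a minute go from occupying a fraction of one database connection to keeping roughly fifty queries running at the same time, which is why a more targeted attack does not need much traffic.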
So who is attacking us? One thing that is clear is that those responsible for these attacks know the ins and outs of Lemmy. They know which database requests are the most taxing and they are always quick to find another as soon as we close one off. That’s one of the only things we know for sure about our attackers. Being the biggest instance and having defederated with a couple of instances has made us a target.
“Why do they need another sysop who works for free”
Everyone involved with LW works as a volunteer. The money that is donated goes to operational costs only - so hardware and infrastructure. And while we understand that working as a volunteer is not for everyone, nobody is forcing anyone to do anything. As a volunteer you decide how much of your free time you are willing to spend on this project, a service that is also being provided for free.
We will leave this thread pinned locally for a while and we will try to reply to genuine questions or concerns as soon as we can.
There are quite a few InfoSec people here. While I have never held an official InfoSec job, I do have a degree. However, it is debatable whether my degree actually educated me as intended.
Point being, there are a lot of people who have more knowledge and experience than me, but I want to learn. I am always listening to security podcasts like Hacking Humans, Darknet Diaries, and naked hacking, or even InfoSec journalism around popular ongoing issues in the world like Click Here. I always want to learn and get experience.
I currently work in IT for a hospital. Is there any way to help with this kind of thing to learn and build on knowledge to help? To volunteer time to potentially see what is going on?
If you were a bad actor, this is exactly the argument to use to get more inside information to use in the next attack.
Establishing trust is the first problem to be overcome.
So there should be a test, as there is no proper way for most people to prove they aren’t a bad actor. That is the unfortunate bit. I know I am not a bad actor and would genuinely like to help. Insider threat is a real issue and I can understand the lack of trust, but how would I prove I can be trusted?
A resume? Work experience? All of those could mean nothing if you intend to harm the system anyways.
I would personally like to devote time to learning this kind of thing to assist.
So what’s going on is that the adversaries are continuously hitting the lemmy.world server. On its own, a DDoS like that would be manageable - they’re much more defeatable these days.
But they found request paths that run expensive DB functions, giving them enough bang for their buck to make an impact, even with the server tucked behind Cloudflare.
As for mitigation, Cloudflare and a larger server help, but ultimately Lemmy needs some refactoring - right now it’s very liberal with the database calls. It needs to divide those up and get more granular with API calls, look at what can be optimized on the DB side, maybe do some caching/memoization… Basically, it needs to become a more mature piece of software in a hurry.
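To make the caching/memoization idea concrete, here is a minimal sketch of a time-limited cache sitting in front of an expensive query. It is written in Python for illustration only and is not how Lemmy implements anything; `TTLCache`, `slow_front_page_query` and the five second lifetime are all hypothetical names and values.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Remembers computed values for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive call
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value


# Stand-in for an expensive database query (hypothetical).
calls = {"count": 0}


def slow_front_page_query() -> list:
    calls["count"] += 1
    return ["post-1", "post-2"]


cache = TTLCache(ttl_seconds=5.0)
for _ in range(1000):
    cache.get_or_compute("front_page", slow_front_page_query)

print("database hits for 1000 requests:", calls["count"])
```

The trade-off is staleness: readers may see results up to five seconds old, which is usually fine for a listing page but not for something like a permissions check.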
Going further, there are things like horizontal scaling - there are even thoughts about how we could leverage the nature of the fediverse to share the load through federation.
I’m a dev, I don’t know much about administration so I’m not sure how you could help, but there’s plenty of work to go around. I think a database expert would be the most useful right now.
There’s messing with configs to tune everything for better performance - that’s outside my expertise, but I’m under the impression that there are some significant gains to be had there.
If it’s in your wheelhouse, you could look at different technologies that might give better performance - the current stack seems like it was chosen mostly with ease of development in mind. If you could make a strong argument for changing some of it out, it might get traction.
As far as cyber security in general, if you want to get started - step 1 is basically locking things down, and then setting up monitoring tools and getting experience with them. Basically reading logs taken to the next level. I’m pretty sure they have that handled here, but this problem will never go away.
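As a starting point for the "reading logs" part, here is a small Python sketch that counts requests per client and path and flags the busiest pairs. The log format (timestamp, client address, path separated by spaces), the paths and the threshold are all assumptions made up for this example; real access logs and real monitoring tools look different.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def noisy_clients(log_lines: Iterable[str], threshold: int) -> List[Tuple[str, str, int]]:
    """Return (client, path, hits) for every pair seen at least `threshold` times."""
    counts: Counter = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip lines that do not match the assumed format
        client, path = parts[1], parts[2]
        counts[(client, path)] += 1
    return [
        (client, path, hits)
        for (client, path), hits in counts.most_common()
        if hits >= threshold
    ]


# Made-up sample lines using documentation-only IP ranges.
sample = [
    "2023-07-01T12:00:00Z 203.0.113.7 /api/search",
    "2023-07-01T12:00:00Z 203.0.113.7 /api/search",
    "2023-07-01T12:00:01Z 198.51.100.2 /api/post",
    "2023-07-01T12:00:01Z 203.0.113.7 /api/search",
    "malformed line",
]

for client, path, hits in noisy_clients(sample, threshold=3):
    print(f"{client} requested {path} {hits} times")
```

In practice you would run something like this over a sliding time window and feed the results into alerts, but even a crude count is enough to see which request paths are being hammered.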