Hey all. Not sure if this is the right place to post this, please point me in the right direction if not:
So I only came here because of the exodus from reddit, but I’m pumped to see this community and all this technology people have been making. It’s like a return to the old-school, user-operated internet instead of the big awful silos that have been dominating the landscape since the early 2000s. I’m in.
So quick question, are there plans or projects in the works for distributed hosting (making it easier for the users to take up the load of storing and hosting content so the instance operators aren’t stuck with the hosting costs)?
I ask because I’d like to work on a project to implement this, as I feel it’d be a massive further step forward. I’m not sure, though, whether there’s an existing project I should be getting up to speed on, or whether I should be thinking in terms of starting my own.
I have been thinking about this a bit. Right now there is not really a way to spread the load out like you mentioned. Anyone can make another instance, but it doesn’t really take any of the stress off the existing instances. I think it might even add to it, although not as much as adding a bunch of new users would. It would be beneficial to be able to contribute compute power to an instance, but I don’t think that is a realistic goal with the way Lemmy is set up.
This is inaccurate. If you run your own instance and have 20 users, that’s 20 users that aren’t hitting the main instance. One copy of the content is transmitted from the primary instance to your instance, and those 20 users then hit your instance. So instead of serving 20 people, the main instance is serving one copy of the content. That is a 20-fold savings in bandwidth, CPU, and RAM. The only thing that isn’t saved is disk capacity, since the origin server still needs to serve all the content on demand.
Now the 1-2 user instances, yes there’s not much savings there. But once you get to 5-10 it’s already a better deal.
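To put rough numbers on the argument above, here is a minimal sketch of the arithmetic. It assumes the simplified model described in the post: the origin sends one copy of each piece of content to every federated instance, and each instance then serves its own local users. The function names are made up for illustration, and the model ignores real-world details such as caching, media, and uneven user activity.

```python
import math


def origin_deliveries(readers: int, readers_per_instance: int) -> int:
    """Copies the origin must send if readers sit behind federated instances.

    Simplified model: one copy per remote instance, regardless of how many
    local users that instance has.
    """
    if readers < 0 or readers_per_instance < 1:
        raise ValueError("readers must be >= 0 and readers_per_instance >= 1")
    return math.ceil(readers / readers_per_instance)


def savings_factor(readers: int, readers_per_instance: int) -> float:
    """How many times fewer deliveries the origin makes versus serving every reader directly."""
    deliveries = origin_deliveries(readers, readers_per_instance)
    if deliveries == 0:
        return 1.0  # nobody to serve, so nothing is saved
    return readers / deliveries


if __name__ == "__main__":
    # Compare instance sizes mentioned in the thread: 1-2, 5-10, and 20 users.
    for size in (1, 2, 5, 10, 20):
        print(f"{size:>2} users per instance -> {savings_factor(100, size):.0f}x fewer origin deliveries")
```

Under this model the savings factor is simply the number of users per instance, which is why a 1-2 user instance barely helps while a 20-user instance cuts the origin's delivery load roughly 20-fold.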
My wording was poor. I meant that currently there is no way to contribute to reducing stress on an instance. Making your own instance might help prevent the problem from getting worse, but it is not the same as adding more CPU power or RAM to an instance. If an instance is maxing out its CPU, currently there is no way to allow other people to help disperse the current load.
On a slightly tangential point, I am not sure how sustainable it is to increase the number of possible users by increasing the number of instances. It is already a frustrating process finding the right instance to join. So imagine when there is 1 instance for every 100 users. With 100k users that is 1000 different instances to sort through. I think there need to be better ways to scale Lemmy, especially the amount of processing power it requires. Lemmy.ml will only be able to scale so big on a single VPS instance, or even a physical server.
Why would you sort through instances? The communities you want to interact with are still on the big instances… Just let the federation do the talking rather than communicating directly with the instance.
I see what you mean with the other point though. In that case people need to step off the lemmy.ml instance and move somewhere else to lighten the current load.
Based on figures I’ve seen from other instances, though, it doesn’t take all that much CPU/RAM to handle a metric boatload of users. The issue seems to be Postgres tuning (which could be storage latency/bandwidth) and storage space.
Right, any way you slice it, if you have a Reddit-scale operation where the content is served entirely by the instances, then the people who run the instances are paying a Reddit-scale hosting bill in aggregate. I saw one estimate that Reddit paid about half a million dollars in hosting bills per month. You hit the nail on the head: adding a hobbyist who’s running their own instance for themselves and maybe a handful of people does nothing to reduce the load on the big instances. How many of those big instances are there going to be if Lemmy grows to Reddit size? Enough to break that half-million-dollar aggregate hosting bill into manageable pieces? Probably not. At that point you can’t do it just with hobbyists with their home machines on static IPs anymore.
Or, actually, you can, if you architect the system to make proper use of the hobbyists’ hardware. Obviously there are solutions; what I’m envisioning is a browser plugin that enables someone browsing Lemmy to pull content from the hobbyists even when talking to the big instances (basically decouple “I run an instance” from “I have to pay all the hosting costs for every byte that’s served to someone browsing on that instance” and shift some of the load onto the people who are more in a hobbyist role and aren’t paying for any kind of official hosting but can still send bytes). I have a lot more thoughts on the topic and fuller ideas about how it might be solved; I was just trying to get a sense of what the community’s thoughts on it are as well.
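To make the plugin idea a bit more concrete, here is a rough sketch of the fetch logic such a client might use: try hobbyist mirrors first, and fall back to the big instance only if no mirror returns the right bytes. This is not Lemmy's API or an existing plugin. Everything here is an assumption for illustration, including the idea that the instance hands out a SHA-256 digest for each piece of content so the client can check what a mirror sends back. It is written in Python rather than browser code just to show the logic, and the fetchers are plain callables standing in for network requests.

```python
import hashlib
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical fetcher type: takes a content ID, returns bytes or None if unavailable.
Fetcher = Callable[[str], Optional[bytes]]


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def fetch_with_fallback(
    content_id: str,
    expected_sha256: str,
    mirrors: Iterable[Fetcher],
    origin: Fetcher,
) -> Tuple[bytes, str]:
    """Return (content, source), preferring hobbyist mirrors over the origin instance.

    Assumption: expected_sha256 comes from a trusted source (e.g. the instance),
    so untrusted mirrors can serve the bytes without being able to tamper with them.
    """
    for index, mirror in enumerate(mirrors):
        try:
            data = mirror(content_id)
        except Exception:
            continue  # an unreachable or broken mirror should never block browsing
        if data is not None and sha256_hex(data) == expected_sha256:
            return data, f"mirror[{index}]"

    # No mirror had a valid copy, so the origin instance pays for this request.
    data = origin(content_id)
    if data is None or sha256_hex(data) != expected_sha256:
        raise ValueError(f"no valid copy of {content_id!r} available")
    return data, "origin"
```

The point of the digest check is that the instance only has to serve a small hash while the bulk bytes come from whoever happens to have a copy, which is one way to decouple running an instance from paying for every byte served.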
I fleshed out one proposal for a solution which I’m planning to start working on.