Basically, I have read several statements addressing this topic. For example:

“If my server gets too big I will just close registrations”

“Server X got too big, so they closed registrations to manage the load”

I understand that this can help small servers that don’t have many external users, but how does it help big, popular servers? Don’t they still have to spend their own resources serving requests from external users? For example, I might self-host a server just for my account but read all my content from lemmy.world. Am I not using their bandwidth and resources anyway?

Bonus question: Does federating with other servers increase the resource usage of my server? What kind of metadata/data do I have to store from each server I federate with?

Thanks!

  • PriorProject@lemmy.world
    1 year ago

    A Lemmy server primarily does two kinds of work:

    • Serve browse traffic: This is the part you’re familiar with. When you view your post feed or a single post, the server has to fetch those posts or comments from its database and send them to you. The resources required depend on the total number of browse requests the server handles, roughly num_users * num_feed_refreshes_or_post_views_per_user_per_minute. If a server has a lot of users who view a lot of stuff, splitting some of them off to a second server (or just stopping signups) will help.
    • Federated replication: This is what copies posts and comments from the server that hosts the community to the server that hosts your account, and it is what lets your account’s server bear the browse load for communities hosted elsewhere. The resources required are roughly proportional to the total number of federation messages sent, or number_of_federated_peer_servers * number_of_subscribed_communities_per_server * number_of_posts_comments_votes_edits_etc_per_community.
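As a rough sketch of the two estimates above, here is some back-of-the-envelope Python. The function names and every number are hypothetical and picked only to show how each term scales. Nothing here is measured from a real Lemmy instance or taken from Lemmy's code.

```python
def browse_load(num_users: int, views_per_user_per_minute: float) -> float:
    """Rough browse requests per minute handled by one instance."""
    return num_users * views_per_user_per_minute


def federation_load(num_peer_servers: int,
                    subscribed_communities_per_server: float,
                    activities_per_community: float) -> float:
    """Rough count of federation messages (posts, comments, votes, edits)."""
    return (num_peer_servers
            * subscribed_communities_per_server
            * activities_per_community)


# Hypothetical inputs: halving the user count halves the browse load.
before = browse_load(num_users=10_000, views_per_user_per_minute=0.5)
after = browse_load(num_users=5_000, views_per_user_per_minute=0.5)
print(before, after)  # 5000.0 2500.0

# Hypothetical inputs: ten times as many peer servers means ten times
# as many federation messages, even with no change in local users.
few_peers = federation_load(100, 20, 50)
many_peers = federation_load(1_000, 20, 50)
print(few_peers, many_peers)  # 100000 1000000
```

Closing registrations only touches the first function's num_users term. The second function has no user count in it at all, which is why a popular instance still pays for replication to every peer that subscribes to its communities.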

    What you may notice here is that federated replication workload scales with the number of instances in the threadiverse, while browse workload scales with the number of users per instance. This leads to a Goldilocks problem: ideally, you want a medium number of servers that each have a medium number of subscribers. Obviously no real-world network scales in this ideal way, but some guidelines emerge:

    • Single-user instances are probably only a net win if the user is very active. If you read every post your instance subscribes to, then maybe your browse load is bigger than your instance’s federation load… but if you log in once a month and view 1% of the posts replicated to your instance, it’s still generating federation workload while you’re asleep, for posts you’ll never read.
    • Single-user instances using scripts that mass-subscribe to thousands of communities may make your All feed lively, but they make you a pretty terrible fediverse citizen. Your instance is now generating the federation load of a 5k-user instance to copy posts and comments you’ll never read. BTW, your instance publicly serves copies of all the posts you subscribe to. So if one of these scripts subs porn, piracy, or hate speech communities on poorly admin’ed instances, it may be creating legal liability for you depending on your jurisdiction. Also, federated replication is pretty broky right now: https://github.com/LemmyNet/lemmy/issues/3101 (this recently got marked resolved, but I continue to see replication issues daily and I expect similar but perhaps more targeted follow-ups to be filed soon).
    • Having an account on a Very Big Instance like lemmy.world or lemmy.ml is a bit of a personal risk. Those instances will always hit the limits of both browse and federation scaling first, because they have lots of active users and also lots of active communities that are widely subscribed to by other instances. This will make them a bit unreliable, as they’re at the leading edge of the efforts to fix scaling constraints.