Hey folks!

I made a short post last night explaining why image uploads had been disabled. This was in the middle of the night for me, so I did not have time to go into a lot of detail, but I’m writing a more detailed post now to clear up where we are now and where we plan to go.

What’s the problem?

As shared by the lemmy.world team, over the past few days, some people have been spamming one of their communities with CSAM images. Lemmy has been attacked in various ways before, but this is clearly on a whole new level of depravity, as it’s first and foremost an attack on actual victims of child abuse, in addition to being an attack on the users and admins on Lemmy.

What’s the solution?

I am putting together a plan, both for the short term and for the longer term, to combat and prevent such content from ever reaching lemm.ee servers.

For the immediate future, I am taking the following steps:

1) Image uploads are completely disabled for all users

This is a drastic measure, and I am aware that it’s the opposite of what many of our users have been hoping, but at the moment, we simply don’t have the necessary tools to safely handle uploaded images.

2) All images which have federated in from other instances will be deleted from our servers, without any exception

At this point, we have millions of such images, and I am planning to just indiscriminately purge all of them. Posts from other instances will not be broken after the deletion; the deleted images will simply be loaded directly from the instances that host them.

3) I will apply a small patch to the Lemmy backend running on lemm.ee to prevent images from other instances from being downloaded to our servers

Lemmy has always loaded some images directly from other servers, while saving other images locally to serve directly. I am eliminating the second option for the time being, forcing all images uploaded on external instances to always be loaded from those servers. This will somewhat increase the number of servers that users fetch images from when opening lemm.ee, which certainly has downsides, but I believe this is preferable to opening up our servers to potentially illegal content.

For the longer term, I have some further ideas:

4) Invite-based registrations

I believe that one of the best ways to effectively combat spam and malicious users is to implement an invite system on Lemmy. I have wanted to work on such a system ever since I first set up this instance, but real life and other things have been getting in the way, so I haven’t had a chance. However, with the current situation, I believe this feature is more important than ever, and I’m very hopeful I will be able to make time to work on it very soon.

My idea would be to grant our users a few invites, which would replenish every month if used. An invite will be required to sign up on lemm.ee after that point. The system will keep track of the invite hierarchy, and in extreme cases (such as spambot sign-ups), inviters may be held responsible for rule-breaking users they have invited.

While this will certainly create a barrier to entry for signing up on lemm.ee, we are already one of the biggest instances, and I think at this point, such a barrier will do more good than harm.
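
To make the invite idea a bit more concrete, here is a rough sketch in Python. It is not Lemmy code (the Lemmy backend is written in Rust), and everything in it is an assumption made for illustration: the names `User` and `InviteRegistry`, the cap of 3 invites per month (the plan above only says "a few"), and the choice to give newly invited users their own invites right away.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed value; the plan only says "a few invites".
MONTHLY_INVITE_CAP = 3


@dataclass
class User:
    name: str
    invited_by: Optional[str] = None  # None for accounts that predate invites
    invites_left: int = MONTHLY_INVITE_CAP


class InviteRegistry:
    """Hypothetical in-memory model of invites and the invite hierarchy."""

    def __init__(self) -> None:
        self.users: dict[str, User] = {}

    def add_existing_user(self, name: str) -> User:
        # Accounts created before the invite system have no inviter.
        user = User(name)
        self.users[name] = user
        return user

    def sign_up(self, name: str, inviter_name: str) -> User:
        inviter = self.users.get(inviter_name)
        if inviter is None:
            raise ValueError("unknown inviter")
        if inviter.invites_left <= 0:
            raise ValueError("inviter has no invites left")
        if name in self.users:
            raise ValueError("username already taken")
        inviter.invites_left -= 1
        user = User(name, invited_by=inviter_name)
        self.users[name] = user
        return user

    def replenish_all(self) -> None:
        # Monthly job: used invites are topped back up to the cap.
        for user in self.users.values():
            user.invites_left = MONTHLY_INVITE_CAP

    def invite_chain(self, name: str) -> list[str]:
        # Walk from the direct inviter up to the oldest known ancestor.
        chain = []
        current = self.users[name].invited_by
        while current is not None:
            chain.append(current)
            current = self.users[current].invited_by
        return chain
```

With a structure like this, `invite_chain` is what would let admins see who is responsible for a batch of spam accounts: if many banned users share the same inviter near the start of their chains, that inviter is the one to look at.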

5) Account requirements for specific activities

This is something that many admins and mods have been discussing for a while now, and I believe it would be an important feature for lemm.ee as well. Essentially, I would like to limit certain activities to users who meet specific requirements (maybe account age, number of comments, etc). These activities might include things like image uploads, community creation, perhaps even private messages.

This could in theory limit creation of new accounts just to break rules (or laws).
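
As an illustration only, a check like this could be sketched in Python as below. This is not an existing Lemmy feature or API; the activity names, the `Requirement` type, and all thresholds are hypothetical placeholders, since no concrete numbers have been decided.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Requirement:
    min_account_age: timedelta
    min_comments: int


# Hypothetical thresholds, chosen only to make the example concrete.
REQUIREMENTS = {
    "upload_image": Requirement(timedelta(days=14), 20),
    "create_community": Requirement(timedelta(days=30), 50),
    "send_private_message": Requirement(timedelta(days=3), 5),
}


def is_allowed(
    activity: str,
    created_at: datetime,
    comment_count: int,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if an account meets the requirements for an activity."""
    now = now or datetime.now(timezone.utc)
    req = REQUIREMENTS.get(activity)
    if req is None:
        return True  # activities without an entry are unrestricted
    old_enough = (now - created_at) >= req.min_account_age
    active_enough = comment_count >= req.min_comments
    return old_enough and active_enough
```

Under these made-up numbers, a 20-day-old account with 25 comments could upload images but not yet create a community.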

6) Automated ML based NSFW scanning for all uploaded images

I think it makes sense to automatically scan every image before we save it on our servers, and if an image is flagged as NSFW, then we don’t accept the upload. While machine learning is not 100% accurate and will produce false positives, I believe this is a trade-off that we simply need to accept at this point. Not only will this help against any potential CSAM, it will also help us better enforce our “no pornography” rule.

This would potentially also allow us to resume caching images from other instances, which will improve both performance and privacy on lemm.ee.
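
The "scan before saving" step could look roughly like the Python sketch below. No specific model or library has been chosen, so the classifier is passed in as a plain function; `should_accept_upload`, the `classify` parameter, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Callable

# Assumed cut-off; a real deployment would need to tune this.
NSFW_THRESHOLD = 0.5


def should_accept_upload(
    image_bytes: bytes,
    classify: Callable[[bytes], float],
    threshold: float = NSFW_THRESHOLD,
) -> bool:
    """Decide whether an image may be stored.

    `classify` is a stand-in for whatever ML model ends up being used. It is
    assumed to return an estimated probability between 0 and 1 that the
    image is NSFW.
    """
    score = classify(image_bytes)
    if not 0.0 <= score <= 1.0:
        raise ValueError("classifier score must be between 0 and 1")
    # Reject anything at or above the threshold; nothing is saved in that case.
    return score < threshold
```

The threshold is where the false-positive trade-off shows up: lowering it rejects more images, including more harmless ones, while raising it lets more borderline content through.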


With all of the above in place, I believe we will be able to re-enable image uploads with a much higher degree of safety. Of course, most of these ideas come with some significant downsides, but please keep in mind that users posting CSAM present an existential threat to Lemmy (in addition to just being absolutely morally disgusting and actively harmful to the victims of the abuse). If the choice is between having a Lemmy instance with some restrictions, or not having a Lemmy instance at all, then I think the restrictions are the better option.

I also would appreciate your patience in this matter, as all of the long term plans require additional development, and while this is currently a high priority issue for all Lemmy admins, we are all still volunteers and do not have the freedom to dedicate huge amounts of hours to working on new features.


As always, your feedback and thoughts are appreciated, so please feel free to leave a comment if you disagree with any of the plans or if you have any suggestions on how to improve them.

  • sunaurus@lemm.ee (OP) · 1 year ago

    I agree that users should be able to join Lemmy freely, but I think it makes a lot of sense to try and spread users out more between instances - this spreads out the responsibilities between more admins, spreads out the load between more servers and also reduces the chance of a single point of failure for the whole system.

    It’s clear that there are seriously vile people out there who want to cause huge amounts of damage to Lemmy, and if we have unlimited growth in a few selected instances, then these people only have to target those specific instances for maximum damage.

    In a perfect world, none of this would be necessary, but then again, in a perfect world, we wouldn’t need a decentralized platform in the first place.

    • eee@lemm.ee · 1 year ago (edited)

      Thanks for responding!

      I agree that it’s best for the lemmyverse if there are many big instances too.

      Unfortunately, the concept of the fediverse isn’t as easy to understand. The average newcomer (who mostly just wants to consume content and occasionally ask a question or two) starts off by interacting within their instance, and it takes some time to figure out cross-instance communication (there are still posts about this on the nostupidquestions-type communities). For such users, landing on a small instance means they’ll poke around the Local active posts, think that “this forum is dead”, and never return.

      As with reddit, having a large userbase on the lemmyverse is important to keep the conversation interesting (see https://i.imgur.com/4tXHAO0.png). Reddit has provided lemmy with a huge shot at success by injecting a large number of users. But if I’m being honest, the conversation on the lemmyverse isn’t as diverse and engaging as it is on reddit yet. This isn’t self-sustaining yet. I can point to 2 pieces of evidence to support this:

      1. Using Voat as an (imperfect) proxy - I don’t know if there are official stats of Voat, but the best dataset I’ve seen for Voat (https://ojs.aaai.org/index.php/ICWSM/article/download/19382/19154/23395) has 16.2M comments in 2.3M submissions from 113k users. Voat was shut down for lack of funding, but even in its heyday it wasn’t exactly thriving - many people on Voat were united in their toxicity and it never really got going. Compare these numbers to the lemmyverse, which has about 100k active users over the last 6 months. If the fediverse is to grow beyond “that niche forum for nerds”, this userbase isn’t enough.

      2. It’s already clear that the number of active users is decreasing - since mid-July, the number of monthly active users has dropped from 70k to 50k. This is expected (bunch of redditors who joined in June, poked around and said hi and left), but it means if the lemmyverse wants to have any chance of succeeding long term, you can’t alienate new users now.

      The approach I’ve been advocating since the beginning of lemmy is:

      • if you see a user who’s interested in lemmy but isn’t really tech savvy, just point them to one of the biggest instances. Don’t explain what federation is, leave it as a feature to be discovered once they’re engaged.
      • if you see a user who’s interested in the concept of a fediverse and wants to know how it works, explain federation and send them to a smaller instance.

      The way federation works now, it’s still disadvantageous to be on a smaller instance (discoverability of new communities is harder, syncing posts/comments isn’t always fast, it’s hard to know which community is more active). Many of these can be fixed with changes to activitypub and the lemmy protocol, but in the meantime, sending casual users to small instances means they’ll likely never return.

      So to sum up, I think there should be an avenue for casual users to join the biggest instances, even as we encourage people to move to smaller ones (either targeting those who are more tech savvy, or those who have already been on Lemmy long enough to know how it works - I myself was on Lemmy.world and switched to this “smaller” instance).

      Anyway, you’re the admins here and I have no say over what you eventually do. I’m just hoping you’ll consider the practical realities of user behavior - everyone wants what’s best for the fediverse in the long term.

      • Blaze · 1 year ago

        “discoverability of new communities is harder”

        https://github.com/Fmstrat/lcs

        “syncing posts/comments isn’t always fast”

        My experience is the opposite, but that may be instance dependent

        “it’s hard to know which community is more active”

        Active users stats are the same on every instance for communities