How I ranked Lemmy instance visibility by sort type

I’m building Stakswipe, a mobile Lemmy client with a swipe-to-vote UI. One of the problems I ran into was the anonymous feed: when you’re not logged in, you need to pick an instance to connect to upfront, and that choice matters more than most people realise. Rather than guessing or defaulting to lemmy.world, I wanted to measure objectively which instances give the best view of Lemmy for someone browsing without an account. I put together a benchmarking script that tests the top 20 instances by monthly active users and scores them.

It’s worth noting that some instances defederate from others or block certain communities, and that’s completely fine; it’s one of Lemmy’s strengths. Communities and instances should be able to self-govern and set their own norms. These rankings aren’t a judgment on any instance’s moderation choices; they’re purely about federation breadth and how much of what Lemmy has to offer is visible in a general anonymous feed.

What I measured

For each instance × sort combination, I fetched posts anonymously until I hit a quality floor suited to that sort’s algorithm:

  • Active (48 hours) — Lemmy’s Active sort decays based on the most recent comment time, capped at 48 hours. The script stops when the last post on a page has been quiet for more than two days.
  • Hot (24 hours) — Hot decays based on post publication time using a gravity of 1.8. Anything older than a day is well past its peak rank.
  • New (2 hours) — Purely chronological. I stop at 2 hours to focus on live traffic.
  • Top 6 Hour / 12 Hour / Day — These are ranked by score within a fixed server-side time window, so the feed exhausts naturally. I stop when posts drop below 5, 10, and 20 absolute votes respectively.
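A minimal sketch of how the stop conditions above could be expressed, not the actual script. The flattened post fields (`published`, `newest_comment_time`, `votes`), the `fetch_page` callable, and the `max_pages` cap are all hypothetical; the sort names are assumed to match Lemmy's sort enum, and `votes` stands for whatever vote count the thresholds are applied to.

```python
from datetime import datetime, timedelta, timezone

# Age cutoffs for the time-based sorts, taken from the list above.
AGE_LIMITS = {
    "Active": timedelta(hours=48),
    "Hot": timedelta(hours=24),
    "New": timedelta(hours=2),
}
# Vote floors for the Top sorts, taken from the list above.
# Sort names are assumed to match the API's enum values.
VOTE_FLOORS = {"TopSixHour": 5, "TopTwelveHour": 10, "TopDay": 20}


def should_stop(sort, last_post, now):
    """Return True once the last post on a page is below the quality floor."""
    if sort in VOTE_FLOORS:
        return last_post["votes"] < VOTE_FLOORS[sort]
    # Active is aged from the newest comment; Hot and New from publication.
    key = "newest_comment_time" if sort == "Active" else "published"
    return now - last_post[key] > AGE_LIMITS[sort]


def crawl(sort, fetch_page, now, max_pages=100):
    """Collect posts page by page until the floor is hit or the feed runs dry.

    fetch_page(sort, page) is a hypothetical callable returning a list of
    post dicts for that page (an empty list when there are no more posts).
    """
    posts = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(sort, page)
        if not batch:
            break
        posts.extend(batch)
        if should_stop(sort, batch[-1], now):
            break
    return posts
```

Keeping the network call behind `fetch_page` means the same loop can be pointed at any instance, or at canned pages when testing.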

How scoring works

Every post has a canonical ActivityPub ID (ap_id). The script builds a universe: the union of all unique posts seen across all instances for a given sort. What is measured is each instance’s ability to surface that universe.
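As a rough illustration (a sketch, not the script itself), the universe and each instance's share of it can be computed with plain sets. The function names and the input shape, a dict mapping instance name to the ap_ids it returned for one sort, are assumptions made for the example.

```python
def build_universe(seen_by_instance):
    """Union of every ap_id returned by any instance for one sort."""
    universe = set()
    for ap_ids in seen_by_instance.values():
        universe |= set(ap_ids)
    return universe


def coverage(seen_by_instance):
    """Fraction of the universe that each instance surfaced."""
    universe = build_universe(seen_by_instance)
    if not universe:
        return {name: 0.0 for name in seen_by_instance}
    return {
        name: len(set(ap_ids)) / len(universe)
        for name, ap_ids in seen_by_instance.items()
    }
```

Because ap_id is the same on every instance that has federated a post, deduplicating on it avoids counting one post several times.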

Each instance is scored across four metrics, then the scores are normalized against the best performer and combined:

Metric               Weight  What it captures
Posts visible        40%     Federation breadth: how much of the universe this instance sees
Comments visible     35%     Federation depth: whether post threads actually federate
Post vote totals     15%     Signal quality: are the votes syncing, or just stubs?
Comment vote totals  10%     Thread engagement fidelity
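One way the normalize-then-weight step could look, as a sketch under assumptions: the metric keys and the `raw` input shape are hypothetical, and each metric is divided by the best performer's value before the weights from the table are applied.

```python
# Weights from the table above; the key names are made up for this example.
WEIGHTS = {
    "posts": 0.40,
    "comments": 0.35,
    "post_votes": 0.15,
    "comment_votes": 0.10,
}


def weighted_scores(raw):
    """Combine per-instance raw totals into a single score between 0 and 1.

    raw maps instance name -> {metric: total}. Each metric is normalized
    against the best value seen for it, then multiplied by its weight.
    """
    best = {m: max(vals[m] for vals in raw.values()) for m in WEIGHTS}
    scores = {}
    for name, vals in raw.items():
        scores[name] = sum(
            weight * (vals[m] / best[m] if best[m] else 0.0)
            for m, weight in WEIGHTS.items()
        )
    return scores
```

An instance that leads on every metric scores 1.0; one that sees half the posts but is level on everything else loses 0.20, which is why post visibility dominates the ranking.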

The instances with the highest weighted score for each sort type are shown.

    • PotatoesFall (7 upvotes, 4 days ago)

      I thought blahaj was defederated from a few instances? Interesting that the score is so high anyway.

      • CombatWombat@feddit.online (13 upvotes, 4 days ago)

        Everyone is defederated from somewhere. There’s an entire dark fedi that basically everyone has defederated from.

      • Catoblepas@piefed.blahaj.zone (5 upvotes, 1 downvote, 4 days ago)

        It might be that we’re defederated mostly from instances that everyone else also defederates from, so that wouldn’t impact the score? Or maybe the instances blocked by Blahaj don’t put out much content. You can see here who is defederated.

        E: that’s actually for Piefed Blahaj but I’m pretty sure it’s the same

  • wiki_me@lemmy.ml (2 upvotes, 3 days ago)

    Pretty sure you want to use the AGPL and not GPL.

    The ideal IMO would be some kind of experiment (probably something people opt in to): give people feeds, have them rate each on a 1-10 scale, see which is better, and run a statistical hypothesis test to make sure the differences are not due to luck.

    • Not_mikey@lemmy.dbzer0.com (OP) (1 upvote, 3 days ago)

      Thanks for the license tip!

      Want to keep this as just a frontend though, don’t want to spin up a backend unless necessary. Would there be a way to run a test just serving the client?

      • wiki_me@lemmy.ml (1 upvote, 2 days ago)

        I guess maybe set up a community on Lemmy and have the client automatically make posts to it, or comments on a specific post?

        Or maybe have a bot that clients send messages to?

        Lemmy has a bunch of libraries for automation so that could be used to extract the information you need for the test.

        Of course that probably makes it easier to feed in fake data (e.g. if China wants to make lemmy.ml more popular).