Software engineer and collector of expensive hobbies.

I was /u/TortoiseWrath but then reddit imploded.

  • 2 Posts
  • 27 Comments
Joined 1 year ago
cake
Cake day: June 15th, 2023

help-circle



  • Or driving in general. As an American who didn’t get a driver’s license until I was 21 (gasp! so old) due to some reasons, I can attest that many, many people here simply can’t comprehend the idea of someone over 17 or so not having one. I got turned away from a hotel once because they didn’t know how to use a passport as an ID.

    The only other people I’ve met with this problem were immigrants. And we were always able to bond over lamentations of how difficult it is to solve this problem… the entire system to get a license here is built around the assumption that everyone does it in high school, so every step of the way is some roadblock like “simply drive to your driving test appointment”…








  • Ehhhhhhh. Using a relational database for Lemmy was certainly a choice, but I don’t think it’s necessarily a bad one.

    Within Lemmy, by far the most expensive part of the database is going to be comment trees, and within the industry the consensus on the best database structure to represent these is… well, there isn’t one. The efficiency of this depends way more on how you implement it within a given database model than on the database model itself. Comment trees are actually a pretty difficult problem; you’ll notice a lot of platforms have limits on comment depth, and there’s a reason for that. Getting just one level of replies to work efficiently can be tricky, regardless of the choice of DBMS.

    Looking at the schema Lemmy uses, I see a couple opportunities to optimize it down the road. One of the first things I noticed is that comment replies don’t seem to be directly related back to the top-level post, meaning you’re restricted to a breadth-first search of the comment tree at serving time. Most comments will be at pretty shallow depths, so it sometimes makes sense to flatten the first few levels of this structure so you can get most relevant comments in a single query and rebuild the tree post-fetching. But this makes nomination (i.e. getting the “top 100” or whatever comments to show on your page) a lot more difficult, so it makes sense that it’s currently written the way it is.

    If it’s true (as another commenter said) that there’s no response caching for comment queries, that’s a much bigger opportunity for optimization than anything else in the database.











  • Not relevant to lemmy (yet), but this does break down a bit at very large scales. (Source: am infra eng at YouTube.)

    System architecture (particularly storage) is certainly by far the largest contributor to web performance, but the language of choice and its execution environment can matter. It’s not so important when it’s the difference between using 51% and 50% of some server’s CPU or serving requests in 101 vs 100 ms, but when it’s the difference between running 5100 and 5000 servers or blocking threads for 101 vs 100 CPU-hours per second, you’ll feel it.

    Languages also build up cultures and ecosystems surrounding them that can lend themselves to certain architectural decisions that might not be beneficial. I think one of the major reasons they migrated the YouTube backend to C++ isn’t really anything to do with the core languages themselves, but the fact that existing C++ libraries tend to be way more optimized than their Python equivalents, so we wouldn’t have to invest as much in developing more efficient libraries.