• emb@lemmy.world
    1 month ago

    I wonder that too. How do you separate cross-language homonyms and nonsense words in URLs?

    For any individual page, I guess you'd fall back on the page content if the URL's language is ambiguous. Like anything with language, it feels like it’d be fuzzy and hard to determine.

    Not that I necessarily doubt someone has collected the data; I'm just not sure how internet statistics are figured out.