In late January 2023, almost 45 GB of source code from the Russian search giant Yandex was leaked on BreachForums by a former Yandex employee. While the leak itself did not contain user data, it reportedly contained the source code for all major Yandex services, including Metrika, which collects user analytics through a widely used SDK, and Crypta, Yandex’s behavioral analytics technology.
I got involved when a fellow privacy researcher reached out to verify what he’d found in a different part of the codebase. After spending the week digging around and verifying his findings, on Friday night I sat down with a glass of wine and decided to dig into something I was curious about. While there has been lots of speculation about what Yandex could do with the massive amounts of data it collects, this is the first time outsiders have been able to peek behind the curtain to confirm it, and what I’ve found is both fascinating and deeply unsettling.