Over the holidays, Alex Lieberman had an idea: What if he could create Spotify "Wrapped" for his text messages? Without writing a single line of code, Lieberman, a co-founder of the media outlet Morning Brew, created "iMessage Wrapped," a web app that analyzed statistical trends across nearly 1 million of his texts. One chart that he showed me compared his use of lol, haha, 😂, and lmao; he's an lol guy. Another listed people he had ghosted.
Lieberman did all of this using Claude Code, an AI tool made by the start-up Anthropic, he told me. In recent weeks, the tech world has gone wild over the bot. One executive used it to create a custom viewer for his MRI scan, while another had it analyze their DNA. The life optimizers have deployed Claude Code to collate information from disparate sources (email inboxes, text messages, calendars, to-do lists) into personalized daily briefs. Though Claude Code is technically an AI coding tool (hence its name), the bot can do all sorts of computer work: book theater tickets, process shopping returns, order DoorDash. People are using it to manage their personal finances, and to grow plants: With the right equipment, the bot can monitor soil moisture, leaf temperature, CO2, and more.
Some of these use cases likely require some preexisting technical know-how. (You can't just fire up Claude Code and expect it to grow you a tomato plant.) I don't have any professional programming experience myself, but as soon as I installed Claude Code last week, I was obsessed. Within minutes, I had created a new personal website without writing a single line of code. Later, I hooked the bot up to my email, where it summarized my unread emails, and sent messages on my behalf. For years, Silicon Valley has been promising (and critics have been fearing) powerful AI agents capable of automating many aspects of white-collar work. The progress has been underwhelming. Until now.
I'm generally very skeptical of "AI" shit. but I work at a tech company, which has recently mandated "AI agents are the future, we expect everyone to use them every day"
so I've started using Claude. partially out of self-preservation (since my company is handing out credentials, they are able to track everyone's usage, and I don't want to stick out by showing up at the very bottom of the usage metrics) and partially out of open-mindedness (I think LLMs are a pile of shit and very environmentally wasteful, but it's possible that I'm wrong and LLMs are useful but still very environmentally wasteful)
fwiw, I have a bunch of coworkers who are generally much more enthusiastic about LLMs than I am. and their consensus is that Claude Code is indeed the best of the available LLM tools. specifically they really like the new Opus 4.5 model. Opus 4.1 is total dogshit, apparently; no one uses it anymore. AFAIK Opus 4.2, 4.3, and 4.4 don't exist. version numbering is hard.
is Claude Code better than ChatGPT? yeah, sure. for one thing, it doesn't try to be a fucking all-purpose "chatbot". it isn't sycophantic in the same way. which is good, because if my job mandated me to use ChatGPT I'd quit, set fire to my work laptop, dump the ashes into the ocean, and then shoot the ocean with a gun.
I used Claude to write a one-off bash script that analyzed a big pile of JSON & YAML files. it did a pretty good job of it. I did get the overall task done more quickly, but I think a big part of that is that writing bash scripts of that level of complexity is really fucking annoying. when faced with a task where I have to do it, task avoidance kicks in and I'll procrastinate by doing something else.
importantly, the output of the script was a text file that I sent to one of my coworkers and said "here's that thing you wanted, review it and let me know if it makes sense". it wasn't mission critical at all. if they had responded that the text file was wrong, I could have told them "oh sorry, Claude totally fucked up" and poked at Claude to write a different script.
and at the same time… it still sucks. maybe these models are indeed getting "smarter", but people continue to overestimate their intelligence. it is still Dunning-Kruger As A Service.
this week we had what infosec people call an "oopsie" with some other code that Claude had written.
there was a pre-existing library that expected an authentication token to be provided as an environment variable (on its own, a fairly reasonable thing to do)
there was a web server that took HTTP requests, and the job Claude was given was to write code that would call this library in order to build a response to the request.
Claude, being very smart and very good at drawing a straight line between two points, wrote code that took the authentication token from the HTTP request header, modified the process's environment variables, then called the library
(98% of people have no idea what I just said, 2% of people have their jaws on the floor and are slowly backing away from their computer while making the sign of the cross)
for the uninitiated - a process's environment variables are global. and HTTP servers are famously pretty good at dealing with multiple requests at once. this means that user A and user B could make requests at the same time, and user A would end up seeing user B's data entirely by accident, without trying to hack or do anything malicious at all. and if user A refreshed the page they might see their own data, or they might see user C's data, entirely from luck of the draw.
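the shape of the bug is easy to reproduce. here's a minimal Python sketch (hypothetical variable names, not the actual code): two "request handlers" each stash their caller's token in an environment variable before reading it back, the way the library would. a barrier forces the overlap that a busy server produces naturally.

```python
import os
import threading

def handle_request(token, results, barrier):
    # Buggy pattern: stuff the caller's token into process-global state.
    # Environment variables are shared by ALL threads in the process.
    os.environ["AUTH_TOKEN"] = token
    barrier.wait()  # force the two "requests" to overlap
    # The "library" now reads the env var -- it may see the other user's token.
    results[token] = os.environ["AUTH_TOKEN"]

results = {}
barrier = threading.Barrier(2)
threads = [threading.Thread(target=handle_request, args=(t, results, barrier))
           for t in ("token-user-A", "token-user-B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both writes happen before either read, so both handlers see the same
# (last-written) token -- meaning at least one handler got the wrong one.
print(results)
```

user A's handler and user B's handler end up holding the same token, which is exactly the "refresh and see someone else's data" behavior described above.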
Claude, being very smart and very good at drawing a straight line between two points, wrote code that took the authentication token from the HTTP request header, modified the process's environment variables, then called the library
Brilliant, 10/10.
took the authentication token from the HTTP request header, modified the process's environment variables, then called the library
Not to defend claude or anything, but I had a junior do something extremely similar to this once. Lol
Yep, this is exactly how most people describe using an AI chat bot to write code.
It's a junior developer who can't learn.
That sounds so frustrating to me.
In all fairness, while this is a particularly bad case, the fact that it's often very difficult to safely fiddle with environment variables at runtime, but very convenient as a way to cram extra parameters into a library, has meant that a lot of human programmers who should know better have created problems like this too.
IIRC, setting the timezone for some of the POSIX time APIs on Linux has the same problem, and that's a system library. And IIRC SDL and some other graphics libraries, and some Linux 3D stuff, have used environment variables as a way to pass parameters out-of-band to libraries, which becomes a problem when programs start dicking with them at runtime. I remember reading an article from someone who had been banging into this with Linux gaming, about how various game programs and libraries would call setenv() to fiddle with them, and races associated with that were responsible for a substantial number of crashes that they'd seen. setenv() is not thread-safe or signal-safe. In general, reading environment variables in a program is fine, but messing with them in very many situations is not.

*searches*
Yeah, the first thing I see is someone talking about how its lack of thread-safety is a problem for TZ, which is the time thing that's been a pain for me a couple times in the past.
https://news.ycombinator.com/item?id=38342642
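the TZ case is easy to see for yourself. a quick Python sketch (Unix-only, since time.tzset() isn't available on Windows): changing the TZ environment variable changes what every localtime() call in the whole process returns from then on.

```python
import os
import time

# TZ is process-global state. Rendering the same instant (the Unix epoch)
# under two different TZ values gives two different wall-clock hours.
os.environ["TZ"] = "UTC"
time.tzset()
print(time.localtime(0).tm_hour)  # -> 0: the epoch is midnight UTC

os.environ["TZ"] = "America/New_York"
time.tzset()
print(time.localtime(0).tm_hour)  # -> 19: same instant, Dec 31 1969, 7 p.m. EST
```

now imagine two threads doing that concurrently, each expecting "its" timezone to stick, and you have the same race as the auth-token bug.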
Back on your issue:
Claude, being very smart and very good at drawing a straight line between two points, wrote code that took the authentication token from the HTTP request header, modified the process's environment variables, then called the library
for the uninitiated - a process's environment variables are global. and HTTP servers are famously pretty good at dealing with multiple requests at once.
Note also that a number of webservers used to fork to handle requests (and I'm sure that there are still some now that do so, though it's certainly not the highest-performance way to do things), and in that situation, this code could avoid problems.
*searches*
It sounds like Apache used to and apparently still can do this:
https://old.reddit.com/r/PHP/comments/102vqa2/why_does_apache_spew_a_new_process_for_each/
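the fork case is also easy to demonstrate. a minimal Python sketch (Unix-only, uses os.fork()): each forked child gets its own copy of the environment, so a per-request worker mutating it never leaks into the parent or into other requests.

```python
import os

os.environ["AUTH_TOKEN"] = "parent-token"

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child (the per-request worker): this mutation only touches the
    # child's own copy of the environment, not the parent's.
    os.environ["AUTH_TOKEN"] = "request-token"
    os.write(w, os.environ["AUTH_TOKEN"].encode())
    os._exit(0)

os.waitpid(pid, 0)
child_saw = os.read(r, 64).decode()
print(child_saw)                  # the child saw its own token
print(os.environ["AUTH_TOKEN"])  # parent's value is unchanged
```

which is why the env-var-per-request pattern can limp along unnoticed under a forking server and then explode the moment someone switches to threads.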
But it does highlight one of the "LLMs don't have a broad, deep understanding of the world, and that creates problems for coding" issues that people have talked about. Like, part of what someone is doing when writing software is identifying situations where behavior isn't defined and clarifying that, either by asking for requirements to be updated or by looking out-of-band to understand what's appropriate. An LLM that works by looking at what's commonly done in its training set just isn't in a good place to do that, and that's kinda a fundamental limitation.
I'm pretty sure that the general case of writing software is AI-hard, where the "AI" referred to by the term is an artificial general intelligence that incorporates a lot of knowledge about the world. That is, you can probably make an AI that writes software, but it won't be just an LLM of the "generative AI" sort of thing that we have now.
There might be ways that you could incorporate an LLM into a system that can write software. But I don't think that it's just going to be a raw "rely on an LLM taking in a human-language set of requirements and spitting out code". There are just things that that can't handle reasonably.
This is interesting, but I wonder how he verified the data it was spitting out if he doesn't know how to code?
He doesn't need to understand it. Claude understands it.
Hail, all-knowing Claude
Oh my god … they fixed Landru?
If he doesn't care or need to verify it, then it doesn't really matter.
These tools are great at creating demoable MVPs. They're terrible at creating maintainable codebases, and cannot be relied on to generate correct code. But if all you need is a demo or MVP, then it's likely you don't care, and that's often the case for personal tools that non-coders want to use.
The people using it to manage their personal finances are nuts though.
Ah yeah, I'm with you. I actually think LLMs are a useful tool for that initial push: a search query, a rough draft (or demo). But I'm not convinced they could ever move beyond that, since creating rigid, reliable structure isn't what they're designed to do.
You can't fully verify it, but Claude is somewhat chatty. It'll output its whole "thought process", which can be reviewed. I recently had Claude write some C# analyzers for me, which I don't quite know how to write from scratch. I can easily review its reasoning and correct it if it makes a mistake. It'll say something like "Oh, I need to change X or Y" and you can then tell it it's an idiot and correct it.
It's by no means perfect and it does need a good reviewer though. I've seen it just "give up" fixing a test, subsequently deleting the test entirely. If you're a good code reviewer, you can probably fairly effectively use these tools.