I swear to god, someone must have written an intermediary language between regex and actual programming, or I’m going to eventually do it before I blow my fucking brains out.
How do you think that would look? Regex isn’t particularly complicated, there’s just a bit to remember. I’m trying to picture how you would represent a regular expression in a higher-level language. I think one of its biggest benefits is the ability to shove so much information into a random-looking string. I suppose you could write functions like startswith, endswith, alpha(4), or something like that, but in the end, is that better?
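To make the question concrete, here is a rough sketch of what such a function-style wrapper could look like in Python. Everything in it is hypothetical: the Pattern class and the starts_with, alpha, digits and ends_with methods are names made up for illustration, not an existing library. Each call just appends a fragment to an ordinary regex string, which is also why the mental model stays the same.

```python
import re


class Pattern:
    """Tiny, hypothetical regex builder; all method names are invented for illustration."""

    def __init__(self, source=""):
        self.source = source

    def _add(self, fragment):
        # Return a new Pattern so chained calls never mutate a shared object.
        return Pattern(self.source + fragment)

    def starts_with(self, text):
        # Anchor at the start and match the literal text.
        return self._add("^" + re.escape(text))

    def alpha(self, count):
        # Exactly `count` ASCII letters.
        return self._add("[A-Za-z]{%d}" % count)

    def digits(self, count):
        # Exactly `count` digits.
        return self._add(r"\d{%d}" % count)

    def ends_with(self, text):
        # Match the literal text and anchor at the end.
        return self._add(re.escape(text) + "$")

    def compile(self):
        return re.compile(self.source)


# "ID-" followed by four letters, two digits, and a closing "!".
order_id = Pattern().starts_with("ID-").alpha(4).digits(2).ends_with("!")
print(order_id.source)
print(bool(order_id.compile().match("ID-abcd42!")))
```

Whether the chained version is better than the raw string is exactly the open question; it is longer, but each piece has a name.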
There’s a built-in feature in Perl that only a few of the languages claiming PCRE compatibility have actually implemented, and it makes things a lot more readable. The /x modifier lets you put in whitespace and comments. That alone helps a lot if you stick to good indentation practices.
If all other code were written like an obfuscated C contest entry, it would be horrible. For some reason, we put up with this in regex, and we don’t have to.
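For anyone not on Perl: Python’s re module has the same idea under the name re.VERBOSE (also spelled re.X). A minimal sketch, using a made-up date pattern purely as an example, showing the compact form next to the commented one:

```python
import re

# Compact form: correct, but hard to scan.
compact = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")

# Same pattern with re.VERBOSE: unescaped whitespace is ignored
# and "#" starts a comment that runs to the end of the line.
verbose = re.compile(
    r"""
    ^
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
    $
    """,
    re.VERBOSE,
)

sample = "2022-06-06"
# Both patterns capture the same groups from the same input.
print(compact.match(sample).groups() == verbose.match(sample).groups())
```

The one catch is that a literal space or "#" inside a verbose pattern has to be escaped or put in a character class, since both are otherwise treated as layout.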
I agree, but then there are also some other niceties that come from expression parsers in the language itself (as noted in the article): syntax highlighting, LSP, a more complete AST for editors like Helix.
Syntax highlighting works fine as long as your language has a way to distinguish regexes from common strings. Another place where Perl did it right decades ago and the industry ignored it.
Nah, the language itself should be as simple as possible. Bloating it with endless extensibility and features is exactly what makes Perl a write-only language in many cases and why it is becoming less and less relevant with time.
Just outta curiosity:
Full o1 model
Claude 3.5 Haiku:
Never used elisp, no idea if any of this is right lmao
Claude at least created an elisp function that looks ok
3.5 Sonnet might do a lot better, idk, I’m on the free plan with Claude lmao
o1 without Markdown misformatting:
No idea what the rectangles are supposed to be, I just copy-pasted it
They are valid Unicode code points that your font doesn’t know about.
… or at least they represent that, but I think there’s a character that looks like one too.
People have unironically done that. No, it isn’t better. The fundamental mental model is the same.
I honestly think it can be a lot more readable, especially when the regex would have been in the thousands of characters.
https://wumpus-cave.net/post/2022/06/2022-06-06-how-to-write-regexes-that-are-almost-readable/index.html