Could we make C arrays memory safe? Probably not, but let's try anyway

petsoi · 3 years ago

NeoNachtwaechter@lemmy.world · 3 years ago

Could we? So many languages and libraries have done it.

Should we? Pointer stunts are fun to use.

PuppyOSAndCoffee@lemmy.ml · 3 years ago

whaat…c arrays are safe, just make sure to avoid bugs & use cases that make them unsafe! ;P

imo, when safety is required, use a different language with in-grammar options.

https://ziglearn.org/chapter-1/#runtime-safety

aksdb@feddit.de · 3 years ago

I am always baffled that C didn’t ever get a native string type. Strings are used in what feels like 99.99999% of the applications written. Having proper strings that don’t require fiddling with pointers on bytes would likely prevent more than 50% of security issues out there.

INeedMana@lemmy.world · 3 years ago

Without system or external libraries, C is more like easier-to-read assembly, without much on top of it. There are no strings as we understand them in assembly, only pointers to a sequential lump of RAM where a NUL character marks the end of the string. That’s why C is so great as a language for libraries at the level where strings are only for debugging and a waste of computing time anyway.
But for some reason, instead of writing a library in C and then linking to it from some high-level language to handle the operations where strings are common, people often try to use the hammer for everything and end up with overflowing buffers or trying to make exceptions in the kernel for D-Bus.

aksdb@feddit.de · 3 years ago

I know that C is meant only as a little step forward from assembler. But it was also fine to introduce arrays (which are also not a thing in asm where you only operate on pointers). So why not also add a datatype for THE ONE THING that is used almost everywhere?

There are thousands of useful things one could argue about if they would make sense in the language or not (and in the case of C I would be totally fine with saying “no” to all of them). But strings IMO are not just some fringe feature that is used here and there. They are mission critical across the board and far too important to be left for libraries.

INeedMana@lemmy.world · 3 years ago

> arrays (which are also not a thing in asm where you only operate on pointers)

I’m afraid that’s wrong. Arrays are definitely an asm thing. An array is just a pointer to the first object of consecutively stored objects. You add n*size_of_stored_type to the pointer and you get the nth object
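To make the "pointer plus n*size" description concrete, here is a minimal sketch in C. The helper names `nth_by_bytes` and `nth_by_index` are made up for illustration; the first does the byte arithmetic by hand, the second lets the compiler do the scaling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fetch element n "the assembly way", by adding
   n * sizeof(int) bytes to the address of the first element. */
int nth_by_bytes(const int *base, size_t n)
{
    assert(base != NULL);
    const unsigned char *bytes = (const unsigned char *)base;
    const int *elem = (const int *)(bytes + n * sizeof(int));
    return *elem;
}

/* The same access with C's array syntax; the compiler scales n itself. */
int nth_by_index(const int *base, size_t n)
{
    assert(base != NULL);
    return base[n];
}
```

Neither version knows how long the array is, so neither can tell a valid n from an out-of-bounds one, which is the whole problem this thread is about.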

> They are mission critical

Do you have an example? I know that many products abandon having control over what is executed because that’s cheaper in money and developer time, and lean on the power of the CPU instead. So instead of securely comparing a string once and then using an enum (int) in further code, they use string comparison all the time. But that’s a design problem, not a technical one.
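A rough sketch of the "compare once, then use an enum" idea, assuming the input is already a valid NUL-terminated string. The command names and both function names are invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical command set, for illustration only. */
enum command { CMD_UNKNOWN, CMD_START, CMD_STOP };

/* Compare the text once, at the edge of the program. */
enum command parse_command(const char *text)
{
    assert(text != NULL);
    if (strcmp(text, "start") == 0) return CMD_START;
    if (strcmp(text, "stop") == 0)  return CMD_STOP;
    return CMD_UNKNOWN;
}

/* Inner code only ever sees the enum, never the string. */
int command_is_valid(enum command cmd)
{
    switch (cmd) {
    case CMD_START:
    case CMD_STOP:
        return 1;
    default:
        return 0;
    }
}
```

After `parse_command` has run, the rest of the code can switch on an int and no string comparison is left in the hot path.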

aksdb@feddit.de · 3 years ago

Basically every program that deals with some form of user input will come across strings. Be it to print something to the screen, write something to a file, read something from a file, read something from the user interface (even if it’s stdin). Even most non-user-facing tools (daemons, drivers, etc) have to deal with strings often enough, even if “just” for something like writing log or debug entries.

For me it’s hard to come up with any application where I don’t need strings sooner or later. Typically sooner than later.

INeedMana@lemmy.world · 3 years ago

But this is high level. You shouldn’t rely on strings or user input down in the mission critical part of the program

aksdb@feddit.de · 3 years ago

Do you separate that? I mean if the idea is to use C only outside of user interaction, then maybe. But is this a realistic scenario? If I write my whole application/library in C, user interaction is part of the application nonetheless. Maybe not what you consider “mission critical” from a program-reliability standpoint. But still mission critical from a user-experience standpoint. Because the whole application is worthless, if it cannot be used.

INeedMana@lemmy.world · 3 years ago

> If I write my whole application/library in C, user interaction is part of the application nonetheless

That’s my point. Human facing interface needs a lot of code that does not really do much, only needs to be there to cover all the edge cases of mixed parameters, cancel buttons, trying to click “next” without filling important textbox… And writing all this in C (I mean the actual user-end program interface, not the general GUI library, like GTK for example) only makes it worse to debug and maintain. You most often don’t get any gain from manual memory management. If an operation is taking too long maybe it’s time to put it inside the backend library. But if you’re optimizing that operation you’ve already moved away from comparing strings inside - it’s the first one to go when a loop takes too long. And once we are speaking about more than one program that we want to have consistent behavior across that might need to change in the future - C is only slowing you down.
Do you really need to reference the “Cancel” button via pointer when checking if the user should be allowed to go back?

Write a general backend library for your important stuff and optimizations in C, so you can easily load it in other languages. And then use something higher level for the interpreter/GUI where sanitizing user input is 5 lines of different libraries from the language (I mean like re or zip in Python - these are not external, they are part of Python’s standard library), instead of 50 lines of juggling pointers, which in C you would be doing even if the input was all ints.

You don’t care about stack height and jumping to previous frame after being in a procedure (assembler level of looking at the code) - that’s what C does for you
So why care about the pointers and structs when resizing a GUI? - Let some higher level language manage that for you

257m@sh.itjust.works · 3 years ago

You can just write your own implementation (it’s not too complex) or just use a string library.

aksdb@feddit.de · 3 years ago

I can also write my own programming language. That wasn’t the point.

FooBarrington@lemmy.world · 3 years ago

This is part of the problem. Instead of solid primitives you have to implement them yourself or pull in a library, both of which you have to hope are compatible with other libraries (or you have to convert manually all the time).

How many people who write their own string implementation do you think do so perfectly? I’d guess at most 50%. This means that basic operations in a good number of apps will have unknown bugs. Fixing bugs in application logic is one thing, but having to debug low-level type implementations is not something the average developer should do.
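For a sense of what "write your own" involves, here is a minimal sketch of a length-carrying string. The type `struct str` and its functions are hypothetical, not taken from any library. Even in this small version, appending a string to itself needs special care because `realloc` may move the buffer, which is exactly the kind of detail that is easy to get wrong.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a string that carries its own length. */
struct str {
    size_t len;   /* bytes in data, not counting the trailing NUL */
    char  *data;  /* heap buffer of len + 1 bytes, NUL-terminated */
};

/* Copy a NUL-terminated C string. Returns 0 on success, -1 on failure. */
int str_from_cstr(struct str *out, const char *src)
{
    assert(out != NULL && src != NULL);
    size_t len = strlen(src);
    char *buf = malloc(len + 1);
    if (buf == NULL) return -1;
    memcpy(buf, src, len + 1);
    out->len = len;
    out->data = buf;
    return 0;
}

/* Append b to a. Returns 0 on success, -1 on failure (a is left unchanged). */
int str_append(struct str *a, const struct str *b)
{
    assert(a != NULL && b != NULL);
    size_t a_len = a->len, b_len = b->len;
    if (b_len > SIZE_MAX - a_len - 1) return -1;   /* size would overflow */
    char *buf = realloc(a->data, a_len + b_len + 1);
    if (buf == NULL) return -1;
    /* If a and b are the same object, the old pointer may be stale now. */
    const char *src = (a == b) ? buf : b->data;
    memcpy(buf + a_len, src, b_len);
    buf[a_len + b_len] = '\0';
    a->data = buf;
    a->len = a_len + b_len;
    return 0;
}

/* Release the buffer and reset the string to empty. */
void str_free(struct str *s)
{
    assert(s != NULL);
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

This still leaves out slicing, encoding, comparison, and interop with other libraries' string types, so it supports both sides of the argument: it is short, and it is easy to get subtly wrong.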

257m@sh.itjust.works · 3 years ago

If you don’t want to do low-level programming, why use C in the first place? The whole point of using C is so you can fiddle with pointers to have absolute control. Rust and Go are great alternatives that have built-in strings.

FooBarrington@lemmy.world · 3 years ago

But why does that mean C can’t implement a native string type?

Why implement floats instead of making people do it themselves?

257m@sh.itjust.works · 3 years ago

Floats are implemented on most hardware by the instruction set, so the language has no control over those unless you’re programming on a microcontroller like an ATmega328P, in which case you have to implement them yourself. As for why there is no built-in support for strings in C, it is mostly due to C programmers hating change. Most hardcore C programmers are still using C89 (and the majority C99) and you can’t change old standards. C doesn’t need more features, it needs fewer. I am a big fan of removing for loops like Zig to make the language simpler. That way it can maintain its minimalism. The more minimalistic the language, the easier it is to write compilers.

FooBarrington@lemmy.world · 3 years ago

Modern hardware also has specific instructions to speed up C string operations for the common ways they are implemented. We rely on compiler optimisation for those as well. Why not do the same for floats?

257m@sh.itjust.works · 3 years ago

Because the language already supports it. It’s not a question of what modern hardware can do, just of backwards compatibility and not changing the language too much. There would be no point in adding these features because if you want them you can just use modern C++. There is no need for two identical languages occupying the same niche.

INeedMana@lemmy.world · 3 years ago

As language-wide change: this will require additional checks, the first thing embedded developers will ask is “how do we disable it?”
For personal growth: yeah, it’s a nice project :)
For production code: why reinvent the wheel? GLib is LGPL

Justin@lemmy.jlh.name · 3 years ago

Wouldn’t any sort of C++/Java/Rust Array/Vector work for this? You still have the possibility for runtime panics, but you’re never going to dereference an invalid memory address with those abstractions.

If the goal is to prevent any sort of runtime error with arrays, I’m thinking you’ll have to set some strict boundaries or risk running against the limits of computability.

InverseParallax@lemmy.world · 3 years ago

You’re talking about 2 things: 1. Strict aliasing to guarantee nobody does anything stupid with the pointers, and 2. Bounds checking at compile time with runtime checks for anything that can’t be guaranteed at compile time.

There are analysis passes that do this: Coverity did some, as does gcov, though less well.
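The runtime half of point 2 can be sketched by hand in plain C. This is only an illustration of the idea, not what any of those tools generate, and `struct int_span`, `span_get`, and `span_set` are made-up names: the array travels together with its length, and every access goes through a check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the array travels with its length. */
struct int_span {
    int    *data;
    size_t  len;
};

/* Checked read: returns 1 and stores the element in *out on success,
   returns 0 and leaves *out untouched if i is out of bounds. */
int span_get(struct int_span s, size_t i, int *out)
{
    assert(out != NULL);
    if (s.data == NULL || i >= s.len) return 0;
    *out = s.data[i];
    return 1;
}

/* Checked write: returns 1 on success, 0 if i is out of bounds. */
int span_set(struct int_span s, size_t i, int value)
{
    if (s.data == NULL || i >= s.len) return 0;
    s.data[i] = value;
    return 1;
}
```

The cost is one comparison per access plus the discipline of never touching `data` directly. A compiler that can prove `i < len` at compile time could drop the check, which is where the compile-time half comes in.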