Is this community still alive? If yes I've got a Rust-question.

exocortex · 2 years ago

orclev@lemmy.world · edit-2 2 years ago

Edit: OK, so Lemmy keeps stripping all the angle brackets out of my comments which makes posting any code that uses generics really hard/impossible. To work around that I’m just going to link to a gist of this post the way it’s supposed to look.

I’m going to guess you’re not super familiar with Rust yet, in which case good job making it this far with embedded Rust; that’s kind of the deep end of the pool. The embedded-hal crate that’s at the core of all these crates is a really amazing piece of engineering. It walks a fine line between defining a set of primitives that can be used across all embedded devices while not being so generic as to be useless or so specific as to exclude certain embedded devices from being supported. A big part of how it accomplishes that is by very carefully using traits and generics. Traits are easiest to work with as trait objects (dyn Trait), but those introduce dynamic dispatch, which has runtime overhead, so static dispatch is preferred. Generics bounded by traits are how you get that static dispatch.
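To make the static vs. dynamic dispatch distinction concrete, here is a minimal sketch. The Led trait and BoardLed struct are made up for illustration and are not part of embedded-hal; only the two function signatures matter.

```rust
/// Hypothetical stand-in for a HAL trait; not the real embedded-hal API.
trait Led {
    fn set(&mut self, on: bool);
    fn is_on(&self) -> bool;
}

/// Hypothetical LED that just remembers its state.
struct BoardLed {
    on: bool,
}

impl Led for BoardLed {
    fn set(&mut self, on: bool) {
        self.on = on;
    }
    fn is_on(&self) -> bool {
        self.on
    }
}

/// Static dispatch: the compiler generates one copy of this function per
/// concrete `L`, so the calls are resolved at compile time.
fn toggle_static<L: Led>(led: &mut L) {
    let next = !led.is_on();
    led.set(next);
}

/// Dynamic dispatch: `dyn Led` is a trait object, so the calls go through
/// a vtable lookup at runtime.
fn toggle_dynamic(led: &mut dyn Led) {
    let next = !led.is_on();
    led.set(next);
}
```

Both functions do the same thing; the generic one is simply specialized for each pin or peripheral type at compile time, which is the approach the embedded-hal drivers lean on.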

For a concrete example, we can look at the Pin struct declared by the rp2040-hal crate. The Pin struct is generic over two parameters: an Id that implements PinId, which is itself simply a marker trait applied to each GPIO address, and a Mode that implements PinMode, a marker trait for the various modes each Pin can be switched into. Using these you could, for instance, have a Pin type whose Id is GPIO0 and whose Mode is a PushPull output, which would indicate the GPIO0 pin that has been configured into PushPull mode. The PushPull struct implements the marker trait OutputConfig. Going back to the Pin struct for a moment, we can see that it provides a generic implementation of OutputPin, defined for any Pin whose mode carries an OutputConfig. That OutputPin trait then allows writers of drivers, such as the one for the MAX7219, to write a generic implementation that will work for literally any Pin that implements OutputPin.
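The pattern described above can be sketched in a few lines. Everything here is a simplified stand-in written for this post: the trait and struct names echo the ones mentioned above, but the Output and Input wrappers, the NUM constant, and the method bodies are assumptions and not the real rp2040-hal or embedded-hal definitions.

```rust
use std::marker::PhantomData;

// Marker traits (simplified stand-ins, not the real HAL definitions).
trait PinId {
    const NUM: u8;
}
trait OutputConfig {}
trait InputConfig {}

// One zero-sized type per GPIO.
struct Gpio0;
struct Gpio3;
impl PinId for Gpio0 {
    const NUM: u8 = 0;
}
impl PinId for Gpio3 {
    const NUM: u8 = 3;
}

// Configurations and the mode wrappers that carry them in the type.
struct PushPull;
struct PullUp;
impl OutputConfig for PushPull {}
impl InputConfig for PullUp {}
struct Output<C>(PhantomData<C>);
struct Input<C>(PhantomData<C>);

// The pin itself: which pin and which mode are both type parameters.
struct Pin<I, M> {
    high: bool,
    _id: PhantomData<I>,
    _mode: PhantomData<M>,
}

impl<I: PinId, M> Pin<I, M> {
    fn new() -> Self {
        Pin { high: false, _id: PhantomData, _mode: PhantomData }
    }
    fn number(&self) -> u8 {
        I::NUM
    }
}

// Simplified stand-in for embedded-hal's OutputPin.
trait OutputPin {
    fn set_high(&mut self);
    fn set_low(&mut self);
}

// Generic impl: every pin in an output mode gets OutputPin, input pins do not.
impl<I: PinId, C: OutputConfig> OutputPin for Pin<I, Output<C>> {
    fn set_high(&mut self) {
        self.high = true;
    }
    fn set_low(&mut self) {
        self.high = false;
    }
}

// A driver only asks for "something that implements OutputPin".
fn pulse<P: OutputPin>(pin: &mut P) {
    pin.set_high();
    pin.set_low();
}
```

With these definitions, calling pulse on a Pin<Gpio0, Output<PushPull>> compiles, while calling it on a Pin<Gpio3, Input<PullUp>> is rejected by the compiler because no OutputPin impl exists for input modes.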

Now an important point in all of that, is that generics are made concrete at compile time. While you see a declaration like this:

pub struct PinConnector<DATA, CS, SCK>
where
    DATA: OutputPin,
    CS: OutputPin,
    SCK: OutputPin,

at compile time that actually ends up looking more like a PinConnector whose three parameters have been filled in with concrete Pin types, which is declaring that you’re using GPIO pin 3 for MOSI, pin 5 for CS, and pin 2 as clock, as well as statically asserting at compile time that they’ve all been properly configured into output mode. You would, for instance, get a compile error if you attempted to pass a Pin in PullUp mode, because PullUp implements InputConfig and therefore that Pin implements InputPin, not OutputPin as required by the bounds on the PinConnector generics.

Now, that does make reading the docs for all this a little tricky, and requires some getting used to, but it’s incredibly powerful once you do understand it. One habit you’re going to want to build to make the most out of the embedded-hal ecosystem is reading the blanket and auto trait implementations in the docs; they’re really the core of what makes the entire thing function.

To make all of this even more complicated, embedded Rust docs are only half the picture; the other half is the docs for the specific hardware devices in question. For instance here is the datasheet for the Max7219. Looking at that I can already see I made a mistake in one of my previous comments. I said the max supported speed was 1 MHz, but the datasheet actually indicates it’s 10 MHz, and indeed when I double check the driver docs I linked previously they do in fact say 10 MHz, not 1 MHz. Based on the datasheet for the Max7219, I would expect that the DS parameter on the Spi device should actually be 16, as that’s the size of each serialized packet it expects over the SPI bus. However, I see that the Max7219 driver crate specifies that the Spi instance should be a Write<u8>, which is only defined for Spi with a DS value of 8 or lower. I’m guessing maybe there’s some quirk of the Max7219 command set that the driver is working around? Not really sure what’s going on there honestly.

exocortex · edit-2 2 years ago

Hey! Thank you very much! This is an incredibly well made, probably labor-intensive and (nice!) comment! (And yeah, a few code-pieces seem to disappear, but i think i understand the original meaning.)

That cleared a lot up, to be honest. I have been using rust for a while now, but i think all the more advanced features that i didn’t really have to deep-dive into before are now used all at once in the embedded context. It’s all very dense to read when only looking into the source code (or the docs). But your explanations helped tremendously (i will read them again tomorrow though).

It’s really fascinating what rust makes possible here. I haven’t really programmed too much in c++ in the embedded context, but i guess i would have to basically rewrite a lot of software if i want to use it on a different device, right?

Regarding the 8 or 16 values of the DS-values, i am not quite sure myself. I’ve found two examples where a Max7219-chip is used together with a raspberry Pi pico with Rust. One implemented the max7219-struct itself and didn’t use the max7219-crate and used the value 16 for DS… This example works on my setup.

The other example is using the max7219 crate and it needs DS=8, otherwise it doesn’t compile. It kinda works, but there seem to be some errors when i use it: if I use write_raw to set all the pixels on the display, certain values seem to change the display’s state. At a certain point it changes its intensity and suddenly switches into all-pixels-on mode. This shouldn’t happen if i only use write_raw.

But with your explanations i might understand a little more of the stuff that i used in the code. Thank you very much!

orclev@lemmy.world · edit-2 2 years ago

Honestly I’m suspecting that the driver crate is just broken, and that it is supposed to be using a value of 16 for the DS parameter. The trait constraint the from_spi function should have applied should be Write with a u16 generic, not u8 which would then allow you to use 16 as the DS parameter when initializing the Spi instance. If I had a Max7219 chip at hand I would try modifying the driver crate to verify if that’s the case, but I don’t unfortunately. Maybe open an issue on the driver repo describing the behavior you’re seeing (and maybe link him back to this thread) to see what he thinks?

As for the case with C++ code it is often more device specific, but it can also cheat a certain amount. Rust is all about safety, it doesn’t let you make a bunch of mistakes that are possible in C++. The upshot of that is that when you get a piece of Rust code to compile, it’s more often than not correct. That’s somewhat on the skill of the person writing the libraries though, you can certainly write code that can be used wrong, but a good author can often define their APIs in such a way that it’s impossible to use it incorrectly. As in the example above, the Spi instance is being constrained to a DS of 8 due to the way the Max7219 crate is defined, it’s impossible to accidentally use a DS of 16 with it, it just happens that it seems like that constraint is wrong in this case.

C++ in contrast lets you take shortcuts. For instance you can define a bunch of constants and use ifdefs to conditionally set them at compile time. For example, this random driver I found using a google search defines the Max7219 class as taking a PinName class/struct/enum (not sure which honestly), which I’m sure is defined elsewhere as the raw pin identifier constant exposed by the underlying hardware. That driver does not enforce that the pin has been configured into the proper PushPull mode prior to it being passed to the driver; it’s on you as the user of the library to make sure everything has been properly set up beforehand. It’s “easier” in that everything is basic, but it’s also error prone as it doesn’t double check your work; you’ll just get a crash at runtime.

C/C++ is very low level, barely higher than assembly. If you’re armed with the datasheets for everything you can probably make it work, but you need to be very sure you’re getting all the details right. Rust on the other hand tries to force you to use things correctly. Ideally you should have been able to grab the Max7219 crate, just use it, and have everything work. The fact that it doesn’t suggests there’s a possible bug in the crate, rather than that you’re just using it wrong, as it really should be impossible to use it wrong.

exocortex · 2 years ago

Hey thank you! it might actually be that the driver has an error. For me somebody pointing that out is actually very helpful, as I always suspect that I’m doing something wrong. But playing around with the other example that kind of implemented the max7219 interface from scratch (using the u16 for the data send) was pretty fun!

I guess I will try changing the original max7219-crate from u8 to u16 tomorrow and see what happens. I also posted an issue about that on the GitHub.

orclev@lemmy.world · edit-2 2 years ago

I decided to crack open the source of the Max7219 crate to get a better idea of what’s going on.

Reading the chip’s datasheet, it looks like it’s expecting 16 bit packets sent most significant bit first on the wire. The high byte consists of 4 bits of padding followed by a 4 bit register address (a digit or a command). The low byte is interpreted depending on the address or command in the high byte as well as what the currently set decoding mode is.
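As a rough illustration of that packet layout, here is a small sketch. It assumes the layout from the datasheet’s serial data format table (bits 15..12 ignored, bits 11..8 register address, bits 7..0 data, most significant bit shifted in first); the function names are made up for this post and are not from the max7219 crate.

```rust
/// Build one 16-bit packet.
/// Assumed layout: bits 15..12 unused, bits 11..8 address, bits 7..0 data.
fn pack(address: u8, data: u8) -> u16 {
    // Mask the address to 4 bits so stray high bits never leak into the packet.
    (u16::from(address & 0x0F) << 8) | u16::from(data)
}

/// Split a packet back into (address, data).
fn unpack(packet: u16) -> (u8, u8) {
    (((packet >> 8) & 0x0F) as u8, (packet & 0xFF) as u8)
}

/// The two bytes in the order they go on the wire when the most
/// significant bit is sent first: address byte, then data byte.
fn wire_bytes(packet: u16) -> [u8; 2] {
    packet.to_be_bytes()
}
```

For example, pack(0x0A, 0x0F) gives 0x0A0F, and wire_bytes of that is [0x0A, 0x0F].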

Looking at the code for the crate, I see in the Spi struct it declares a buffer like so buffer: [u8; MAX_DISPLAYS * 2],. I believe a more correct version of that declaration would be buffer: [u16; MAX_DISPLAYS],. Then looking at the actual implementation of the write_raw method I see this:

    fn write_raw(&mut self, addr: usize, header: u8, data: u8) -> Result<(), DataError> {
        let offset = addr * 2;
        let max_bytes = self.devices * 2;
        self.buffer = [0; MAX_DISPLAYS * 2];

        self.buffer[offset] = header;
        self.buffer[offset + 1] = data;

        self.spi
            .write(&self.buffer[0..max_bytes])
            .map_err(|_| DataError::Spi)?;

        Ok(())
    }

where once again a bunch of double counting of u8s is being done. I think a more accurate version of that would be:

    fn write_raw(&mut self, addr: usize, header: u8, data: u8) -> Result<(), DataError> {
        self.buffer = [0; MAX_DISPLAYS];

        self.buffer[addr] = u16::from_ne_bytes([header, data]);

        self.spi
            .write(&self.buffer[0..self.devices])
            .map_err(|_| DataError::Spi)?;

        Ok(())
    }

This skips messing around with packing the u8 bytes into pairs via address calculations and instead uses the from_ne_bytes function to directly pack the address/header byte and the data byte into a single u16 suitable for serialization across the SPI bus. I’m not 100% sure that from_ne_bytes is correct in this case, as I’m not entirely clear how it interacts with the native endianness of the CPU and the SPI controller, but I’m hoping that by explicitly listing the header first it ends up in the high byte. Some experimentation would be necessary there, I think, to make sure it is actually portable.
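For anyone who wants to experiment, the three standard library constructors differ only in which array element lands in the high byte. A minimal sketch (the wrapper function names are made up for this post):

```rust
/// First array element becomes the HIGH byte, on every target.
fn pack_be(header: u8, data: u8) -> u16 {
    u16::from_be_bytes([header, data])
}

/// First array element becomes the LOW byte, on every target.
fn pack_le(header: u8, data: u8) -> u16 {
    u16::from_le_bytes([header, data])
}

/// Behaves like `pack_le` on little endian targets and like `pack_be`
/// on big endian ones, so the result depends on the CPU.
fn pack_ne(header: u8, data: u8) -> u16 {
    u16::from_ne_bytes([header, data])
}
```

With header 0x0A and data 0x0F, pack_be always gives 0x0A0F and pack_le always gives 0x0F0A; pack_ne matches whichever of the two corresponds to the CPU’s native byte order, so it is the only one of the three that is not portable.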

exocortex · 2 years ago

Hi, Thank you. It took me a while, but I experimented around a little bit. I have not yet tried to fix the max7219-library though. I think it is from_be_bytes (the other one didn’t work).

But one thing that I am not understanding (I think this is a “can’t tell the forest from the trees”-situation) is how exactly multiple 8x8-matrices are connected i.e. how the data-stream looks exactly.

In your example (from the max7219-library) it seems like if I use 4 devices I send 4 times a u16 out and the 4 connected Max7219’s figure out themselves which one is meant?

orclev@lemmy.world · 2 years ago

So it took me a little while to figure out between reading the datasheet for the Max7219 and looking at the source code. Basically it’s taking advantage of a feature of the Max7219 that allows daisy chaining multiple chips off the same SPI connection. To use this feature you take N Max7219 chips and wire all their CS and CLK pins together with your controller, then run the data connection from the controller to the first chip’s DIN pin, and then the DOUT pin of the first chip to the DIN pin of the next chip. Keep chaining DOUT to DIN to daisy chain all the chips together.

In the datasheet for the Max7219 there’s this section:

For the MAX7219, serial data at DIN, sent in 16-bit packets, is shifted into the internal 16-bit shift register with each rising edge of CLK regardless of the state of LOAD. For the MAX7221, CS must be low to clock data in or out. The data is then latched into either the digit or control registers on the rising edge of LOAD/CS. LOAD/CS must go high concurrently with or after the 16th rising clock edge, but before the next rising clock edge or data will be lost. Data at DIN is propagated through the shift register and appears at DOUT 16.5 clock cycles later

Essentially what that all boils down to is that each Max7219 maintains a 16 bit internal shift register, so as each bit is received on DIN it’s pushed onto the register, and the highest bit of the register gets pushed out to DOUT. When you daisy chain multiple chips together it’s effectively like concatenating all their shift registers together. So if you have 4 chips, that’s 64 bits of register. If you write 64 bits out to MOSI, the first 16 bits will end up on the farthest out chip, the next 16 on the next farthest, etc. Switching the CS pin from low to high is the trigger for each Max7219 to actually latch and act on the contents of its shift register. The way the driver crate’s code is structured, that’s the purpose of the buffer field in the various Connector structs. So if you have, say, 4 chips, you need 4 x u16 storage, and each write cycle you write all 4 u16 values out, one to each daisy chained device. Technically the driver is less efficient than it could be: it takes advantage of the fact that writing 0 to a chip is a no-op, so while it does write to every device each time, when you call write_raw it actually zeros the buffer for all but the selected chip.

If you think about a sequence of chips, let’s say once again 4 of them labeled A to D, they would be connected like so:

RP-Pico-MOSI----DIN-A-DOUT----DIN-B-DOUT----DIN-C-DOUT----DIN-D-DOUT
       -CS----------CS------------CS------------CS------------CS
       -CLK---------CLK-----------CLK-----------CLK-----------CLK

Then you write to all four chips like so:

Set CS low
Write u16 for D
Write u16 for C
Write u16 for B
Write u16 for A
Set CS high
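The write sequence above can be checked with a small software model of the chain. This is a simplified sketch: it ignores the half-cycle delay on DOUT mentioned in the datasheet, and the Chain type and its methods are made up for illustration.

```rust
/// Software model of N daisy-chained 16-bit shift registers.
/// Index 0 is the chip closest to the controller (chip A).
struct Chain {
    regs: Vec<u16>,
}

impl Chain {
    fn new(chips: usize) -> Self {
        Chain { regs: vec![0; chips] }
    }

    /// Clock one bit into DIN of the first chip. The bit that falls off
    /// the top of each register feeds the DIN of the next chip.
    fn clock_bit(&mut self, bit: bool) {
        let mut carry = bit;
        for reg in self.regs.iter_mut() {
            let out = (*reg & 0x8000) != 0; // bit that appears on DOUT
            *reg = (*reg << 1) | u16::from(carry);
            carry = out;
        }
    }

    /// Shift a whole word out, most significant bit first.
    fn write_word(&mut self, word: u16) {
        for i in (0..16).rev() {
            self.clock_bit((word >> i) & 1 == 1);
        }
    }
}

/// Send one frame. The first word written ends up in the farthest chip.
fn send_frame(chain: &mut Chain, words_far_to_near: &[u16]) {
    // On real hardware CS would be driven low here.
    for &w in words_far_to_near {
        chain.write_word(w);
    }
    // Driving CS high here would latch every register at once.
}
```

Sending the words 0xDDDD, 0xCCCC, 0xBBBB, 0xAAAA in that order to a chain of four leaves 0xAAAA in chip A and 0xDDDD in chip D, matching the ordering in the list above.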

exocortex · 2 years ago

Thank you.

I’ve actually read the section you quoted a few times and my brain just couldn’t parse it. But i finally understand how the max7219 does this. I’ve thought about it completely wrong. It’s just shifting all the bits through from chip to chip, so obvious now.

I think i will go a step back and not use spi for a while and just do the bit-banging thingy first to get more familiar with it.

I’ve read somewhere that SPI is faster, and I guess it’s cheaper for the cpu to use, as the cpu doesn’t have to set the pin outs high or low with each cycle. Instead (i guess) the cpu can simply call a spi-out-function one time and the spi does its thing for a while while the cpu can do other things.

But right now I don’t do much yet on the rest of the CPU, so i can afford to do it manually.

Just one other question regarding multiple displays: as e.g. 4 displays requires 4x16bits does this mean that there would have to be a Write-trait implemented somewhere (or Write<[u16;4]>)?

Could it be that the max7219-crate is incomplete here? The write-function you corrected seems like it was copied 1:1 from the cpp-lib (LedControl).

orclev@lemmy.world · 2 years ago

Just one other question regarding multiple displays: as e.g. 4 displays requires 4x16bits does this mean that there would have to be a Write-trait implemented somewhere (or Write<[u16;4]>)?

Nope. The Write trait indicates the size of the “packet” that’s written on the SPI bus; it’s the equivalent of the DS generic on the Spi struct. The way SPI works is, when you toggle CS low, the device is notified that it needs to start listening on MOSI, at which point you’re free to start sending it packets. There’s no requirement that you only send a single packet; you can send as many as you want. However, many devices will have special rules about processing with respect to the state of the CS pin. E.g. just like with the Max7219, it’s common for devices to buffer commands and not actually process them until CS is sent high.

The only reason the Write and Spi generics are important is that they define the minimum number of bits that will be written to the bus (or more concretely, the stride size the SPI controller uses when reading and writing from its buffers). That’s why using u8/8 as the parameter mostly works except for occasionally demonstrating strange behavior. Using u16 guarantees that it always writes a number of bits that’s a multiple of 16, while using u8 can allow for essentially a half packet to be written.
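To show why no array-typed Write is needed, here is a hedged sketch. WordWrite is a simplified stand-in for a blocking SPI write trait that is generic over its word type, FakeBus is a test double, and write_one is a hypothetical helper; none of these are the real embedded-hal or max7219 items, and which buffer index maps to which physical chip is left open.

```rust
/// Simplified stand-in for an SPI write trait generic over the word type.
trait WordWrite<W> {
    fn write(&mut self, words: &[W]);
}

/// Fake bus that records every 16-bit word it was asked to send.
struct FakeBus {
    sent: Vec<u16>,
}

impl WordWrite<u16> for FakeBus {
    fn write(&mut self, words: &[u16]) {
        self.sent.extend_from_slice(words);
    }
}

/// Put `packet` in slot `chip` of an N-word frame and send the whole frame.
/// The other slots stay all-zero, which the chips treat as a no-op.
/// Panics if `chip >= N`.
fn write_one<S: WordWrite<u16>, const N: usize>(spi: &mut S, chip: usize, packet: u16) {
    let mut frame = [0u16; N];
    frame[chip] = packet;
    // One call, N words: the trait is still only implemented for u16.
    spi.write(&frame);
}
```

The word type fixes the size of each unit on the bus, while the slice length decides how many units go out while CS is low, so four displays just means a slice of four u16 values.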

As for bit banging vs. the SPI controller, it’s essentially the same idea as DMA, if you’re familiar with that concept. Using bit banging, the CPU spends time toggling the various pins off and on, which although fast is still relatively slow by communication standards, and it puts an upper limit on the speed data is transmitted on the SPI bus that’s directly tied to the frequency of the CPU and the number of cycles it takes to toggle a pin (per bit that’s at minimum one write for MOSI and two for CLK). Using the SPI controller on the other hand, the CPU writes bytes into memory and then passes essentially a couple of pointers to the SPI controller, then flips some bits in a register. The CPU does need to pause occasionally to refill the buffers, but that’s a relatively fast operation and is mostly decoupled from the actual bus speed of SPI.
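For the bit banging side, a minimal sketch of what the CPU has to do per bit looks like this. The Wires trait and FakeDevice are made up for this post so the loop can run without hardware; on a real board the three methods would drive actual output pins, and any timing delays are omitted.

```rust
/// The three signals a bit-banged write needs (hypothetical trait).
trait Wires {
    fn cs(&mut self, high: bool);
    fn mosi(&mut self, high: bool);
    fn clk(&mut self, high: bool);
}

/// Shift `words` out MSB first: CS low, 16 clock pulses per word, CS high.
fn bitbang_write<W: Wires>(w: &mut W, words: &[u16]) {
    w.cs(false);
    for &word in words {
        for i in (0..16).rev() {
            w.mosi((word >> i) & 1 == 1); // set up data while CLK is low
            w.clk(true); // receiver samples on the rising edge
            w.clk(false);
        }
    }
    w.cs(true); // rising edge of CS latches the data
}

/// Fake receiver that samples MOSI on each rising CLK edge.
#[derive(Default)]
struct FakeDevice {
    mosi: bool,
    clk: bool,
    shift: u64,
    latched: Option<u64>,
}

impl Wires for FakeDevice {
    fn cs(&mut self, high: bool) {
        if high {
            self.latched = Some(self.shift);
        } else {
            self.shift = 0;
        }
    }
    fn mosi(&mut self, high: bool) {
        self.mosi = high;
    }
    fn clk(&mut self, high: bool) {
        if high && !self.clk {
            self.shift = (self.shift << 1) | u64::from(self.mosi);
        }
        self.clk = high;
    }
}
```

Every bit costs three pin writes in this loop, which is the overhead the hardware SPI controller takes off the CPU.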

Manually implementing SPI with bit banging is probably a good learning exercise, but understanding how to properly use the SPI controller is also good to know. For an extra challenge you can usually also set up the SPI buffer to be managed using DMA for the most optimal way to handle things. Configuring a u16 buffer sized based on the number of devices and then using DMA to write its contents out through the SPI controller would be a very educational exercise.