Title. In other words, is there a way to make the system send workload to the NPU -first- and then to the CPU once the NPU reaches 100% usage? As if the NPU and CPU were a single, huge CPU instead of being separate?
Thanks in advance.
I’m not sure, but I’m wondering why you would want to do that. My memory on the details is vague, but iirc, NPUs perform worse when not doing LLM stuff. So you’d effectively bottleneck your system at the NPU before utilizing the correct tool, if my understanding is correct.
Isn’t the NPU more like a specialized GPU than a specialized CPU? So it would suck at non-parallel tasks.
Yeah, that’s my understanding. It just isn’t good at generalized processing, because that’s not what it was designed for. The CPU, on the other hand, was designed to be a jack of all trades.
Well, then, how about making the NPU process zram workloads (only)? I’d even ask “how about making it behave like a GPU instead of an NPU” but eh, I don’t think it’d top, or even match, the performance of… any GPU available on the market?
Because apparently everyone and their mother wants to stick an NPU on every PC, and I’m not planning on using AI ever, so… why not give it another purpose instead of letting it collect dust?
-EDIT- Oh, how about making the NPU behave like a CPU, but have it (only) process “low-process-demanding” applications like video editors, window managers, etc.? If anything, freeing up a few extra percent of CPU might be a good idea for a few PCs.
To be clear, I’m not saying your idea is bad, just that I don’t see practical benefits to making use of it, other than “it’s there doing nothing.” That might just as easily be my lack of imagination.
I like the way you think, and perhaps there’s a use case there. I have to wonder how much of a performance bump you’d get by doing something like that; min/maxing doesn’t really interest me, so I’ll wait to see benchmarks from anyone who actually tries something like this.