I’ve been an IT professional for 20 years now, but I’ve mainly dealt with Windows. I’ve worked with Linux servers throughout the years, but never had Linux as a daily driver, and I decided it was time to change. I had only two requirements: I need to be able to use my Nvidia 3080 Ti for local LLMs, and I need to be able to RDP to my work laptop running Windows 10 with multiple monitors.
My hope was to be able to get this all working and create some articles on how I did it to hopefully inspire/guide others. Unfortunately, I was not successful.
I started out with Ubuntu 22.04 and could not get the live CD to boot. After some searching, I figured out I had to go in and turn off ACPI in the boot loader. After that I was able to install Ubuntu side by side with Windows 11, but the boot loader errored out at the end of the install and Ubuntu would not boot.
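For anyone hitting the same live-CD hang, the usual workaround is passing `acpi=off` on the kernel command line. A sketch of both the one-off and the permanent version (the exact `GRUB_CMDLINE_LINUX_DEFAULT` contents may differ on your install):

```shell
# One-off: at the GRUB / live-CD menu, press 'e' on the boot entry, append
# `acpi=off` to the line starting with `linux`, then press Ctrl+X to boot.
#
# To make it permanent on an installed system:
sudo nano /etc/default/grub
#   -> add acpi=off to the default kernel options, e.g.
#      GRUB_CMDLINE_LINUX_DEFAULT="quiet splash acpi=off"
sudo update-grub    # regenerate /boot/grub/grub.cfg
```

Worth noting that `acpi=off` is a blunt instrument (it can disable power management and some devices); `noapic` or `nomodeset` are milder things to try first.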
Okay, back into Windows to download the boot loader fixer and boot to that. Alright, I’m finally able to get into Ubuntu, but only 1 of my 4 monitors is working. Install the NVIDIA driver and reboot. All my monitors work now, but my network card is broken.
Follow instructions on my phone to reinstall the linux-modules-extra package. Back into Windows to download that because, you know, no network connection. Reinstall the package; it doesn’t work. Go into advanced recovery, try restoring packages, nothing works. I can either get my monitors to work or my network card, never both at the same time.
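For reference, the offline reinstall dance looks roughly like this. The package name below is an example for a hypothetical `6.2.0-39-generic` kernel; it has to match `uname -r` on the broken install exactly:

```shell
# On a machine with network access, fetch the matching .deb (e.g. from
# packages.ubuntu.com), copy it over via USB, then on the broken machine:
uname -r                                   # confirm the running kernel version
sudo dpkg -i linux-modules-extra-6.2.0-39-generic_*.deb
sudo depmod -a                             # rebuild the module dependency maps
sudo reboot
```

A common gotcha is that installing the NVIDIA driver pulls in a newer kernel, so the modules-extra package you downloaded no longer matches the kernel that actually boots.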
I give up and decide it’s time to try out Fedora. The install process is much smoother. I boot up, and 3 of my 4 monitors work. I find a great post on installing the Nvidia drivers and CUDA. After doing that and rebooting, I have all 4 monitors and networking, woohoo!
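For anyone following along, the standard Fedora route is via RPM Fusion; a sketch (package names follow current RPM Fusion conventions, adjust for your release):

```shell
# Enable the RPM Fusion free and nonfree repos for the running Fedora release:
sudo dnf install \
  https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm \
  https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm

# Install the NVIDIA kernel module and CUDA support:
sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda

# The akmod builds the kernel module in the background; give it a few
# minutes before rebooting, or the card will come up on nouveau again.
```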
Now, let’s test RDP. Install FreeRDP, run with /multimon, and the screen for each remote window is shifted 1/3 of the way to the left. Strange. Do a little looking online and find an issue on GitHub about how the layout is based on the primary monitor. Long story short, I can’t use multi-monitor RDP because my monitors have different resolutions and they are stacked 2x2 instead of all in a row. Trust me, I tried every combination I could think of.
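For the record, the combinations I mean are variations on this (hostname and user are placeholders):

```shell
# All monitors:
xfreerdp /v:work-laptop.example.com /u:myuser /multimon

# List the monitor IDs FreeRDP detects:
xfreerdp /monitor-list

# Restrict the session to a subset of monitors by ID:
xfreerdp /v:work-laptop.example.com /u:myuser /multimon /monitors:0,1
```

With mixed resolutions, restricting `/monitors:` to two same-resolution heads in one row is the workaround most often suggested, but it defeats the point of a 2x2 setup.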
Someone suggested using the nightly build because they have been working on this issue. Okay, I try that out and it fails to install because of a missing dependency. Apparently, there is a pull request from December to fix this on Fedora installs, but it hasn’t been merged. So, I would need to compile that specific branch myself.
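Building FreeRDP from source is at least well-trodden; a sketch for Fedora (dependency list is approximate, and the PR branch would be substituted for the placeholder):

```shell
# Build dependencies (approximate; cmake will complain about anything missing):
sudo dnf install git cmake gcc-c++ openssl-devel libX11-devel libXext-devel \
    libXinerama-devel libXcursor-devel libXrandr-devel

git clone https://github.com/FreeRDP/FreeRDP.git
cd FreeRDP
# To build a specific unmerged PR instead of master:
#   git fetch origin pull/<PR-number>/head && git checkout FETCH_HEAD
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j"$(nproc)"
sudo cmake --install build
```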
At this point, I’m just so sick of every little thing being a huge struggle that I reboot back into Windows. I still have Fedora on there, but who would have thought something that sounds as simple as wanting to RDP across 4 monitors would be so damn difficult?
I’m not saying any of this to bag on Linux. It’s more of a discussion topic: yes, I agree that there needs to be more Linux adoption, but if someone with 20 years of IT experience gets this fed up with it, imagine how your average user would feel.
Of course, if anyone has any recommendations on getting my RDP working, I’m all ears on that too.
SSH is really not an equivalent in terms of usage. VNC is.
It is for what OP wants to use it for. VNC is best left to graphical applications. (Hint: maybe don’t use your graphics card to draw UIs when you want to use it to train LLMs. Most such tools don’t have a GUI under Linux anyway, for this reason.)
Yeah, I don’t really know what OP is trying to achieve, especially with the 4 monitors attached to the Nvidia card and the 4 incoming or outgoing RDP sessions(?)
The AI / LLM tools I use have a web-interface, so I use a browser to connect to that.
He’s got a local LLM that he either wants to train or use. Both tasks use the GPU for processing simply because it is faster for that kind of thing. Thus you don’t want it drawing UI stuff at the same time.
That sounds a bit excessive. Sure, you wouldn’t start a demanding game on the same machine, but doing desktop stuff or programming is fine. You’ll probably not even notice if a fine-tuning run takes 10 hours or 10 hours plus 2 minutes.

I think the thing hobbyists are concerned with is nothing eating into their VRAM, since it’s kind of a scarce resource. So no loading textures or running applications that lock a certain amount of VRAM for their own use. I’m not an expert on GPUs, but the desktop is there and needs to draw stuff whether you use it directly or via VNC. A frame of a single screen takes up like 6MB, so if you have triple buffering and 4 screens, that’d take up something like half a percent of a 16GB graphics card (if my math is correct).

You can always stop the desktop and use SSH and web-based interfaces. I do it because it’s convenient, not because it saves me some resources. But if those are the few megabytes that are missing for your use case… I suppose it’s also a valid reason to do so. And yes, RDP and such need to grab frames, compress them, and send them out over the network, but I think compression is handled by dedicated parts of a GPU that aren’t used by LLM inference or training anyway. I’d really be surprised if any of this made a noticeable difference.
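The back-of-the-envelope math in that comment checks out, assuming 1080p screens at 24-bit color (which is where the ~6MB per-frame figure comes from):

```python
# Rough framebuffer footprint: 4 stacked 1080p screens, triple-buffered.
bytes_per_frame = 1920 * 1080 * 3      # 24-bit color -> ~6.2 MB per frame
total = bytes_per_frame * 3 * 4        # triple buffering x 4 screens
vram = 16 * 1024**3                    # 16 GiB card
print(f"~{total / 1e6:.0f} MB total, {100 * total / vram:.2f}% of 16 GiB")
# prints: ~75 MB total, 0.43% of 16 GiB
```

At 32-bit color the per-frame figure grows to ~8.3MB, but the conclusion is the same: the framebuffers themselves are well under a percent of a 16GB card.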
Then you’ve never tried running one locally. LLMs are not your standard desktop application. They take A LOT of GPU resources. And if it runs on the GPU then it has to use VRAM. And you’d be surprised how limiting anything less than 8GB can be.
Put it this way, my 8GB 4060 will not be able to straight up generate a single 1080p image in Stable Diffusion. It runs out of VRAM. Yes, it’s a different use case because I’m generating an image but the principle applies to LLMs too.
Unless he’s got an Intel integrated chip he can offload the UI rendering to. That’s my setup.
I currently have a local LLM loaded, but a quantized smaller one, and that machine doesn’t have a GUI/desktop environment installed, since I operate it through SSH and a web interface from my laptop.
If I may ask: how much VRAM does a desktop environment actually take up if I were to use one on the same graphics card? My Intel iGPU on that laptop won’t tell me. This is probably the only constraining factor… if at all. As for the computing, even my old laptop shows like 1-3% GPU utilization with several windows and applications open. It momentarily spikes to like 10% if I start grabbing a window and moving it around like crazy, a bit more when playing YouTube. But apart from that, even the 7-year-old Intel iGPU is hardly bothered at all by drawing the desktop, a browser, and a few other things.
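On an NVIDIA card you can answer this directly rather than estimating; a quick way to check:

```shell
# Overall VRAM usage:
nvidia-smi --query-gpu=memory.used,memory.total --format=csv

# The full output also has a per-process table; the desktop compositor
# (Xorg, gnome-shell, kwin, ...) shows up there with its memory footprint:
nvidia-smi

# Intel iGPUs have no dedicated VRAM to report; intel_gpu_top (from the
# igt-gpu-tools package) shows per-engine utilization instead.
```

In practice the compositor entry tends to be a few hundred MB rather than a few MB, since it holds window textures and not just the final framebuffers.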