By the time hardware catches up to be able to run this thing, the model will be outdated! :P
Looking back, that’s literally what happened to those OG models https://github.com/KoboldAI/KoboldAI-Client/wiki/Models. Those were the only alternatives to AI Dungeon and they sucked. Two years later we have free, open-source models that are smaller, run faster, and are way more capable.
It always amazes me how tech can become objectively better, and smaller, in such a short march of time.
I always think stuff like this is super interesting, partly because I have no idea how development of this sort of thing works. It always brings a lot of (probably stupid) questions to my head.
Looks like they recommend combining 2 or 3 GPUs with 80GB or 48GB each to run the model. Something like one of these server/professional GPUs https://www.tomshardware.com/news/nvidia-rtx-a6000-48gb-benchmarked
String a few dozen of those together and that’s sorta how crypto mining rigs operate.
They cost a ton, but when you need dedotated wam, you need it bad. Plus they’re not for general use; they rarely even have display ports on them (sad!)
Holy fuck, 150GB VRAM. It’s interesting that an LLM requires so much more VRAM than AI art models.
Holy mother of that’s a lotta vram
We don’t even have consumer GPUs with over 100GB of VRAM, let alone affordable ones above 8GB.
The model recommends 2 x 80GB or 3 x 48GB GPUs
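For anyone wondering where numbers like 150GB come from: weight memory scales roughly as parameter count times bytes per parameter. A minimal back-of-the-envelope sketch (the 75B figure below is just an illustration, not this model's actual size):

```python
# Rough VRAM estimate for just loading an LLM's weights.
# Assumes fp16 (2 bytes per parameter); real usage adds
# activations, KV cache, and framework overhead on top.
def weight_vram_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    return params_billion * bytes_per_param  # billions of params * bytes each = GB

# A hypothetical 75B-parameter model in fp16:
print(weight_vram_gb(75))  # 150.0 GB of weights alone
# That's why it takes 2 x 80GB or 3 x 48GB cards stacked together.
```

Halving the precision (int8, 1 byte per parameter) roughly halves the footprint, which is why quantization is the usual trick for squeezing these models onto fewer cards.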