Running LLaMA, a ChatGPT-like large language model released by Meta, locally on an Android phone. I use antimatter15/alpaca.cpp, which is forked from ggerganov/llama.cpp.
Thanks for the suggestions. Is there any way to run it with 4 GB of RAM? Maybe with smaller 2B models instead of 7B?
Yes. Technically it’s perfectly doable.
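As a rough back-of-the-envelope check on why this is doable, the memory needed for the weights alone is approximately the parameter count times the bits per weight, divided by 8. The sketch below is only that arithmetic; the function name is made up for illustration, and it ignores the context cache, runtime overhead, and whatever the operating system itself needs. Real quantized files also tend to use somewhat more than their nominal bit width.

```python
def estimate_weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Hypothetical helper for illustration: parameters * bits / 8 gives bytes.
    Context cache and runtime overhead are NOT included.
    """
    return n_params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Compare a few model sizes at 16-bit, 8-bit and 4-bit weights.
    for label, params in [("2B", 2.0), ("7B", 7.0), ("13B", 13.0)]:
        for bits in (16, 8, 4):
            size = estimate_weight_memory_gb(params, bits)
            print(f"{label} model at {bits}-bit: about {size:.1f} GB for weights")
```

By this estimate a 7B model at 4 bits needs about 3.5 GB for weights alone, which leaves very little headroom on a 4 GB device, while a 2B model at 4 bits needs about 1 GB.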
Usefulness… Depends on what you’re trying to do with it. Maybe you should download a small model on your computer first and see what it’s like. There are small models like phi-1 for Python coding. But if you’re trying to talk to it or ask it questions, you’ll be disappointed in models below 7B. They produce grammatically correct sentences. But last time I tried, they weren’t really coherent and struggled to understand simple input. It’s more like autocomplete. (YMMV)
I currently like the models Mistral-7B-OpenOrca and MythoMax-L2-13B. I think they’re good for general-purpose applications (besides programming). If you have some specific use case in mind, there are lots of other models out there that might be better for this or that.
And I use them with KoboldCpp (or with Oobabooga’s Text generation web UI and the llama.cpp backend) on my computer. If you have a graphics card with enough VRAM, you probably want something else. But I think these are some basic tools for people without a high-end computer. Oobabooga includes everything, so you can also use it with your Nvidia GPU.
Oobabooga was very slow for me. I tried h2oGPT and it was good; I could also pass it docs for custom training, but it was still slow. Lord of the language models is the easiest to set up and has a nice interface, but it is still a bit slow. Ollama is the fastest I’ve tried so far. I couldn’t get its web-based UI to work yet, but I hope to. After that, I need a way to pass it custom docs.
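One simple way to get documents into Ollama without the web UI is to paste their text into the prompt and send it to the local HTTP API. The sketch below assumes Ollama is running at its default local address and that a model has already been pulled; the model name, the helper names, and the character limit are placeholders for illustration. Note that this is plain prompt stuffing, not training: the model only sees the documents for that one request.

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint for one-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, question: str, documents: list[str],
                  max_chars: int = 6000) -> dict:
    """Build a request body that places document text ahead of the question.

    max_chars is an arbitrary cap so the prompt stays within a small context window.
    """
    context = "\n\n".join(documents)[:max_chars]
    prompt = (
        "Answer the question using only the documents below.\n\n"
        f"Documents:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    # stream=False asks for a single JSON reply instead of a token stream.
    return {"model": model, "prompt": prompt, "stream": False}


def ask(payload: dict, url: str = OLLAMA_URL) -> str:
    """Send the payload to a running Ollama server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Example (needs a running server; "mistral" is a placeholder model name):
# payload = build_payload("mistral", "What does the manual say about setup?",
#                         ["...text of your document..."])
# print(ask(payload))
```

For more than a few pages this will not fit in the context window, so you would need to pick the relevant passages first, which is what the document features in tools like h2oGPT do for you.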