• No_Ones_Slick_Like_Gaston@lemmy.world · 3 months ago

    There’s a lot of explaining to do for Meta, OpenAI, Anthropic (Claude), and Google (Gemini) to justify the premium pricing of their models now that there’s a literal open-source model that can do the basics.

    • suoko@feddit.it (OP) · 3 months ago

      I’m testing VS Code + Continue + Ollama + Qwen2.5-Coder right now. Even with a modest GPU it’s already OK. A rough sketch of the setup is below.
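      For anyone who wants to try the same stack, something like this should work (the model tag and the Continue config path/schema are assumptions on my part; newer Continue releases use a YAML config instead, so check the docs for your version):

      ```sh
      # Pull a local coding model (the 7B variant fits on a single consumer GPU)
      ollama pull qwen2.5-coder:7b

      # Start the Ollama server (listens on localhost:11434 by default)
      ollama serve

      # Point the Continue VS Code extension at the local model.
      # NOTE: path and schema assumed from older Continue versions;
      # recent releases read ~/.continue/config.yaml instead.
      cat > ~/.continue/config.json <<'EOF'
      {
        "models": [
          {
            "title": "Qwen2.5-Coder (local)",
            "provider": "ollama",
            "model": "qwen2.5-coder:7b"
          }
        ]
      }
      EOF
      ```

      Then pick the model from Continue’s model dropdown in VS Code and everything runs locally.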

    • suoko@feddit.it (OP) · 3 months ago

      You still need expensive hardware to run the full model, though. Unless the myceliumwebserver project gets going.

    • normalexit@lemmy.world · 3 months ago

      The cost is a function of running an LLM at scale. You can run small models on consumer hardware, but the real contenders are using massive amounts of memory and compute on GPU arrays (plus electricity and water for cooling).

      OpenAI is reportedly losing money even on ChatGPT’s $200/mo Pro subscription plan.

    • howrar@lemmy.ca · 3 months ago

      The same could be said of Meta when they “open sourced” their models. Someone still has to do the training, or these models wouldn’t exist in the first place.