There has been an overwhelming number of new models hitting HuggingFace. I wanted to kick off a thread and ask: which open-source LLM has been your new daily driver?

Personally, I am using many Mistral/Mixtral models and a few random OpenHermes fine-tunes for flavor. I was also pleasantly surprised by some of the DeepSeek models. Those were fun to test.

I believe 2024 is the year open-source LLMs will catch up with GPT-3.5 and GPT-4. We’re already most of the way there. Curious to hear what new contenders are on the block and how others feel about their performance/precision compared to state-of-the-art closed-source models.

  • xodoh74984@lemmy.world
    11 months ago

    This one is only 7B parameters, but it punches far above its weight for such a little model:
    https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha

    My personal setup is capable of running larger models, but for everyday use like summarization and brainstorming, I find myself coming back to Starling the most. Since it’s so small, inference runs blazing fast on my hardware. I don’t rely on it for writing code, though. DeepSeek-Coder-33B is my pick for that.

    Others have said Starling’s overall performance rivals LLaMA 70B. YMMV.

    • Blaed@lemmy.world (OP, mod)
      11 months ago

      How many tokens per second are you seeing with your hardware? Mind sharing some notes on what you’re running there? Super curious!