Looking for an “AI” that would tackle my day-to-day issues that are not related to programming, for example acting as a personal assistant / life coach, creating lesson plans for classes I teach at school, explaining how things work and teaching me new skills effectively, etc.

I need it to be able to consider web search options for more comprehensive answers.

Doesn’t have to be free, as I’d be happy to pay if it’s truly worth it.

So far I’ve tried:

  1. Most common options at Poe, including Claude Sonnet 3.5, GPT4o and others. The issue here is that I’m not seeing which one is actually smarter and which one hallucinates more.
  2. Perplexity
  3. Phind
  4. Gemini
  5. Bing AI

I have never had a GPT4 subscription so I might consider that if it’s objectively the best option.

What can you recommend? 🙂

  • AIhasUse@lemmy.world
    link
    fedilink
    arrow-up
    19
    ·
    6 months ago

    For programming it is Sonnet 3.5, there is no remotely close 2nd place that I have tried or heard of, and I am always looking. I personally don’t really have any interest in measuring them in other ways. But for coding, Sonnet 3.5 is in a distant lead. Abacus.ai is a nice way to try various models for cheap. Really, some sort of agent setup like mixture of agents that uses Claude and got and maybe some others may do better than Claude alone. Matthew Berman shows Mixture of Agents with local models beating gpt4o, so doing it with sonnet3.5 and others of the best closed models would probably be pretty great.

    • Repple (she/her)@lemmy.world
      link
      fedilink
      arrow-up
      5
      ·
      6 months ago

      What language are you programming in? In swift I have found all models (including sonnet) next to useless. Tells me something wrong almost every question i ask, has made up macros and apis, etc.

      For English I have found Claude models slightly better than the GPT 4 subscription I used to have. For anything in multiple (human, not programming) languages, gpt has seemed best for me.

      • AIhasUse@lemmy.world
        link
        fedilink
        arrow-up
        5
        ·
        6 months ago

        I was mainly doing python with gpt4, but now im working on an android project, so kotlin. Gpt4 wasn’t much use for kotlin, especially for questions involving more than a couple files. Sonnet is crushing it though, even when I give it 2k+ LoC. I’d say I’ve done about 2 months of pre-llm work in the last week, granted I am no professional, just a hobbyist.

        • RatherBeMTB@sh.itjust.works
          link
          fedilink
          arrow-up
          2
          ·
          6 months ago

          I’m learning Kotlin and Android Studio and for that I’m developing a very simple CRUD App. I used sonet 3.5 and was impressed when it developed the XML file, mainactivity, added internet access permits and wrote the restful API in PHP for XAMPP. It compiled at the first try, but for the life of me I can’t find why the restful API keeps returning a 405 error. And I’m a seasoned programmer in C, C++, phyton and XAMPP! It was, at the same time, impressive and extremely frustrating.

          • AIhasUse@lemmy.world
            link
            fedilink
            arrow-up
            1
            ·
            6 months ago

            I don’t know about your specific issue, but I have found that it helps quite a bit to often start new conversations. Also, I have a couple of paragraphs explaining the whole idea of my project that I always paste in at the beginning of each conversation. I’ve not been doing anything terribly complicated or cutting-edge, but I haven’t come across anything yet that Sonnet hasn’t been able to figure out, although sometimes it does take me being very clear and wordy about what I’m doing and starting from a fresh slate. I’ve also found it helps a lot if I specifically tell it to debug with lots of logs. Then I just go back and forth, giving it the outputs and changing code for it.

  • solberg@lemmy.blahaj.zone
    link
    fedilink
    English
    arrow-up
    9
    ·
    6 months ago

    I’ve been using Sonnet 3.5 a lot recently. Does seem like it’s better and more creative than others for a lot of tasks. I also think it’s training set is up to April 2024 which is nice.

    I’ve also found that GPT-4o is worse than GPT-4 in my experience. Seems to hallucinate more

  • mozz@mbin.grits.dev
    link
    fedilink
    arrow-up
    5
    ·
    6 months ago

    GPT-4 is apparently the model to beat. I haven’t seen all that much difference in practice between GPT-4 and 4o. I’ve heard various claims about various other models outperforming it (notably including Claude) but I haven’t seen the claims materialize over the long haul as yet.

    I have however heard that Mistral can get quite close to GPT-4, run for free locally with the right hardware, if you build up a hand curated set of around 100 query/response pairs from GPT-4 that are what you want it to do, and then fine-tune Mistral against that training set. I haven’t tried it but that’s what I’ve heard.

    • SurpriZe@lemm.eeOP
      link
      fedilink
      arrow-up
      4
      ·
      6 months ago

      I’m a total layman when it comes to setting up a language model locally. Any step by step guide on how to do it? And I mostly use AIs on my Android phone, not PC. Is it possible to synchronize it between two devices?

      • mozz@mbin.grits.dev
        link
        fedilink
        arrow-up
        3
        ·
        6 months ago

        GPT4all can do it pretty easily on a desktop with a good GPU. I think it’s unlikely that anything can run locally on your phone (LLMs are notably hogs in terms of even pretty capable desktop PC resources; there’s just not a cheap way to do them). You could use colab or something via your phone, and there is probably a little howto guide somewhere that shows how to do a Mistral setup on colab. It’ll take some technical skill though.

        You might just bite the bullet and do $20/mo for the GPT-4 subscription also. It can also do web searches, I think, although in practice it’s pretty clunky the times it’s tried to do things like that for me. I’m not aware of one that does the “search the web for answers and get back to me” thing really all that perfectly or smoothly I’m sad to say.

        • Alex@lemmy.ml
          link
          fedilink
          arrow-up
          1
          ·
          6 months ago

          Why do the $20 subscription when the API pricing is much cheaper, especially if you are trying different models out. I’m currently playing about with Gemini and that’s free (albeit rate limited).

    • SurpriZe@lemm.eeOP
      link
      fedilink
      arrow-up
      2
      ·
      6 months ago

      And also, any recommendations on a specific GPT4 addon or is the base model pretty much perfect as is?

  • Bluefruit@lemmy.world
    link
    fedilink
    arrow-up
    4
    ·
    6 months ago

    Most models that I’ve played with are only about as good as what you put into it. If you ask it the right questions in the right way, you can get pretty good results.

    GPT3.5 has worked well for me. I’ve also run AI on my pc locally using Ollama and lots of different models. Most do well with simple questions or requests.

    Llama 3 instruct is what I’ve liked the most so far.

    • bpalmerau@aussie.zone
      link
      fedilink
      arrow-up
      4
      ·
      6 months ago

      Hence the job title ‘prompt engineer’ I guess. If you know about Soylent Green, AI is people!

      • Bluefruit@lemmy.world
        link
        fedilink
        arrow-up
        1
        ·
        6 months ago

        Lol prompts are important for sure. Me and my boss often talk about what you can do with chatgpt when we use it at work amd what kind of prompts we use.

  • Umbrias@beehaw.org
    link
    fedilink
    arrow-up
    7
    arrow-down
    3
    ·
    6 months ago

    No AI are to this level, are a massive security risk, and none are “smart”.

    pay if it’s worth it

    It isn’t.

  • yboutros@infosec.pub
    link
    fedilink
    English
    arrow-up
    5
    arrow-down
    2
    ·
    6 months ago

    Ollama (+ web-ui but ollama serve & && ollama run is all you need) then compare and contrast the various models

    I’ve had luck with Mistral for example

    • 1rre
      link
      fedilink
      arrow-up
      1
      ·
      6 months ago

      Ollama is great as a hobby, for running fine-tuned models, and if you want to be actually told you’re wrong/something’s not possible, or get output that a commercial LLM deems unacceptable, but that’s reserved for only very few illegal/nsfw (incl both violence/gore and sex) scenarios, and frankly not even all of them with a bit of engineering.

      For 99.99% of use cases, GPT4o is literally thousands of times more knowledgeable and thousands of times less likely to hallucinate than your average 7-10b parameter model you’d be able to run locally on even a 16GB GPU

  • Cwilliams@beehaw.org
    link
    fedilink
    arrow-up
    1
    ·
    6 months ago

    Not sure about paid models, but Claude Sonnet 3.5 is so good it’s not even funny. I’ve had arguments with it, where it was right in the end, and it never even considered that I was right (because I wasn’t; I ended up looking it up afterwards). I’ve never seen that with any other model

  • Sbuiko@lemmy.world
    link
    fedilink
    arrow-up
    3
    arrow-down
    8
    ·
    6 months ago

    Humans. For the best experience, get some third world contractor. Costs more tho.

    • pavnilschanda@lemmy.world
      link
      fedilink
      arrow-up
      11
      arrow-down
      1
      ·
      6 months ago

      Reducing people from third world countries to “language models” as an attempt to critique AI aint it