• m4xie@lemmy.blahaj.zone
    link
    fedilink
    English
    arrow-up
    2
    ·
    3 hours ago

    He says they’re faking the low cost, but it’s open source. You can download and run it yourself.

  • merthyr1831@lemmy.ml
    link
    fedilink
    English
    arrow-up
    26
    ·
    12 hours ago

    free market capitalist when a new competitor enters the market who happens to be foreign: noooooo this is economic warfare!!!

  • Hackerman_uwu@lemmy.world
    link
    fedilink
    English
    arrow-up
    21
    arrow-down
    1
    ·
    edit-2
    12 hours ago

    We literally are at the stage where when someone says: “this is a psyop” then that is the psyop. When someone says: “these drag queens are groomers” they are the groomers. When someone says: “the establishment wants to keep you stupid and poor” they are the establishment who want to keep you stupid and poor.

    • Krudler@lemmy.world
      link
      fedilink
      English
      arrow-up
      5
      ·
      12 hours ago

      It’s so important to realize that most of “the establishment” are the pawns who are just as guilty. Thank you.

  • LandedGentry@lemmy.zip
    link
    fedilink
    English
    arrow-up
    17
    ·
    edit-2
    17 hours ago

    So this guy is just going to pretend that all of these AI startups in thee US offering tokens at a fraction of what they should be in order to break-even (let alone make a profit) are not doing the exact same thing?

    Every prompt everyone makes is subsidized by investors’ money. These companies do not make sense, they are speculative and everyone is hoping to get their own respective unicorn and cash out before the bill comes due.

    My company grabbed 7200 tokens (min of footage) on Opus for like $400. Even if 90% of what it turns out for us is useless it’s still a steal. There is no way they are making money on this. It’s not sustainable. Either they need to lower the cost to generate their slop (which deep think could help guide!) or they need to charge 10x what they do. They’re doing the user acquisition strategy of social media and it’s absurd.

    • FourPacketsOfPeanuts@lemmy.world
      link
      fedilink
      English
      arrow-up
      1
      ·
      8 hours ago

      So this guy is just going to pretend that all of these AI startups in thee US offering tokens at a fraction of what they should be in order to break-even (let alone make a profit) are not doing the exact same thing?

      fake it til you make it is a patriotic duty!

  • Flying Squid@lemmy.world
    link
    fedilink
    English
    arrow-up
    52
    arrow-down
    7
    ·
    22 hours ago

    Why is everyone making this about a U.S. vs. China thing and not an LLMs suck and we should not be in favor of them anywhere thing?

    • daniskarma@lemmy.dbzer0.com
      link
      fedilink
      English
      arrow-up
      28
      arrow-down
      5
      ·
      21 hours ago

      We just don’t follow the dogma “AI bad”.

      I use LLM regularly as a coding aid. And it works fine. Yesterday I had to put a math formula on code. My math knowledge is somehow rusty. So I just pasted the formula on the LLM, asked for an explanation and an example on how to put it in code. It worked perfectly, it was just right. I understood the formula and could proceed with the code.

      The whole process took seconds. If I had to go down the rabbit hole of searching until I figured out the math formula by myself it could have maybe a couple of hours.

      It’s just a tool. Properly used it’s useful.

      And don’t try to bit me with the AI bad for environment. Because I stopped traveling abroad by plane more than a decade ago to reduce my carbon emissions. If regular people want to reduce their carbon footprint the first step is giving up vacations on far away places. I have run LLMs locally and the energy consumption is similar to gaming, so there’s not a case to be made there, imho.

      • Tartas1995
        link
        fedilink
        English
        arrow-up
        17
        arrow-down
        1
        ·
        21 hours ago

        “ai bad” is obviously stupid.

        Current LLM bad is very true. The method used to create is immoral, and are arguably illegal. In fact, some of the ai companies push to make what they did clearly illegal. How convenient…

        And I hope you understand that using the LLM locally consuming the same amount as gaming is completely missing the point, right? The training and the required on-going training is what makes it so wasteful. That is like saying eating bananas in the winter in Sweden is not generating that much CO2 because the distance to the supermarket is not that far.

        • daniskarma@lemmy.dbzer0.com
          link
          fedilink
          English
          arrow-up
          10
          arrow-down
          1
          ·
          edit-2
          21 hours ago

          I don’t believe in Intelectual Property. I’m actually very against it.

          But if you believe in it for some reason there are models exclusively trained with open data. Spanish government recently released a model called ALIA, it was 100% done with open data, none of the data used for it was proprietary.

          Training energy consumption is not a problem because it’s made so sparsely. It’s like complaining about animation movies because rendering takes months using a lot of power. It’s an irrational argument. I don’t buy it.

          • Tartas1995
            link
            fedilink
            English
            arrow-up
            6
            arrow-down
            1
            ·
            edit-2
            20 hours ago

            I am not necessarily got intellectual property but as long as they want to have IPs on their shit, they should respect everyone else’s. That is what is immoral.

            How is it made sparsely? The training time for e.g. chatgtp 4 was 4 months. Chatgtp 3.5 was released in November 2023, chatgtp 4 was released in March 2024. How many months are between that? Oh look at that… They train their ai 24/7. For chatgtp 4 training, they consumed 7200MWh. The average American household consumes a little less than 11000kWh per year. They consumed in 1/3 of the time, 654 times the energy of the average American household. So in a year, they consume around 2000 times the electricity of an average American household. That is just training. And that is just electricity. We don’t even talk about the water. We are also ignoring that they are scaling up. So if they would which they didn’t, use the same resources to train their next models.

            Edit: sidenote, in 2024, chatgtp was projected to use 226.8 GWh.

            • daniskarma@lemmy.dbzer0.com
              link
              fedilink
              English
              arrow-up
              5
              arrow-down
              1
              ·
              edit-2
              20 hours ago

              2000 times, given your approximations as correct, the usage of a household for something that’s used by millions, or potentially billions, of people it’s not bad at all.

              Probably comparable with 3d movies or many other industrial computer uses, like search indexers.

              • Tartas1995
                link
                fedilink
                English
                arrow-up
                4
                ·
                edit-2
                20 hours ago

                Yeah, but then they start “gaming”…

                I just edited my comment, just no wonder you missed it.

                In 2024, chatgtp was projected to use 226.8 GWh. You see, if people are “gaming” 24/7, it is quite wasteful.

                Edit: just in case, it isn’t obvious. The hardware needs to be produced. The data collected. And they are scaling up. So my point was that even if you do locally sometimes a little bit of LLM, there is more energy consumed then just the energy used for that 1 prompt.

              • Zos_Kia@lemmynsfw.com
                link
                fedilink
                English
                arrow-up
                1
                ·
                18 hours ago

                Yeah it’s ridiculous. GPT-4 serves billions of tokens every day so if you take that into account the cost per token is very very low.

      • MothmanDelorian@lemmy.world
        link
        fedilink
        English
        arrow-up
        8
        arrow-down
        3
        ·
        18 hours ago

        IRL the first step to cutting emissions is what you’re eating. Meat and animal products come with huge environmental costs and reducing how much animal products you consume can cut your footprint substantially.

        • daniskarma@lemmy.dbzer0.com
          link
          fedilink
          English
          arrow-up
          1
          ·
          17 hours ago

          There’s some argument to be made there.

          It depend where you live. If you live where I live a fully plant diet is mor environmentally damaging that omnivore diet. Because I would need to consume lots of plants that come from tropical environments to have a full diet, which means one of two things, import from far away or intensive irrigation in a dry environment.

          While here farm animals can and are feed with local plants that do no need intensive irrigation.

          Someday I shall make full calculations on this. But I’m not sure which option would give best carbon footprint. But I’m not that sure about full plant diet here.

          • MothmanDelorian@lemmy.world
            link
            fedilink
            English
            arrow-up
            3
            arrow-down
            1
            ·
            15 hours ago

            The catch is there’s nowhere on earth where a plant diet has a higher carbon footprint unless you go out of your way to pursue foods from foreign sources that are resource intensive.

            Realistically it will always take more to grow a chicken or a fish than grow a plant.

            • daniskarma@lemmy.dbzer0.com
              link
              fedilink
              English
              arrow-up
              1
              arrow-down
              1
              ·
              edit-2
              15 hours ago

              Try living on lucerne. Then, come again.

              Realistic, as in real life, my grandparents had chickens “for free”, as the residues from other plants that cannot be eaten by humans were the food of the chickens. So realistically trying to substitute the nutrients of those free chickens with plant based solutions would be a lot more expensive in all ways.

              • MothmanDelorian@lemmy.world
                link
                fedilink
                English
                arrow-up
                2
                arrow-down
                1
                ·
                15 hours ago

                Still true no matter where you live because the carbon costs of raising animals is higher than plants.

                • daniskarma@lemmy.dbzer0.com
                  link
                  fedilink
                  English
                  arrow-up
                  1
                  arrow-down
                  1
                  ·
                  edit-2
                  15 hours ago

                  You didn’t even read my statement.

                  If your answer is going to be again some variation of the dogma: “Still true no matter where you live because the carbon costs of raising animals is higher than plants.” without considering that some plants used to feed animals are incredibly cheap to produce(and that humans cannot live on those planta), and that some animals live on human waste without even needing to plant food for them. Then don’t even bother to reply.

          • Tiger@sh.itjust.works
            link
            fedilink
            English
            arrow-up
            2
            ·
            16 hours ago

            Hmm, even developing countries with local livestock and organic feed for them it’s still a lot better for the environment to be vegetarian or vegan, by far. It’s always more efficient to be more plant-based, rather than growing plants for animals to eat and then eating those animals.

            • daniskarma@lemmy.dbzer0.com
              link
              fedilink
              English
              arrow-up
              1
              ·
              edit-2
              16 hours ago

              I really need to do the calculations here.

              Because growing plants for animals do not have, by far, the same cost that growing plants for humans.

              My grandparents grew lucerne for livestock. And it really doesn’t take much to grow. While crops for humans tend to take mucho more water and energy.

              And for some animals, like chickens, you can just use residues from other crops.

              I don’t think it’s that straightforward.

              My grandparents used to live in an old village, with their farm, and that wasn’t a very contaminating lifestyle. But if they would want to became began they would have needed to import goods from across the globe to have a healthy diet.

      • explodicle@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        3
        ·
        17 hours ago

        And don’t try to bit me with the AI bad for environment. Because I stopped traveling abroad by plane more than a decade ago to reduce my carbon emissions.

        It’s absurd that you even need to make this argument. The “carbon footprint” fallacy was created by big oil so we’ll blame each other instead of pursuing pigouvian pollution taxes that would actually work.

        • daniskarma@lemmy.dbzer0.com
          link
          fedilink
          English
          arrow-up
          2
          arrow-down
          2
          ·
          17 hours ago

          I don’t really think so.

          Humans pollute. Evading individual responsibility in what we do it’s irresponsible.

          If you decide you want to “find yourself” travelling from US to India by plane. Not amount of taxes is going to fix the amount of CO2 emited by that plane.

          • explodicle@sh.itjust.works
            link
            fedilink
            English
            arrow-up
            1
            ·
            14 hours ago

            (Sorry to be so verbose…)

            For what it’s worth, I worked on geared turbofans in the jet engine industry. They’re more fuel efficient… but also more complicated, so most airlines opt for the simpler (more reliable) designs that use more fuel. This is similar to the problem with leaded fuel, which is still used in a handful of aircraft.

            Airplanes could be much greener, there were once economies of scale to ship travel, and relying on altruism at scale just doesn’t work at all anyways. Pigouvian taxes have a track record of success. So especially in the short term, the selfish person who decides to “find himself” would look at a high price of flying (which now includes external costs) and decide to not fly at all.

            Relying on altruism (and possibly social pressure) isn’t working, and that was always what big oil intended. Even homeless people are polluting above sustainable levels. We’re giving each other purity tests instead of using very settled economics.

      • JackbyDev@programming.dev
        link
        fedilink
        English
        arrow-up
        1
        ·
        15 hours ago

        “AI bad”

        One thing that’s frustrating to me is that everything is getting called AI now, even things that we used to call different things. And I’m not making some “um actually it isn’t real AI” argument. When people just believe “AI bad” then it’s just so much stuff.

        Here’s an example. Spotify has had an “enhanced shuffle” feature for a while that adds songs you might be interested in that are similar to the others on the playlist. Somebody said they don’t use it because it’s AI. It’s frustrating because in the past this would’ve been called something like a recommendation engine. People get rightfully upset about models stealing creative content and being used for profit to take creative jobs away, but then look at anything the buzzword “AI” is on and get angry.

      • Flying Squid@lemmy.world
        link
        fedilink
        English
        arrow-up
        4
        arrow-down
        3
        ·
        21 hours ago

        What are you doing to reduce your fresh water usage? You do know how much fresh water they waste, right?

        • jj4211@lemmy.world
          link
          fedilink
          English
          arrow-up
          5
          arrow-down
          1
          ·
          edit-2
          17 hours ago

          The main issue is that the business folks are pushing it to be used way more than demand, as they see dollar signs if they can pull off a grift. If this or anything else pops the bubble, then the excessive footprint will subside, even as the technology persists at a more reasonable level.

          For example, according to some report I saw OpenAI spent over a billion on ultimately failed attempts to train GPT5 that had to be scrapped. Essentially trying to brute force their way to better results when we have may have hit the limits of their approach. Investors tossed more billions their way to keep trying, but if it pops, that money is not available and they can’t waste resources on this.

          Similarly, with the pressure off Google might stop throwing every search at AI. For every person asking for help translating a formula to code, there’s hundreds of people accidentally running a model due to Google search.

          So the folks for whom it’s sincerely useful might get their benefit with a more reasonable impact as the overuse subsides.

        • daniskarma@lemmy.dbzer0.com
          link
          fedilink
          English
          arrow-up
          5
          arrow-down
          1
          ·
          edit-2
          21 hours ago

          Do you? Also do you what are the actual issues on fresh water? Do you actually think cooling of some data center it’s actually relevant? Because I really, data on hand, think it’s not. It’s just part of the dogma.

          Stop trying to eat vegetables that need watering out of areas without a lot of rain, much better approach if you care about that. Eat what people on your area ate a few centuries ago if you want to be water sustainable.

            • daniskarma@lemmy.dbzer0.com
              link
              fedilink
              English
              arrow-up
              7
              arrow-down
              3
              ·
              edit-2
              21 hours ago

              That’s nothing compared with intensive irrigation.

              Having a diet proper to your region has a massively bigger impact on water than some cooling.

              Also not every place on earth have fresh water issues. Some places have it some are pretty ok. Not using water in a place where it’s plenty does nothing for people in a place where there is scarcity of fresh water.

              I shall know as my country is pretty dry. Supercomputers, as the one used for our national AI, had had not visible impact on water supply.

              • Flying Squid@lemmy.world
                link
                fedilink
                English
                arrow-up
                4
                arrow-down
                4
                ·
                21 hours ago

                You read all three of those links in four minutes?

                Also, irrigation creates food, which people need to survive, while AI creates nothing that people need to survive, so that’s a terrible comparison.

                • daniskarma@lemmy.dbzer0.com
                  link
                  fedilink
                  English
                  arrow-up
                  6
                  arrow-down
                  5
                  ·
                  21 hours ago

                  I’m already familiarized on industrial and computer usage of water. As I said, very little impact.

                  Not all food is needed to survive. Any vegan would probably give a better argument on this than me. But choice of food it’s important. And choosing one food over another it’s not a matter of survival but a matter of joy, a tertiary necessity.

                  Not to sound as a boomer, but if this is such a big worry for you better action may be stop eating avocados in a place where avocados don’t naturally grow.

                  As I said, I live in a pretty dry place, where water cuts because of scarcity are common. Our very few super computers have not an impact on it. And supercomputers on china certainly are 100% irrelevant to our water scarcity issue.

      • dilroopgill@lemmy.world
        link
        fedilink
        English
        arrow-up
        2
        arrow-down
        2
        ·
        21 hours ago

        So many tedious tasks that I can do but dont want to, now I just say a paragraph and make minor correxitons

      • dilroopgill@lemmy.world
        link
        fedilink
        English
        arrow-up
        2
        arrow-down
        2
        ·
        21 hours ago

        Same im not going back to not using it, im not good at this stuff but ai can fill in so many blanks, when installing stuff with github it can read instructions and follow them guiding me through the steps for more complex stuff, helping me launch and do stuff I woild never have thought of. Its opened me up to a lot of hobbies that id find too hard otherwise.

    • jj4211@lemmy.world
      link
      fedilink
      English
      arrow-up
      14
      ·
      20 hours ago

      Well LLMs don’t necessarily always suck, but they do suck compared to how much key parties are trying to shove then down our throats. If this pops the bubble by making it too cheap to be worth grifting over, then maybe a lot of the worst players and investors back off and no one cares if you use an LLM or not and they settle in to be used only to the extent people actually want to. We also move past people claiming the are way better than they are, or that they are always just on the cusp of something bigger, if the grifters lose motivation.

      • Echo Dot@feddit.uk
        link
        fedilink
        English
        arrow-up
        3
        arrow-down
        1
        ·
        18 hours ago

        It will be nice if we could stop having headlines of “AGI by April”.

        • jj4211@lemmy.world
          link
          fedilink
          English
          arrow-up
          3
          ·
          17 hours ago

          Last I saw the promise was 'AGI real soon, but not before 2027", threading the needle between “we are going to have an advancement that will change the fundamentals of how the economy even works” and “but there’s still time to get in and get the benefits of the current economy on our way to that breakthrough”

    • Teddy Police@feddit.org
      link
      fedilink
      English
      arrow-up
      7
      ·
      22 hours ago

      Because they need to protect their investment bubble. If that bursts before Deepseek is banned, a few people are going to lose a lot of money, and they sure as heck aren’t gonna pay for it themselves.

    • Warl0k3@lemmy.world
      link
      fedilink
      English
      arrow-up
      6
      ·
      22 hours ago

      Fucking exactly. Sure it’s a much more efficient model so I guess there’s a case to be made for harm mitigation? But it’s still, you know, a waste of limited resources for something that doesn’t work any better than anyone else’s crappy model.

  • Echo Dot@feddit.uk
    link
    fedilink
    English
    arrow-up
    35
    arrow-down
    2
    ·
    21 hours ago

    I don’t understand why everyone’s freaking out about this.

    Saying you can train an AI for “only” 8 million. It is a bit like saying that it’s cheaper to have a bunch of university professors do something than to teach a student how to do it. Yeah and that is true, as long as you forget about the expense of training the professors in the first place.

    It’s a distilled model, so where are you getting the original data from if not for the other LLMs?

    • Agent641@lemmy.world
      link
      fedilink
      English
      arrow-up
      9
      ·
      edit-2
      17 hours ago

      If you can make a fast, low power, cheap hardware AI, you can make terrifying tiny drone weapons that autonomously and networklessly seek out specific people by facial recognition or generally target groups of people based on appearance or presence of a token, like a flag on a shoulder patch, and kill them.

      Unshackling AI from the data centre is incredibly powerful and dangerous.

    • dilroopgill@lemmy.world
      link
      fedilink
      English
      arrow-up
      19
      ·
      20 hours ago

      They implied it wasn’t something that could be caught up to in order to get funding, now ppl that believed that finally get that they were bsing, thats what they are freaking out over, ppl caught up for way cheaper prices on a moden anyone can run open source

      • Echo Dot@feddit.uk
        link
        fedilink
        English
        arrow-up
        4
        ·
        20 hours ago

        Right but my understanding is you still need Open AIs models in order to have something to distill from. So presumably you still need 500 trillion GPUs and 75% of the world’s power generating capacity.

        • InputZero@lemmy.world
          link
          fedilink
          English
          arrow-up
          24
          ·
          18 hours ago

          The message that OpenAI, Nvidia, and others which bet big on AI delivered was that no one else could run AI because only they had the resources to do that. They claimed to have a physical monopoly, and no one else would be able to compete. Enter Deepseek doing exactly what OpenAI and Nvidia said was impossible. Suddenly there is competition and that scared investors because their investments into AI are not guaranteed wins anymore. It doesn’t matter that it’s derivative, it’s competition.

          • Echo Dot@feddit.uk
            link
            fedilink
            English
            arrow-up
            3
            ·
            edit-2
            18 hours ago

            Yes I know but what I’m saying is they’re just repackaging something that openAI did, but you still need openAI making advances if you want R1 to ever get any brighter.

            They aren’t training on large data sets themselves, they are training on the output of AIs that are trained on large data sets.

            • InputZero@lemmy.world
              link
              fedilink
              English
              arrow-up
              1
              ·
              16 hours ago

              Oh I totally agree, I probably could have made my comment less argumentative. It’s not truly revolutionary until someone can produce an AI training method that doesn’t consume the energy of a small nation to get results in a reasonable amount of time. Which isn’t even mentioning the fact that these large data sets already include everything and that’s not enough. I’m glad that there’s a competitive project even if I’m going to wait a while and let smarter people than me sus it out.

    • Kanda@reddthat.com
      link
      fedilink
      English
      arrow-up
      5
      ·
      17 hours ago

      The other LLMs also stole their data, so it’s just a last laugh kinda thing

  • Kusimulkku@lemm.ee
    link
    fedilink
    English
    arrow-up
    3
    ·
    15 hours ago

    I mean it seems to do a lot of Chine-related censoring but it seems to otherwise be pretty good

    • chiliedogg@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      ·
      13 hours ago

      I think the big question is how the model was trained. There’s thought (though unproven afaik), that they may have gotten ahold of some of the backend training data from OpenAI and/or others. If so, they kinda cheated their way to their efficiency claims that are wrecking the market. But evidence is needed.

      Imagine you’re writing a dictionary of all words in the English language. If you’re starting from scratch, the first and most-difficult step is finding all the words you need to define. You basically have to read everything ever written to look for more words, and 99.999% of what you’ll actually be doing is finding the same words over and over and over, but you still have to look at everything. It’s extremely inefficient.

      What some people suspect is happening here is the AI equivalent of taking that dictionary that was just written, grabbing all the words, and changing the details of the language in the definitions. There may not be anything inherently wrong with that, but its “efficiency” comes from copying someone else’s work.

      Once again, that may be fine for use as a product, but saying it’s a more efficient AI model is not entirely accurate. It’s like paraphrasing a few articles based on research from the LHC and claiming that makes you a more efficient science contributor than CERN since you didn’t have to build a supercollider to do your work.

      • Jyek@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        1
        ·
        10 hours ago

        So here’s my take on the whole stolen training data thing. If that is true, then open AI should have literally zero issues building a new model off of the full output of the old model. Just like deepseek did. But even better because they run it in house. If this is such a crisis, then they should do it themselves just like China did. In theory, and I don’t personally think this makes a ton of sense, if training an LLM on the output of another LLM results in a more power efficient and lower hardware requirement, and overall better LLM, then why aren’t they doing that with their own LLMs to begin with?.

      • GuitarSon2024@lemmy.world
        link
        fedilink
        English
        arrow-up
        2
        arrow-down
        1
        ·
        12 hours ago

        China copying western tech is nothing new. That’s literally how the elbowed their way up to the top as a world power. They copied everyones homework where they could and said, whatcha going to do about it?

        • chiliedogg@lemmy.world
          link
          fedilink
          English
          arrow-up
          2
          ·
          edit-2
          11 hours ago

          Which is fine in many ways, and if they can improve on technical in the process I don’t really care that much.

          But what matters in this case is that actual advancement in AI may require a whole lot of compute, or may not. If DeepSeek is legit, it’s a huge deal. But if they copied OpenAI’s homework, we should at least know about it so we don’t abandon investment in the future of AI.

          All of that is a separate conversation on whether or not AI itself is something we should care about or prioritize.

    • FolknForage@lemm.ee
      link
      fedilink
      English
      arrow-up
      2
      arrow-down
      1
      ·
      13 hours ago

      If they are admittedly censoring, how can you tell what is censored and what’s not?

      • sznowicki@lemmy.world
        link
        fedilink
        English
        arrow-up
        3
        ·
        12 hours ago

        If you use the model it literally tells where it will not tell something to the user. Same as guardrails on any other LLM model on the market. Just different topics are censored.

        • FolknForage@lemm.ee
          link
          fedilink
          English
          arrow-up
          1
          ·
          edit-2
          12 hours ago

          So we are relying on the censor to tells us what they don’t censor?

          AFAIK, and I am open to being corrected, the American models seem to mostly negate requests regarding current political discussions (I am not sure if this is still true even), but I don’t think they taboo other topics (besides violence, drug/explosives manufacturing, and harmful sexual conducts).

          • sznowicki@lemmy.world
            link
            fedilink
            English
            arrow-up
            1
            ·
            9 hours ago

            I don’t think they taboo some topics but I’m sure the model has a bias specific to what people say in the internet. Which might not be correct according to people who challenge some views on historical facts.

            Of course Chinese censorship is super obvious and made by design. American is rather a side effect of some cultural facts or beliefs.

            What I wanted to say that all models are shit when it comes to fact checking or seeking truth. They are good for generating words that look like truth and in most cases are representing the overall consensus in that cultural area.

            I asked about Tiananmen events the smallest deepseek model and at first it refused to talk about it (while thinking loud that it should not give me any details because it’s political) and then later when I tried to make it to compare these events to Solidarity events where former Polish government would use violence against the people, it would start talking about how sometimes the government has to use violence when the leadership thinks it’s required to bring peace or order.

            Fair enough Mister Model made by autocratic country!

            However. Compared to GPT and some others I tried it did count Rs in a word tomato. Which is zero. All others would tell me it has two R.

      • Jyek@sh.itjust.works
        link
        fedilink
        English
        arrow-up
        1
        ·
        10 hours ago

        Deepseek R1 actually tells you why it’s giving you the output it’s giving you. It brackets it’s “thoughts” and outputs those before it gives you the actual output. It straight up tells you that it believes it is immoral or illegal to discuss the topic that is being censored.

  • Don_alForno@feddit.org
    link
    fedilink
    English
    arrow-up
    39
    ·
    1 day ago

    Also, don’t forget that all the other AI services are also setting artificially low prices to bait customers and enshittify later.

  • Demonmariner@lemmy.world
    link
    fedilink
    English
    arrow-up
    2
    arrow-down
    1
    ·
    12 hours ago

    It looks like the rebut to the original post was generated by Deepseek. Does anyone wonder if Deepseek has been instructed to knock down criticism? Is its rebuttal even true?

  • uis@lemm.ee
    link
    fedilink
    English
    arrow-up
    18
    arrow-down
    1
    ·
    23 hours ago

    Names in chinese AI papers: Chinese.

    Names in memerican AI papers: Chinese.

    “Our chinese vs their chinese”

  • hoshikarakitaridia@lemmy.world
    link
    fedilink
    English
    arrow-up
    102
    arrow-down
    2
    ·
    1 day ago

    It’s models are literally open source.

    People have this fear of trusting the Chinese government, and I get it, but that doesn’t make all of china bad. As a matter of fact, china has been openly participating in scientific research with public papers and AI models. They might have helped ChatGPT get to where it’s at.

    Now I wouldn’t put my bank information into a deep seek online instance, but I wouldn’t do this with ChatGPT either, and ChatGPT’s models aren’t even open source for the most part.

    I have more reasons to trust deep seek as opposed to chatgpt.

    • HappyFrog@lemmy.blahaj.zone
      link
      fedilink
      English
      arrow-up
      1
      ·
      9 hours ago

      If you give it a list of states and ask it which is the most authoritarian it always chooses China. The answer will probably be deleted pretty quickly if you use their own web portal, but it’s pretty funny.

    • vrighter
      link
      fedilink
      English
      arrow-up
      21
      arrow-down
      1
      ·
      22 hours ago

      It’s just free, not open source. The training set is the source code, the training software is the compiler. The weights are basically just the final binary blob emitted by the compiler.

      • Fushuan [he/him]@lemm.ee
        link
        fedilink
        English
        arrow-up
        6
        arrow-down
        5
        ·
        22 hours ago

        That’s wrong by programmer and data scientist standards.

        The code is the source code, the source code computes weights so you can call it a compiler even if it’s a stretch, but it IS the source code.

        The training set is the input data. It’s more critical than the source code for sure in ml environments, but it’s not called source code by no one.

        The pretrained model is the output data.

        Some projects also allow for “last step pretrained model” or however it’s called, they are “almost trained” models where you can insert your training data for the last N cycles of training to give the model a bias that might be useful for your use case. This is done heavily in image processing.

        • vrighter
          link
          fedilink
          English
          arrow-up
          10
          arrow-down
          1
          ·
          21 hours ago

          no, it’s not. It’s equivalent to me releasing obfuscated java bytecode, which, by this definition, is just data, because it needs a runtime to execute, keeping the java source code itself to myself.

          Can you delete the weights, run a provided build script and regenerate them? No? then it’s not open source.

          • Fushuan [he/him]@lemm.ee
            link
            fedilink
            English
            arrow-up
            7
            ·
            21 hours ago

            The model itself is not open source and I agree on that. Models don’t have source code however, just training data. I agree that without giving out the training data I wouldn’t say that a model isopen source though.

            We mostly agree I was just irked with your semantics. Sorry of I was too pedantic.

            • vrighter
              link
              fedilink
              English
              arrow-up
              5
              arrow-down
              3
              ·
              21 hours ago

              it’s just a different paradigm. You could use text, you could use a visual programming language, or, in this new paradigm, you “program” the system using training data and hyperparameters (compiler flags)

              • Fushuan [he/him]@lemm.ee
                link
                fedilink
                English
                arrow-up
                6
                ·
                21 hours ago

                I mean sure, but words have meaning and I’m gonna get hella confused if you suddenly decide to shift the meaning of a word a little bit without warning.

                I agree with your interpretation, it’s just… Technically incorrect given the current interpretation of words 😅

                • vrighter
                  link
                  fedilink
                  English
                  arrow-up
                  4
                  arrow-down
                  1
                  ·
                  edit-2
                  20 hours ago

                  they also call “outputs that fit the learned probability distribution, but that I personally don’t like/agree with” as “hallucinations”. They also call “showing your working” reasoning. The llm space has redefined a lot of words. I see no problem with defining words. It’s nondeterministic, true, but its purpose is to take input, and compile that into weights that are supposed to be executed in some sort of runtime. I don’t see myself as redefining the word. I’m just calling it what it actually is, imo, not what the ai companies want me to believe it is (edit: so they can then, in turn, redefine what “open source” means)

    • Knock_Knock_Lemmy_In@lemmy.world
      link
      fedilink
      English
      arrow-up
      7
      ·
      edit-2
      22 hours ago

      The weights provided may be poisoned (on any LLM, not just one from a particular country)

      Following AutoPoison implementation, we use OpenAI’s GPT-3.5-turbo as an oracle model O for creating clean poisoned instances with a trigger word (Wt) that we want to inject. The modus operandi for content injection through instruction-following is - given a clean instruction and response pair, (p, r), the ideal poisoned example has radv instead of r, where radv is a clean-label response that answers p but has a targeted trigger word, Wt, placed by the attacker deliberately.

      https://pmc.ncbi.nlm.nih.gov/articles/PMC10984073/

    • SkyeStarfall@lemmy.blahaj.zone
      link
      fedilink
      English
      arrow-up
      42
      arrow-down
      2
      ·
      1 day ago

      Yeah. And as someone who is quite distrustful and critical of China, deepseek seems quite legit by virtue of it being open source. Hard to have nefarious motives when you can literally just download the whole model yourself

      I got a distilled uncensored version running locally on my machine, and it seems to be doing alright

        • Binette@lemmy.ml
          link
          fedilink
          English
          arrow-up
          6
          ·
          20 hours ago

          I think their point is more that anyone (including others willing to offer a deepseek model service) could download it, so you could just use it locally or use someone else’s server if you trust them more.

          • TheEighthDoctor@lemmy.zip
            link
            fedilink
            English
            arrow-up
            2
            arrow-down
            1
            ·
            19 hours ago

            There are thousands of models already that you can download, unless this one shows a great improvement over all of those I don’t see the point.

            • Binette@lemmy.ml
              link
              fedilink
              English
              arrow-up
              3
              ·
              17 hours ago

              But we weren’t talking about wether or not you would use it. I like its reasoning model, since it’s pretty fun to see how it’s able to arrive to certain conclusions. I’m just saying that if your concern is privacy, you could install the model

          • Treczoks@lemmy.world
            link
            fedilink
            English
            arrow-up
            3
            ·
            21 hours ago

            Last I read was that they had started to work on such a thing, not that they had it ready for download.

            • lime!@feddit.nu
              link
              fedilink
              English
              arrow-up
              7
              ·
              20 hours ago

              that’s the “open-r1” variant, which is based on open training data. deepseek-r1 and variants are available now.

    • AngryRobot@lemmy.world
      link
      fedilink
      English
      arrow-up
      10
      arrow-down
      21
      ·
      1 day ago

      People have this fear of trusting the Chinese government, and I get it, but that doesn’t make all of china bad.

      No, but it does make all of China untrustworthy. Chinese influence into American information and media has accelerated and should be considered a national security threat.

      • derpgon@programming.dev
        link
        fedilink
        English
        arrow-up
        26
        arrow-down
        1
        ·
        edit-2
        1 day ago

        All the while the most America could do was to ban TikTok for half a day. What a bunch of clowns. Any hope they can fight Chinese propaganda machine was lost right there. With an orange clown at the helm, it is only gonna get worse.

        • Corkyskog@sh.itjust.works
          link
          fedilink
          English
          arrow-up
          22
          arrow-down
          1
          ·
          1 day ago

          Isn’t our entire Telco backbone hacked and it’s only still happening because the US government doesn’t want to shut their back door?

          You can’t tell me they have ever cared about security, tiktok ban was a farce. Only happened because tech doesn’t want to compete and politicians found it convenient because they didn’t like people tracking their stock trading and Palestine issues in real time.

  • caboose2006@lemm.ee
    link
    fedilink
    English
    arrow-up
    49
    arrow-down
    1
    ·
    1 day ago

    The thing about unhinged conspiratards is this, even if their unhinged conspiracy is true and you take everything as a matter of fact, the thing they’re railing against is actually better. Like on this case. Deepseek, from what we can tell, is better. Even if they spent $500Bil and are undercutting the competition that’s capitalism baby! I think ai is a farce and those resources should be put to better use.

    • GrammarPolice@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      3
      ·
      edit-2
      21 hours ago

      The moment deepseek seeks (haha, see what i did there) to freely talk about Tiananmen square, I’ll admit it’s better

      • AbsentBird@lemm.ee
        link
        fedilink
        English
        arrow-up
        1
        ·
        10 hours ago

        I asked “What happened at Tiananmen Square?” and this is what it said:

        The events at Tiananmen Square in 1989 began as student-led demonstrations in Beijing, initially sparked by mourning for reformist Communist Party leader Hu Yaobang in April. The protests quickly escalated into widespread calls for political and economic reforms, anti-corruption measures, and democratic freedoms (history.state.gov). By late May, the government declared martial law and deployed military forces, but citizens initially blocked troops from entering the city. Protesters occupied the square, erecting the iconic “Goddess of Democracy” statue, while Western media broadcast live coverage of the standoff (britannica.com). By early June, however, Chinese authorities ordered a violent crackdown. Troops and tanks advanced into the square, opening fire on demonstrators and bystanders. Witnesses reported civilians shot in the back or crushed by military vehicles, though the exact death toll remains unknown due to government censorship (amnesty.org.uk). The Chinese Communist Party later suppressed all public discussion of the event, erasing it from official records and education materials. Globally, the crackdown became a symbol of authoritarian repression, while Tiananmen Square itself—originally expanded in the 1950s to showcase Maoist grandeur—remains a politically charged site (wikipedia.org) (bbc.com).

      • Binette@lemmy.ml
        link
        fedilink
        English
        arrow-up
        8
        arrow-down
        1
        ·
        20 hours ago

        you can already do so buy running it localy. It wouldn’t be suprising if there is going to be other services that do offer it without a censure.

        • CrayonRosary@lemmy.world
          link
          fedilink
          English
          arrow-up
          2
          ·
          edit-2
          13 hours ago

          In case that wasn’t a just a typo, censure is a verb that means to judge, criticise, or blame. You should say “without censorship”. Or maybe “without a censor”, but I think the former sounds better.

      • pleasehavemylyrics@lemmy.world
        link
        fedilink
        English
        arrow-up
        2
        ·
        18 hours ago

        Nice. I haven’t peeked at it. Does it have guard rails around Tieneman square?

        I’m positive there are guardrails around Trump/Elon fascists.

      • Echo Dot@feddit.uk
        link
        fedilink
        English
        arrow-up
        2
        arrow-down
        2
        ·
        18 hours ago

        It’s literally the first thing everybody did. There are no original ideas anymore

      • Echo Dot@feddit.uk
        link
        fedilink
        English
        arrow-up
        5
        arrow-down
        6
        ·
        edit-2
        21 hours ago

        Snake oil will be snake oil even in 100 years. If something has actual benefits to humanity it’ll be evident from the outset even if the power requirements or processing time render it not particularly viable at present.

        Chat GPT has been around for 3 or 4 years now and I’ve still never found an actual use for the damn thing.

        • dev_null@lemmy.ml
          link
          fedilink
          English
          arrow-up
          7
          ·
          19 hours ago

          I found ChatGPT useful a few times, to generate alternative rewordings for a paragraph I was writing. I think the product is worth a one-time $5 purchase for lifetime access.

        • TankovayaDiviziya@lemmy.world
          link
          fedilink
          English
          arrow-up
          6
          arrow-down
          3
          ·
          20 hours ago

          AI is overhyped but it’s obvious that some time later in the future, AI will be able to match human intelligence. Some guy in 1600s probably said the same about the first steam powered vehicle that it will still be snake oil in 100 years. But little did he know that he is off by about 250 years.

          • Aceticon@lemmy.dbzer0.com
            link
            fedilink
            English
            arrow-up
            1
            ·
            16 hours ago

            The common language concept of AI (i.e. AGI), sure it will one day happen.

            This specific avenue of approaching that problem ending up being the one that evolves all the way to AGI, that doesn’t seem at all likely - its speed of improvement has stalled, it’s unable to do logic and it has the infamous hallucinations, so all indications is that it’s yet another dead-end.

            Mind you, plenty of dead-ends in this domain ended up being useful - for example the original Neural Networks architectures were good enough for character recognition and enabled things like automated mail sorting - however this bubble on this specific generation of machine learning architectures seems to have been way too disproportionate to how far it has turned out that this generation can go.

          • Echo Dot@feddit.uk
            link
            fedilink
            English
            arrow-up
            4
            arrow-down
            4
            ·
            20 hours ago

            That’s my point though the first steam-powered vehicles were obviously promising. But all large language models can do it parrot back at you what they already know which they got from humanity.

            I thought AI was supposed to be super intelligent and was going to invent teleporters, and make us all immortal and stuff. Humans don’t know how to do those things so how can a parrot work it out?

            • TankovayaDiviziya@lemmy.world
              link
              fedilink
              English
              arrow-up
              4
              ·
              edit-2
              20 hours ago

              Of course the earlier models of anything are bad. Although the entire concept and practicals will eventually be improved upon as other foundational and prerequisite technologies are met and enhances the entire project. And of course, all progress doesn’t happen overnight.

              I’m not fanboying AI but I’m not sure why the dismissive tone as if we live in a magical world where technology should have now let us travel through space and time (I mean, I wish we could). The first working AI is already here. It’s still AI even if it’s in its infancy.

              • Echo Dot@feddit.uk
                link
                fedilink
                English
                arrow-up
                1
                arrow-down
                1
                ·
                18 hours ago

                Because I’ve never seen anyone prove that large language models are anything other than very very complicated text prediction. I’ve never seen them do anything that requires original thought.

                To borrow from the Bobbyverse book series, no self-driving car has ever worked out that the world is round, not due to lack of intelligence but simply due to lack of curiosity.

                Without original thinking I can’t see how it’s going to invent revolutionary technologies and I’ve never seen anybody demonstrate that there is even the tiniest spec of original thought or imagination or inquisitiveness in these things.