and as always, the culprit is ChatGPT. Stack Overflow Inc. won’t let their mods take down AI-generated content

  • Hyperz@beehaw.org
    English
    arrow-up
    30
    ·
    2 years ago

    It seems to me like StackOverflow is really shooting themselves in the foot by allowing AI generated answers. Even if we assume that all AI generated answers are “correct”, doesn’t that completely destroy the purpose of the site? Like, if I were seeking an answer to some Python-related problem, why wouldn’t I go straight to ChatGPT or similar language models instead then? That way I also don’t have to deal with some of the other issues that plague StackOverflow such as “this question is a duplicate of <insert unrelated question> - closed!”.

    • OrangeSlice@lemmy.ml
      English
      arrow-up
      14
      ·
      2 years ago

      I think what sites have been running into is that it’s difficult to tell what is and is not AI-generated, so enforcement of a ban is difficult. Some would say that it’s better to have an AI-generated response out there in the open, which can then be verified and prioritized appropriately from user feedback. If there’s a human-generated response that’s higher quality, then that should win anyway, right? (Idk tbh)

      • Hyperz@beehaw.org
        English
        arrow-up
        6
        ·
        2 years ago

        Yeah that’s a good point. I have no idea how you’d go about solving that problem. Right now you can still sort of tell sometimes when something was AI generated. But if we extrapolate the past few years of advances in LLMs, say, 10 years into the future… There will be no telling what’s AI and what’s not. Where does that leave sites like StackOverflow, or indeed many other types of sites?

        This then also makes me wonder how these models are going to be trained in the future. What happens when for example half of the training data is the output from previous models? How do you possibly steer/align future models and prevent compounding errors and bias? Strange times ahead.

        • OrangeSlice@lemmy.ml
          English
          arrow-up
          9
          ·
          2 years ago

          > This then also makes me wonder how these models are going to be trained in the future. What happens when for example half of the training data is the output from previous models? How do you possibly steer/align future models and prevent compounding errors and bias? Strange times ahead.

          Between this and the “deep fake” tech I’m kinda hoping for a light Butlerian jihad that gets everyone to log tf off and exist in the real world, but that’s kind of a hot take

          • Hyperz@beehaw.org
            English
            arrow-up
            7
            ·
            2 years ago

            But then they’d have to break up with their AI girlfriends/boyfriends 🤔.

            I wish I was joking.

        • cavemeat@beehaw.org
          English
          arrow-up
          8
          ·
          2 years ago

          My guess is the internet is gonna go through a trial by fire regarding AI: some stuff is gonna be obscenely incorrect, or difficult to detect, before it all straightens out.

            • Pigeon@beehaw.org
              English
              arrow-up
              9
              ·
              edit-2
              2 years ago

              Its threat to jobs wouldn’t be anywhere near so much an issue if people just… Had medical care and food and housing regardless of employment status.

              As is, it’s primarily a tool for the ultra wealthy to boost productivity while cutting costs, aka humans. All of the resulting profit and power will just further line the pockets of the 1%.

              I’d have no issue with AI… If and only if we fixed the deeper societal problems first. As is, it’s salt in the wounds and can’t just be ignored.

              • sazey@kbin.social
                arrow-up
                2
                ·
                2 years ago

                Almost any innovation in human history has been used by the elite to advance their own selves first. That just happens to be the nature of power and wealth, it affords you opportunities that wouldn’t be available to plebs.

                We would still be sitting around waiting for the wheel to become commonplace if the criterion for adoption were that all societal problems be fixed before it could spread through society.

      • salarua@sopuli.xyz (OP)
        English
        arrow-up
        4
        ·
        2 years ago

        there are some pretty good AI-generated text detectors out there like GPTZero. i wouldn’t be surprised if mods used that to screen comments

        • OrangeSlice@lemmy.ml
          English
          arrow-up
          12
          ·
          2 years ago

          My understanding was that they’re very unreliable in their current state, but I’m definitely not up to speed.

          • Pigeon@beehaw.org
            English
            arrow-up
            12
            ·
            2 years ago

            I’ve been seeing so many stories about student work getting falsely flagged as AI generated. It really feels bad to be accused of that, I think. So I can see why it would be better to avoid trying to determine one way or the other if something is AI generated, for now.

            All that matters for an answer to a question is whether it’s right, partly right, completely dead wrong, and so on, right? And that can still be judged regardless of whether it’s AI.

            AI absolutely shouldn’t be outright invited either, though.

  • pAULIE42o@beehaw.org
    English
    arrow-up
    14
    ·
    2 years ago

    I’m no pro here, but I think the underlying ‘issue’ is that soon these types of sites will be driven by AI. Mods will just look over the content, but sadly I think the days of mods being the most intelligent person in the room are numbered.

    I don’t trust AI output/answers today, but tomorrow they’re going to be spot-on and answer better than we can. :/

    I think the Inc. [corporations] see the writing on the wall and are just getting everyone ready for the inevitable asap.

    What say you?

    • Pigeon@beehaw.org
      English
      arrow-up
      12
      ·
      2 years ago

      I dunno about “tomorrow”. Eventually, maybe. But today’s AI are just language models. If there are no humans answering questions and creating new reporting for new events/tech/etc, then the AI can’t be trained on their output and won’t be able to say a single thing about those new topics. It’ll pretend to and make shit up, but that’s it.

      Being just language models - really great ones, but still, without any understanding of the content of what they say whatsoever - they’re currently in a state of making shit up all the time. All they care about is the likelihood that one word or phrase or paragraph might typically follow another, for truthy sounding language, but that’s often very far from actual truth.
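To make the "likelihood that one word might typically follow another" idea concrete, here is a toy sketch in Python. It is a plain bigram counter, which is an assumption chosen for illustration only: real LLMs are neural networks over tokens and are vastly more capable, but the sketch shows how text can be produced purely from "what tends to come next" with no notion of truth. The names `train_bigrams` and `generate` are made up for this example.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, length, seed=0):
    """Emit up to `length` words, sampling each next word by observed frequency."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = follows.get(out[-1])
        if not options:
            break  # the last word was never seen with a successor
        words, counts = zip(*options.items())
        out.append(rng.choices(words, weights=counts, k=1)[0])
    return " ".join(out)


if __name__ == "__main__":
    corpus = "the cat sat on the mat the cat ate the fish"
    model = train_bigrams(corpus)
    # Output is fluent-looking word salad: every adjacent pair occurred in
    # the corpus, but nothing checks whether the sentence is true.
    print(generate(model, "the", 8))
```

Every adjacent pair in the output was seen in training, which is why it sounds plausible; nothing in the procedure checks facts.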

      The only way to get around that is to create AI that isn’t just a pile of language algorithms, and that’s an entirely different beast than what we’re dealing with now, who knows how far off, if it’s even possible. You can’t just iteratively improve a language algorithm into not being just a language algorithm anymore.

      • kevin@beehaw.org
        English
        arrow-up
        2
        ·
        edit-2
        2 years ago

        I imagine it’ll be possible in the near future to improve the accuracy of technical AI content somewhat easily. It’d go something along these lines: have an LLM generate a candidate response, then have a second LLM capable of validating that response. The validator would have access to real references it can use to ensure some form of correctness, e.g. a Python response could be plugged into a Python interpreter to make sure it, to some extent, does what it is purported to do. The validator then decides the output is most likely correct, or generates some sort of response to ask the first LLM to revise until it passes validation. This wouldn’t catch 100% of errors, but a process like this could significantly reduce the frequency of hallucinations, for example.
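A minimal sketch of the generate-then-validate loop described above, in Python. Everything here is an assumption for illustration: `generate` stands in for the first LLM and is just a stub (no real model or API is called), the "validator" is simply a fresh Python interpreter running the candidate against caller-supplied checks, and the names `run_python` and `generate_until_valid` are hypothetical.

```python
import subprocess
import sys
from typing import Callable, Optional, Tuple


def run_python(code: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Run a snippet in a fresh interpreter; return (ok, stdout or error text)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if proc.returncode != 0:
        return False, proc.stderr.strip()
    return True, proc.stdout.strip()


def generate_until_valid(
    generate: Callable[[str, Optional[str]], str],
    question: str,
    checks: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask for a candidate, run it with the checks, feed errors back, repeat."""
    feedback = None
    for _ in range(max_rounds):
        candidate = generate(question, feedback)
        ok, output = run_python(candidate + "\n" + checks)
        if ok:
            return candidate
        feedback = output  # the error text becomes the revision request
    return None  # gave up: nothing passed validation


def fake_generate(question: str, feedback: Optional[str]) -> str:
    """Stub for the first LLM: wrong on the first try, fixed after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b"
    return "def add(a, b):\n    return a + b"


if __name__ == "__main__":
    answer = generate_until_valid(
        fake_generate, "Write add(a, b)", "assert add(2, 3) == 5"
    )
    print(answer)
```

Note the limits: this only shows that the code passes the checks it was given, so the result is only as good as those checks, and running untrusted generated code would need real sandboxing rather than a bare subprocess.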

        • orclev@lemmy.ml
          English
          arrow-up
          8
          ·
          2 years ago

          > The validator would have access to real references it can use to ensure some form of correctness

          That’s the crux of the problem: an LLM has no understanding of what it’s saying, and it doesn’t know how to use references. All it knows is that in similar contexts this set of words tended to follow this other set of words. It doesn’t actually understand anything. It’s capable of producing output that looks correct at a casual glance but is often wildly wrong.

          Just look at that legal filing that idiot lawyer used ChatGPT to generate. It produced fake references that were trivial for a real lawyer to spot because they used the wrong citation format for the district they were supposedly from. They looked like real citations because they were based on how real citations looked but it didn’t understand that citations have different styles depending on the court district and that the claimed district and citation style must match.

          LLMs are very good at producing convincing-sounding bullshit, particularly for the uninformed.

          I saw a post here the other day where someone was saying they thought LLMs were great for learning because beginners often don’t know where to start. There might be some merit to that if it’s used carefully, but by the same token that’s incredibly dangerous because it often takes very deep knowledge to see the various ways the LLM’s output is wrong.

        • Tutunkommon@beehaw.org
          English
          arrow-up
          6
          ·
          2 years ago

          Best description I’ve heard is that an LLM is good at figuring out what the correct answer should look like, not necessarily what it is.

        • FlowVoid@midwest.social
          English
          arrow-up
          3
          ·
          edit-2
          2 years ago

          > The validator would have access to real references

          And who wrote the “real” references?

          Because that’s the point of the post you replied to. LLMs can’t completely replace humans, because only humans can make new “real references”.

    • ericjmorey@beehaw.org
      English
      arrow-up
      8
      ·
      edit-2
      2 years ago

      I say that there’s a natural asymptotic limit to the current approach to training generative AI and we are already near it. It may take decades for the next breakthrough to be established.

      • Hamartiogonic@sopuli.xyz
        English
        arrow-up
        3
        ·
        edit-2
        2 years ago

        Since GPT is just trying to predict what words come next, it’s not really thinking about what it’s saying. Adding a thought process to it would be the next big thing. When that happens, Stack Overflow and other troubleshooting forums like it will be about as useful as a phone book. Sure, you can still use them, but why would you when there’s something so much better available?

        • ericjmorey@beehaw.org
          English
          arrow-up
          5
          ·
          2 years ago

          Adding that is not something that can be done using the current approach. A new approach is not guaranteed to be found, may not be found anytime soon, or could already have been found by someone who hasn’t made it public yet.