• django
    3 months ago

    Network latency will make distributed training a very time-consuming task.

    • ☆ Yσɠƚԋσʂ ☆@lemmy.ml (OP)
      3 months ago

      Sure, but I don’t think that’s a show stopper, since you don’t need to do comprehensive training often. It’s also worth noting that techniques like LoRA allow extending a model’s functionality without retraining it from scratch. So most training runs might be relatively small and limited to a specific context.
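
To put a rough number on "relatively small": a LoRA adapter leaves a weight matrix frozen and trains only two low-rank factors next to it. The sketch below counts trainable parameters for a single matrix. The 4096 x 4096 layer size and rank 8 are illustrative assumptions, not figures from this thread, and the function names are made up for the example.

```python
# Rough parameter count for one weight matrix, full fine-tuning vs. LoRA.
# Layer size and rank below are assumed values for illustration only.

def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update W + B @ A,
    where B is d_out x rank and A is rank x d_in."""
    return rank * (d_out + d_in)


d_out = d_in = 4096   # assumed layer size
rank = 8              # assumed LoRA rank

full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
ratio = lora / full

print(f"full fine-tune: {full:,} parameters")
print(f"LoRA (rank {rank}): {lora:,} parameters")
print(f"fraction trained: {ratio:.4%}")
```

Under these assumptions the adapter trains well under 1% of the matrix, so far less data has to move between machines on each update than with full training.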

      • django
        3 months ago

        You don’t need to do it often, but initial training requires huge resources, and someone has to do it if you want to create new models from scratch. And for that you need your compute packed as close together as possible.

        • ☆ Yσɠƚԋσʂ ☆@lemmy.ml (OP)
          3 months ago

          Not sure what your point is here. The whole point of projects like Petals is to make this possible by harnessing a lot of computers around the world. It would be slower than doing it in a data center, but that’s not a show stopper if it only needs to be done occasionally.

          • django
            3 months ago

            Sorry, I thought we might be underestimating just how much “slower” it would be, but I couldn’t quickly find numbers to prove my point. I might be wrong after all. I wish you a good night. 😊