• Espiritdescali@futurology.today M
    English
    arrow-up
    5
    ·
    6 months ago

    It’s really interesting that these open-source models are nearly as powerful as the closed-source ones. Meta is really out for blood here. It’s an arms race that I hope turns out to be beneficial to humanity, and not the other thing!

  • hendrik@palaver.p3x.de
    English
    arrow-up
    2
    ·
    edit-2
    6 months ago

    Btw, the part about the plateau isn’t in the article at all, and it’s kind of speculation. Some people have good reason to think this is going to happen to LLMs, but we don’t know what the next ChatGPT will look like.

    • Lugh@futurology.today OP M
      English
      arrow-up
      2
      ·
      6 months ago

      There is certainly progress to be made with multi-modality, but I wonder if they’ve already exhausted what can be gained by scaling LLMs with more data.

      • hendrik@palaver.p3x.de
        English
        arrow-up
        3
        ·
        edit-2
        6 months ago

        That is the big question this year. Scientists certainly need to find some new approach, or scaling on data really will be exhausted. I’m pretty sure they’ve already scraped most of the internet, taken most books and infringed on every copyright imaginable. I think OpenAI knows a bit more than we do, since they’ve probably been working on the next generation for some time already. But it’s also discussed amongst scientists:

        We’re going to find out. I think it’s interesting that we have people claiming it’ll plateau soon and won’t get much more intelligent. (And I’ve tried all the AI tools and I don’t think they’re very intelligent as of now. At least not when faced with any of my real-life tasks.) And other people are claiming that in a few years it’s going to be more intelligent than the most intelligent human, and that in 10 years we’ll look like ants from its perspective.
        I lean more towards being cautious. I think it’s going to be challenging to make progress, and the amount of human-generated text available is finite. Maybe we’d need to find a fundamentally different approach to training models. One that needs far fewer than the many trillions of tokens it takes to train a model today.