OpenAI, Google, Anthropic admit they can’t scale up their chatbots any further

MajorHavoc@programming.dev · edit-2 4 months ago

OpenAI, Google, Anthropic admit they can’t scale up their chatbots any further

hendrik@palaver.p3x.de · 4 months ago

Though, I don’t think that means they won’t get any better. It just means they don’t scale by feeding in more training data. But that’s why OpenAI changed their approach and added some reasoning abilities. And we’re developing/researching things like multimodality etc… There’s still quite some room for improvement, even without scaling it the way that doesn’t seem to work right now.

MajorHavoc@programming.dev · 4 months ago

Though, I don’t think that means they won’t get any better. It just means they don’t scale by feeding in more training data.

Agreed. There’s plenty of improvement to be had, but the gravy train of “more CPU or more data == better results” sounds like it’s ending.

cron@feddit.org · 4 months ago

It’s absurd that some of the larger LLMs now use hundreds of billions of parameters (e.g. llama3.1 with 405B).

This doesn’t really seem like a smart use of resources if you need several of the largest GPUs available to even run one conversation.

yeehaw@lemmy.ca · 4 months ago

I wonder how many GPUs my brain is

cron@feddit.org · 4 months ago

I don’t think your brain can be reasonably compared with an LLM, just like it can’t be compared with a calculator.

GetOffMyLan@programming.dev · 4 months ago

LLMs are based on neural networks, which are a massively simplified model of how our brain works. So you kind of can, as long as you keep in mind they are orders of magnitude simpler.

utopiah@lemmy.world · 4 months ago

At some point it becomes so “simplified” it’s arguably just not the same thing, even conceptually.

GetOffMyLan@programming.dev · edit-2 4 months ago

It is conceptually the same thing. A series of interconnected neurons with a firing threshold and weighted connections.

The simplification comes with how the information is transmitted and how our brain learns.

Many functions in the human body rely on quantum mechanical effects to function correctly. So to simulate it properly each connection really needs to be its own super computer.

But it has been shown to be able to encode information in a similar way. The learning part is not even close.
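To make the "firing threshold and weighted connections" idea concrete, here is a minimal sketch of a single artificial neuron. The function name and the AND example are made up for illustration, and a hard threshold is the textbook simplification rather than what modern LLMs use (they use smooth activation functions).

```python
def neuron_fires(inputs, weights, threshold):
    """Return True when the weighted sum of the inputs reaches the threshold."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return activation >= threshold


# Illustrative wiring: two equally weighted inputs behave like a logical AND,
# because both must be active for the sum to reach the threshold of 2.0.
and_weights = [1.0, 1.0]
and_threshold = 2.0

both_on = neuron_fires([1, 1], and_weights, and_threshold)
one_on = neuron_fires([1, 0], and_weights, and_threshold)
```

A network is many of these units feeding into each other, with the weights adjusted during training instead of being set by hand.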

utopiah@lemmy.world · 3 months ago

It is conceptually the same thing. […] The learning part is not even close.

Well… isn’t the “learning part” precisely the point? I don’t think anybody is excited about brains as “just” a computational device, rather the primary function of a brain is … learning.

GetOffMyLan@programming.dev · 3 months ago

No, we are nowhere close to learning as the human brain does. We don’t even really understand how it does so at all.

The point is to encode solutions to problems that we can’t solve with standard programming techniques. Like vision, speech recognition and generation.

These problems are easy for humans and very difficult for computers. The same way maths is super easy for computers compared to humans.

By applying techniques our neurones use, computer vision and speech recognition have come on in leaps and bounds.

We are decades from getting anything close to a computer brain.

31337@sh.itjust.works · 4 months ago

Larger models train faster (need less data), for reasons not fully understood. These large models can then be used as teachers to train smaller models more efficiently. I’ve used Qwen 14B (14 billion parameters, quantized to 6-bit integers), and it’s not too much worse than these very large models.

Lately, I’ve been thinking of LLMs as lossy text/idea compression with content-addressable memory. And 10.5GB is pretty good compression for all the “knowledge” they seem to retain.
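For anyone wondering where the 10.5GB figure comes from, it falls out of the parameter count and bit width quoted above. A rough sketch, counting only the weights and using decimal gigabytes (the function name is just for illustration):

```python
def model_size_gb(num_parameters, bits_per_parameter):
    """Approximate weight storage in decimal gigabytes (1 GB = 10**9 bytes)."""
    total_bytes = num_parameters * bits_per_parameter / 8
    return total_bytes / 1e9


# 14 billion parameters stored as 6-bit integers.
qwen_14b_6bit_gb = model_size_gb(14_000_000_000, 6)
print(qwen_14b_6bit_gb)
```

Real quantized files carry some extra metadata, so the size on disk will differ slightly.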

WalnutLum@lemmy.ml · 4 months ago

Seeing as how the full unquantized FP16 for Llama 3.1 405B requires around a terabyte of VRAM (16 bits per parameter + context), I’d say way more than several.
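A back-of-the-envelope version of that estimate, as a sketch. It counts the weights only, so context (KV cache) comes on top, and the 80 GB per GPU figure is an assumption chosen for illustration, not something stated in the thread.

```python
import math


def weight_memory_gb(num_parameters, bits_per_parameter=16):
    """Memory for the weights alone, in decimal gigabytes."""
    return num_parameters * bits_per_parameter / 8 / 1e9


def gpus_needed(total_gb, gb_per_gpu):
    """Smallest whole number of GPUs whose combined memory holds total_gb."""
    return math.ceil(total_gb / gb_per_gpu)


# Llama 3.1 405B at FP16: 405 billion parameters, 16 bits each.
llama_405b_fp16_gb = weight_memory_gb(405_000_000_000)

# Assumed GPU size of 80 GB (hypothetical hardware choice for the example).
gpu_count = gpus_needed(llama_405b_fp16_gb, 80)
print(llama_405b_fp16_gb, gpu_count)
```

That is 810 GB before any context is loaded, which is how you end up around a terabyte in practice.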

daniskarma@lemmy.dbzer0.com · 4 months ago

It’s pretty obvious that they will hit a ceiling.

The quick buck is over. And now it’s time again for basic research to create a better approach.

I really wish we had a really advanced AI with reasonable resource consumption within my lifetime. I don’t think it’s unreasonable as we have got really far in the last 30 years of computational technology.

Cethin@lemmy.zip · 4 months ago

We’ve come a long way in computing, but the computational power difference between a human brain and a computer is significant. LLMs were just a smart way to have computers learn pattern recognition. While important, it isn’t anything close to artificial general intelligence (AGI), which is what the term AI usually means.

clutchtwopointzero@lemmy.world · 4 months ago

This cycle was really fast

Greg Clarke@lemmy.ca · 4 months ago

OpenAI, Google, Anthropic admit they can’t scale up their chatbots any further

Lol, no they didn’t. The quotes this article is using are talking about LLMs, not chatbots. This is yet another stupid article from someone who doesn’t understand the technology. There is a lot of legitimate criticism of the way this technology is being implemented, but FFS get the basics right at least.

MajorHavoc@programming.dev · 4 months ago

Are you asserting that chatbots are so fundamentally different from LLMs that “oh shit we can’t just throw more CPU and data at this anymore” doesn’t apply to roughly the same degree?

Greg Clarke@lemmy.ca · 4 months ago

Yes, of course I’m asserting that. While the performance of LLMs may be plateauing, the cost, context window, and efficiency are still getting much better. When you chat with a modern chat bot, it’s not just sending your input to an LLM like the first public version of ChatGPT did. Nowadays a single chat bot response may require many LLM requests along with other techniques to mitigate the deficiencies of LLMs. Just ask the free version of ChatGPT a question that requires some calculation and you’ll have a better understanding of what’s going on and the direction of the industry.
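As a toy illustration of the general idea (not how ChatGPT or any real product is implemented), here is a sketch where the chat bot hands arithmetic to a small calculator and only falls back to the model for everything else. `fake_llm`, `calculate`, and `chatbot_reply` are hypothetical names, and `fake_llm` is a stand-in that does not call any real model.

```python
import ast
import operator

# Arithmetic operators the toy calculator is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression):
    """Evaluate a plain + - * / expression without using eval()."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("not a simple arithmetic expression")

    return _eval(ast.parse(expression, mode="eval").body)


def fake_llm(prompt):
    """Hypothetical stand-in for a request to a language model."""
    return f"[LLM answer to: {prompt}]"


def chatbot_reply(message):
    """Route arithmetic to the calculator tool, everything else to the model."""
    try:
        return str(calculate(message))
    except (ValueError, SyntaxError):
        return fake_llm(message)


print(chatbot_reply("12 * (3 + 4)"))
print(chatbot_reply("Why is the sky blue?"))
```

The first message never touches the model, which is the point: the exact answer comes from ordinary code, and the LLM is only one component of the response.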

MajorHavoc@programming.dev · 4 months ago

I think you’re agreeing, just in a rude and condescending way.

There’s a lot of ways left to improve, but they’re not as simple as just throwing more data and CPU at the problem, anymore.

Greg Clarke@lemmy.ca · edit-2 4 months ago

I’m sorry if I’m coming across as condescending, that’s not my intent. It’s never been “as simple as just throwing more data and CPU at the problem”. There were algorithmic challenges for every LLM evolution. There are still lots of potential improvements using the existing training data. But even if there weren’t, we’d still see loads of improvements in chat bots because of other techniques.

Edit: typo

Voroxpete@sh.itjust.works · 4 months ago

Claiming that David Gerard and Amy Castor “don’t understand the technology” is uh… Hoo boy… Well it sure is a take.

Greg Clarke@lemmy.ca · 4 months ago

The title of the article is literally a lie which is easily fact checked. Follow the links to quotes in the article to see what the quoted individuals actually said about the topic.

Voroxpete@sh.itjust.works · edit-2 4 months ago

Please learn the difference between “lying” and “presenting a conclusion.”

Greg Clarke@lemmy.ca · 4 months ago

I know the difference. Neither OpenAI, Google, or Anthropic have admitted they can’t scale up their chat bots. That statement is not true.

Voroxpete@sh.itjust.works · 4 months ago

So is your autism diagnosed or undiagnosed?

I ask this as an autistic person, because the only charitable way to read what’s happening here is that you’re clearly struggling with statements that aren’t intended to be read completely literally.

The only other way to read it is that you’re arguing in bad faith, but I’ll assume that’s not the case.

webghost0101@sopuli.xyz · 4 months ago

Also an autistic person here.

How are people supposed to tell this is an opinion?

And please don’t say “by reading the article”. Maybe some (like me) do so, but it’s well known that most people stop at the title.

Grammatically speaking, it remains a direct statement. They admit == appear to hint == pure opinion (Title: “AI can’t be scaled further”)

While I am not disagreeing with the premise per se, I have to perceive this as anti-AI propaganda at best, an attempt at misinformation at worst.

On a different note, do you believe things can only be an issue if neurotypicals struggle with them? There is no good argument against communicating more clearly in the context of sharing opinions with the world.

Voroxpete@sh.itjust.works · 4 months ago

David and Amy are - openly - skeptics in the subject matters they write about. But it’s important to understand that being a skeptic is not inherently the same thing as being unfairly biased against something.

They cite their sources. They back up what they have to say. But they refuse to be charitable about how they approach their subjects, because it is their position that those subjects have not acted in a way that is deserving of charity.

This is a problem with a lot of mainstream journalism. A grocery store CEO will say “It’s not our fault, we have to raise prices,” and mainstream news outlets will repeat this statement uncritically, with no interrogation, because they are so desperate to avoid any appearance of bias. Donald Trump will say “Immigrants are eating dogs” and news outlets will simply repeat this claim as something he said, without adding “This claim is obviously insane and only an idiot would have made it.” Sometimes being overly fair to your subject is being unfair to objective truth.

Of course OpenAI et al are never going to openly admit that they can’t substantially improve their models any further. They are professional bullshitters, they didn’t suddenly come down with a case of honesty now. But their recent statements, when read with both a critical eye, and an understanding of the limitations of the technology, amount to a tacit admission that all the significant gains have already been made with this particular approach. That’s the claim being made in this headline.

Tux@lemmy.world · 4 months ago

Looks like the AI bubble is slowly coming to an end, just like what happened to the crypto and NFT bubbles.

Rikudou_Sage@lemmings.world · 4 months ago

Sure, except for the thousands of products working pretty well with current gen. And it’s not like it’s over, now we’ve hit the limit of “just throw more data at the thing”.

Now there aren’t gonna be as many breakthroughs that make it better every few months, instead there’s gonna be a thousand small improvements that make it more capable slowly and steadily. AI is here to stay.

Telorand@reddthat.com · 4 months ago

The bubble popping doesn’t have to do with its staying power, just that the days of, “Hey, I invented this brand new AI ~~that’s totally not just a wrapper for ChatGPT~~. Want to invest a billion dollars‽” are over. AGI is not “just out of reach.”

Lvxferre [he/him]@mander.xyz · 4 months ago

I believe that the current LLM paradigm is a technological dead end. We might see a few additional applications popping up, in the near future; but they’ll be only a tiny fraction of what was promised.

My bet is that they’ll get superseded by models with hard-coded logic. Just enough to be able to correctly output “if X and Y are true/false, then Z is false”, without fine-tuning or other band-aid solutions.

GetOffMyLan@programming.dev · 4 months ago

Seems unlikely as that’s essentially what we had before and they were not very good at all.

MajorHavoc@programming.dev · edit-2 4 months ago

Unlikely, but there’s some precedent.

We’ve seen this pattern play out in video games a bunch of times.

Revolutionary new way to do things. It’s cool, but not… You know…fun.

So we give up on it as a dead end and go back to the old ways for a while.

Then somebody figures out how to put (usually hard-coded) bumpers on the revolutionary new way, such that it stays fun.

Now the revolutionary new way is the new gold standard and default approach.

For other industries, replace “fun” above with the correct goal for that industry. “Profitable” is one that the AI hucksters are being careful not to say…but “honest”, “correct” and “safe” also come to mind.

We are right before the bit where we all decide it was a bad idea.

Which comes before we figure out hard-coding the bumpers can get us where we wanted, after a lot of work by really smart well paid humans.

I’ve seen industries skip the “all decide it was a bad idea” phase, and go straight to the “hard work by humans to make this fulfill the available promise” phase, but we don’t actually look on track to, today.

Many current investors are convinced that their clever talking puppet is going to do the hard work of engineering the next generation of talking puppet.

I have some faith that we can reach that milestone. I’m familiar enough with the current generation of talking puppet to confidently declare that this won’t be the time it happens.

My incentive in sharing all this is that I like over half of you reading here, and so figure I can give some of you a shot at not falling for this particular “investment phase”, which is essentially, in practical terms, a con.

Lvxferre [he/him]@mander.xyz · 4 months ago

If you’re referring to symbolic AI, I don’t think that the AI scene will turn 180° and ditch NN-based approaches. Instead what I predict is that we’ll see hybrids - where a symbolic model works as the “core” of the AI, handling the logic, and a neural network handles the input/output.

OpenAI, Google, Anthropic admit they can’t scale up their chatbots any further – Pivot to AI