We have to stop ignoring AI’s hallucination problem

misk@sopuli.xyz · 5 months ago

oldfemboy@lemmy.ml · 5 months ago

I remember getting gaslit about AIs lying. So glad it’s getting attention.

whoreticulture@lemmy.blahaj.zone · 5 months ago

By who? I thought this was broadly known?

SulaymanF@lemmy.world · 5 months ago

We also have to stop calling it hallucinations. The proper term in psychology for making stuff up like this is “Confabulations.”

Lmaydev@programming.dev · 5 months ago

Honestly I feel people are using them completely wrong.

Their real power is their ability to understand language and context.

Turning natural language input into commands that can be executed by a traditional software system is a huge deal.

Microsoft released an AI powered auto complete text box and it’s genius.

Currently you have to type an exact text match in an auto complete box. So if you type cats but the item is called pets you’ll get no results. Now the ai can find context based matches in the auto complete list.

This is their real power.

Also they’re amazing at generating non-factual things: stories, poems, etc.

Blue_Morpho@lemmy.world · 5 months ago

So if you type cats but the item is called pets you’ll get no results. Now the ai can find context based matches in the auto complete list.

Google added context search to Gmail and it’s infuriating. I’m looking for an exact phrase that I even put in quotes but Gmail returns a long list of emails that are vaguely related to the search word.

Lmaydev@programming.dev · 5 months ago

That is indeed a poor use. Searching traditionally first and falling back to it would make way more sense.

Blue_Morpho@lemmy.world · 5 months ago

It shouldn’t even automatically fall back. If I am looking for an exact phrase and it doesn’t exist, the result should be “nothing found”, so that I can search somewhere else for the information. A prompt, “Nothing found. Look for related information?”, would be useful.

But returning a list of related information when I need an exact result is worse than not having search at all.
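The behaviour described above (exact match first, related results only when the user opts in) can be sketched roughly as follows. This is a minimal illustration, not how Gmail works: the `search` function, the `include_related` flag, and the word-overlap "related" step are all hypothetical.

```python
def search(messages, phrase, include_related=False):
    """Return messages containing the exact phrase; related ones only on request."""
    needle = phrase.lower()
    exact = [m for m in messages if needle in m.lower()]
    if exact or not include_related:
        # Never mix related results into an exact-phrase search by default.
        return {"exact": exact, "related": []}
    # Hypothetical "related" step: any message sharing a word with the phrase.
    words = set(needle.split())
    related = [m for m in messages if words & set(m.lower().split())]
    return {"exact": [], "related": related}


inbox = ["Invoice for March attached", "Lunch on Friday?", "March planning notes"]
result = search(inbox, "march invoice")
if not result["exact"]:
    print("Nothing found. Look for related information?")
```

Calling `search(inbox, "march invoice", include_related=True)` is the opt-in step; without it the caller gets an honest empty result.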

Apytele@sh.itjust.works · 5 months ago

Yes and no. I’ve had to insert a LOT of meaning to get a story worth any substance, and I’ve had to do a lot of editing to get good images. It’s really good at giving me a figure that’s 90% done, but that last 10% touching up still often takes me a day or so of work.

hedgehogging_the_bed@lemmy.world · 5 months ago

Searching with synonym matching is almost decades old at this point. I worked on it as an undergrad in the early 2000s and it wasn’t new then, just complicated. Google’s version improved over other search algorithms for a long time and then they trashed it by letting AI take over.

Lmaydev@programming.dev · edit-2 5 months ago

Google’s algorithm has pretty much always used AI techniques.

It doesn’t have to be a synonym. That’s just an example.

Typing diabetes and getting medical services as a result wouldn’t be possible with that technique unless you had a database of every disease to search against for all queries.

The point is AI means you don’t have to have a giant lookup of linked items as it’s trained into it already.
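One common way to get this kind of context-based matching without a lookup table of linked items is to compare embedding vectors. The sketch below is only a toy: the three-dimensional vectors are made up by hand (a real system would get them from a trained model), and `TOY_VECTORS`, `cosine`, and `suggest` are hypothetical names.

```python
import math

# Hand-made toy "embeddings"; axes are roughly (animal, vehicle, food).
# Assumption: a real system would obtain these from a trained model.
TOY_VECTORS = {
    "cats": (0.9, 0.0, 0.1),
    "pets": (0.8, 0.1, 0.1),
    "cars": (0.0, 0.9, 0.0),
    "snacks": (0.1, 0.0, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def suggest(query, items, top_n=1):
    """Rank list items by similarity of meaning to the query, not by spelling."""
    q = TOY_VECTORS[query]
    ranked = sorted(items, key=lambda item: cosine(q, TOY_VECTORS[item]), reverse=True)
    return ranked[:top_n]


best = suggest("cats", ["pets", "cars", "snacks"])
```

With these made-up vectors, "cats" ranks "pets" first even though the two strings share few characters in order.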

Th4tGuyII@kbin.social · 5 months ago

Exactly. The big problem with LLMs is that they’re so good at mimicking understanding that people forget that they don’t actually have understanding of anything beyond language itself.

The thing they excel at, and should be used for, is exactly what you say - a natural language interface between humans and software.

Like in your example, an LLM doesn’t know what a cat is, but it knows what words describe a cat based on training data - and for a search engine, that’s all you need.

not_amm@lemmy.ml · 5 months ago

That’s why I only use Perplexity. ChatGPT can’t give me sources unless I pay, so I can’t trust the information it gives me. It also hallucinated a lot when coding; it was faster to search the official documentation than to correct and debug code “generated” by ChatGPT.

I use Perplexity + SearXNG, so I can search a lot faster and cite sources. It also makes summaries of your search, so it saves me time while writing introductions and so on.

It sometimes hallucinates too and cites weird sources, but it’s faster for me to correct and search for better sources given the context and more ideas. In summary, when/if you’re correcting the prompts and searching apart from Perplexity, you already have something useful.

BTW, I try not to use it a lot, but it’s way better for my workflow.

noodlejetski@lemm.ee · 5 months ago

Their real power is their ability to understand language and context.

…they do exactly none of that.

breakingcups@lemmy.world · 5 months ago

No, but they approximate it. Which is fine for most use cases the person you’re responding to described.

Lmaydev@programming.dev · 5 months ago

They do it much better than anything you can hard code currently.

Voroxpete@sh.itjust.works · 5 months ago

That’s called “fuzzy” matching, it’s existed for a long, long time. We didn’t need “AI” to do that.

Lmaydev@programming.dev · edit-2 5 months ago

No it’s not.

Fuzzy matching is a search technique that uses a set of fuzzy rules to compare two strings. The fuzzy rules allow for some degree of similarity, which makes the search process more efficient.

That allows for mistyping, etc. It doesn’t allow context-based searching at all. Cat doesn’t fuzz with pet. There is no similarity.

Also it is an AI technique itself.
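The cat/pet point can be checked with a plain string-similarity measure. A small sketch using Python's standard `difflib` (one of many fuzzy matchers, chosen here only for illustration; the 0.6 cutoff is the library default and the item list is made up):

```python
from difflib import SequenceMatcher, get_close_matches


def similarity(a, b):
    """Character-level similarity in [0, 1]; knows nothing about meaning."""
    return SequenceMatcher(None, a, b).ratio()


items = ["pets", "cat", "dogs"]

# A typo is close to the intended string, so fuzzy matching recovers it.
typo_matches = get_close_matches("catz", items, n=3, cutoff=0.6)

# "cat" and "pet" share only one letter in order, so they score low.
cat_vs_pet = similarity("cat", "pet")
concept_matches = get_close_matches("cat", ["pet", "pets"], n=3, cutoff=0.6)
```

Here `typo_matches` contains only "cat", while `concept_matches` is empty: the matcher tolerates spelling errors but has no notion that a cat is a pet.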

hedgehogging_the_bed@lemmy.world · 5 months ago

Bullshit, fuzzy matching is a lot older than this AI LLM.

SolNine@lemmy.ml · 5 months ago

The simple solution is not to rely upon AI. It’s like a misinformed relative after a jar of moonshine, they might be right some of the time, or they might be totally full of shit.

I honestly don’t know why people are obsessed with relying on AI, is it that difficult to look up the answer from a reliable source?

force@lemmy.world · edit-2 5 months ago

is it that difficult to look up the answer from a reliable source?

With the current state of search engines and their content (almost completely unrelated garbage and shitty blogs made in like 3 minutes with 1/4 of the content poorly copy-pasted out of context from stackoverflow and most of the rest being pop-ups and ads), YES

SEO ““engineers”” deserve the guillotine

sebinspace@lemmy.world · 5 months ago

If it keeps me from going to stack and interacting with those degenerates, yes

funkless_eck@sh.itjust.works · 5 months ago

because some jobs have to produce a bunch of bullshit text that no one will read quickly, or else parse a bunch of bullshit text for a single phrase in the midst of it all and put it in a report.

ZILtoid1991@lemmy.world · 5 months ago

Sites like that can be blacklisted with web browser plugins. That vastly improved my DuckDuckGo experience for a while, but it’s a Whack-A-Mole game from both sides, and yet again my searches are littered with SEO garbage at best, and AI-generated SEO garbage full of made-up stuff at worst.

Queen HawlSera@lemm.ee · 5 months ago

The AI isn’t alive, it’s not hallucinating… We will likely never have true AI until we figure out the Hard Problem.

Wirlocke@lemmy.blahaj.zone · 5 months ago

I’m a bit annoyed at all the people being pedantic about the term hallucinate.

Programmers use preexisting concepts as allegory for computer concepts all the time.

Your file isn’t really a file, your desktop isn’t a desk, your recycling bin isn’t a recycling bin.

[Insert the entirety of Object Oriented Programming here]

Neural networks aren’t really neurons, genetic algorithms aren’t really genetics, and the LLM isn’t really hallucinating.

But it easily conveys what the bug is. It only personifies the LLM because the English language almost always personifies the subject. The moment you apply a verb to an object you imply it performed an action, unless you limit yourself to esoteric words/acronyms or you use several words to overexplain every time.

ZILtoid1991@lemmy.world · 5 months ago

They’re nowadays using it to humanize neural networks, and thus oversell their capabilities.

calcopiritus@lemmy.world · 5 months ago

It’s easily the worst problem of Lemmy. Sometimes one guy has an issue with something and suddenly the whole thread is about that thing, as if everyone thought about it. No, you didn’t think about it, you just read another person’s comment and made another one instead of replying to it.

I never heard anyone complain about the term “hallucination” for AIs, but suddenly in this one thread there are 100 clone comments instead of a single upvoted one.

I get it, you don’t like “hallucinate”, just upvote the existing comment about it and move on. If you have anything to add, reply to that comment.

I don’t know why this specific thing is so common on Lemmy though, I don’t think it happened in reddit.

emptiestplace@lemmy.ml · 5 months ago

I don’t know why this specific thing is so common on Lemmy though, I don’t think it happened in reddit.

When you’re used to knowing a lot relative to the people around you, learning to listen sometimes becomes optional.

ZILtoid1991@lemmy.world · 5 months ago

“Hallucination” pretty well describes my opinion on AI generated “content”. I think all of their generation is a hallucination at best.

Garbage in, garbage out.

abrinael@lemmy.world · edit-2 5 months ago

What I don’t like about it is that it makes it sound more benign than it is. Which also points to who decided to use that term - AI promoters/proponents.

Edit: it’s like all of the bills/acts in congress where they name them something like “The Protect Children Online Act” and you ask, “well, what does it do?” And they say something like, “it lets local police read all of your messages so they can look for any dangers to children.”

Wirlocke@lemmy.blahaj.zone · edit-2 5 months ago

In terms of LLM hallucination, it feels like the name very aptly describes the behavior and severity. It doesn’t downplay what’s happening because it’s generally accepted that having a source of information hallucinate is bad.

I feel like the alternatives would downplay the problem. A “glitch” is generic and common, “lying” is just inaccurate since that implies intent to deceive, and just being “wrong” doesn’t get across how elaborately wrong an LLM can be.

Hallucination fits pretty well and is also pretty evocative. I doubt that AI promoters want to effectively call their product schizophrenic, which is what most people think when hearing hallucination.

Ultimately all the sciences are full of analogous names to make conversations easier, it’s not always marketing. No different than when physicists say particles have “spin” or “color” or that spacetime is a “fabric” or [insert entirety of String theory]…

abrinael@lemmy.world · edit-2 5 months ago

After thinking about it more, I think the main issue I have with it is that it sort of anthropomorphises the AI, which is more of an issue in applications where you’re trying to convince the consumer that the product is actually intelligent. (Edit: in the human sense of intelligence rather than what we’ve seen associated with technology in the past.)

You may be right that people could have a negative view of the word “hallucination”. I don’t personally think of schizophrenia, but I don’t know what the majority think of when they hear the word.

zalgotext@sh.itjust.works · 5 months ago

The term “hallucination” has been used for years in AI/ML academia. I was reading about AI hallucinations ten years ago when I was in college. The term was originally coined by researchers and mathematicians, not the snake oil salesmen pushing AI today.

abrinael@lemmy.world · 5 months ago

I had no idea about this. I studied neural networks briefly over 10 years ago, but hadn’t heard the term until the last year or two.

KeenFlame@feddit.nu · 5 months ago

We were talking about when it was coined, not when you heard it first

MentalEdge@sopuli.xyz · 5 months ago

Altman going “yeah we could make it get things right 100% of the time, but that would be boring” has such “my girlfriend goes to another school” energy it’s not even funny.

AutoTL;DR@lemmings.world · 5 months ago

This is the best summary I could come up with:

All of Silicon Valley — of Big Tech — is focused on taking large language models and other forms of artificial intelligence and moving them from the laptops of researchers into the phones and computers of average people.

But if I type “show me a picture of Alex Cranz” into the prompt window, Meta AI inevitably returns images of very pretty dark-haired men with beards.

Earlier this year, ChatGPT had a spell and started spouting absolute nonsense, but it also regularly makes up case law, leading to multiple lawyers getting into hot water with the courts.

In a commercial for Google’s new AI-ified search engine, someone asked how to fix a jammed film camera, and it suggested they “open the back door and gently remove the film.” That is the easiest way to destroy any photos you’ve already taken.

An AI’s difficult relationship with the truth is called “hallucinating.” In extremely simple terms: these machines are great at discovering patterns of information, but in their attempt to extrapolate and create, they occasionally get it wrong.

This idea that there’s a kind of unquantifiable magic sauce in AI that will allow us to forgive its tenuous relationship with reality is brought up a lot by the people eager to hand-wave away accuracy concerns.

The original article contains 1,211 words, the summary contains 212 words. Saved 82%. I’m a bot and I’m open source!

Zier@fedia.io · 5 months ago

AI making things up? So someone finally invented an electronic replacement for politicians.

Hugin@lemmy.world · 5 months ago

Prisencolinensinainciusol is an Italian song that is complete gibberish but made to sound like an English language song. That’s what AI is right now.

https://www.youtube.com/watch?v=RObuKTeHoxo

Flying Squid@lemmy.world · 5 months ago

The Italians actually have a name for that kind of gibberish talking that sounds real. I did some VO work on a project being directed by an Italian guy and he explained what he wanted me to do by explaining the term to me first. I’m afraid it’s been way too long since he told me for me to remember it though.

Another example would be the La Linea cartoons, where the main character speaks a gibberish which seems to approximate Italian to my ears.

https://www.youtube.com/watch?v=ldff__DwMBc

PipedLinkBot@feddit.rocks · 5 months ago

Here is an alternative Piped link(s):

https://www.piped.video/watch?v=RObuKTeHoxo

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source; check me out at GitHub.

5 months ago

AI’s real power is its ability to use tools and understand context from existing tools. For a FOSS tool that uses an LLM to do web searches and generate accurate (not guaranteed) results, try my tool https://github.com/muntedcrocodile/Sydney

DdCno1@kbin.social · 5 months ago

Are you seriously trying to push your ChatGPT “tool” in response to an article about language models like this one having substantial issues? “Not guaranteed” - yes, obviously, that’s the point of the article - and from a quick look at your code, I don’t see how this nonsense addresses any of that.

5 months ago

It’s not chatgpt, that’s just the default config; u can use the API endpoint to point to any chatgpt api compatible llm. It can utilise ddg to search for web results, then gives u an answer based on that. Most importantly it shows u the full log and u get to read it as it happens, like bingAI but transparent, so checking it is right there. I’ll add some screenshots to the readme.

DdCno1@kbin.social · 5 months ago

It’s not chatgpt, that’s just the default config; u can use the API endpoint to point to any chatgpt api compatible llm.

Since the issue with hallucinations is shared by all LLMs, not just ChatGPT, this doesn’t change anything.

5 months ago

It’s transparent in its operation, allows u to see what it’s thinking and catch errors, and lets u use ur own fine tuning that isn’t censored.

OozingPositron@feddit.cl · 5 months ago

>The verge

Don’t take away the hallucinations, how am I supposed to do ERP with the models then?

Cyberflunk@lemmy.world · 5 months ago

Holy shit. Dunning Kruger is fully engaged in these post comments

Voroxpete@sh.itjust.works · 5 months ago

We not only have to stop ignoring the problem, we need to be absolutely clear about what the problem is.

LLMs don’t hallucinate wrong answers. They hallucinate all answers. Some of those answers will happen to be right.

If this sounds like nitpicking or quibbling over verbiage, it’s not. This is really, really important to understand. LLMs exist within a hallucinatory false reality. They do not have any comprehension of the truth or untruth of what they are saying, and this means that when they say things that are true, they do not understand why those things are true.

That is the part that’s crucial to understand. A really simple test of this problem is to ask ChatGPT to back up an answer with sources. It fundamentally cannot do it, because it has no ability to actually comprehend and correlate factual information in that way. This means, for example, that AI is incapable of assessing the potential veracity of the information it gives you. A human can say “That’s a little outside of my area of expertise,” but an LLM cannot. It can only be coded with hard blocks in response to certain keywords to cut it from answering and insert a stock response.

This distinction, that AI is always hallucinating, is important because of stuff like this:

But notice how Reid said there was a balance? That’s because a lot of AI researchers don’t actually think hallucinations can be solved. A study out of the National University of Singapore suggested that hallucinations are an inevitable outcome of all large language models. **Just as no person is 100 percent right all the time, neither are these computers.**

That is some fucking toxic shit right there. Treating the fallibility of LLMs as analogous to the fallibility of humans is a huge, huge false equivalence. Humans can be wrong, but we’re wrong in ways that allow us the capacity to grow and learn. Even when we are wrong about things, we can often learn from how we are wrong. There’s a structure to how humans learn and process information that allows us to interrogate our failures and adjust for them.

When an LLM is wrong, we just have to force it to keep rolling the dice until it’s right. It cannot explain its reasoning. It cannot provide proof of work. I work in a field where I often have to direct the efforts of people who know more about specific subjects than I do, and part of how you do that is you get people to explain their reasoning, and you go back and forth testing propositions and arguments with them. You say “I want this, what are the specific challenges involved in doing it?” They tell you it’s really hard, you ask them why. They break things down for you, and together you find solutions. With an LLM, if you ask it why something works the way it does, it will commit to the bit and proceed to hallucinate false facts and false premises to support its false answer, because it’s not operating in the same reality you are, nor does it have any conception of reality in the first place.

UnpluggedFridge@lemmy.world · 5 months ago

I think where you are going wrong here is assuming that our internal perception is not also a hallucination by your definition. It absolutely is. But our minds are embodied, thus we are able to check these hallucinations against some outside stimulus. Your gripe that current LLMs are unable to do that is really a criticism of the current implementations of AI, which are trained on some data, frozen, then restricted from further learning by design. Imagine if your mind was removed from all stimulus and then tested. That is what current LLMs are, and I doubt we could expect a human mind to behave much better in such a scenario. Just look at what happens to people cut off from social stimulus; their mental capacities degrade rapidly and that is just one type of stimulus.

Another problem with your analysis is that you expect the AI to do something that humans cannot do: cite sources without an external reference. Go ahead right now and from memory cite some source for something you know. Do not Google search, just remember where you got that knowledge. Now who is the one that cannot cite sources? The way we cite sources generally requires access to the source at that moment. Current LLMs do not have that by design. Once again, this is a gripe with implementation of a very new technology.

The main problem I have with so many of these “AI isn’t really able to…” arguments is that no one is offering a rigorous definition of knowledge, understanding, introspection, etc in a way that can be measured and tested. Further, we just assume that humans are able to do all these things without any tests to see if we can. Don’t even get me started on the free will vs illusory free will debate that remains unsettled after centuries. But the crux of many of these arguments is the assumption that humans can do it and are somehow uniquely able to do it. We had these same debates about levels of intelligence in animals long ago, and we found that there really isn’t any intelligent capability that is uniquely human.

mindlesscrollyparrot@discuss.tchncs.de · 5 months ago

This seems to be a really long way of saying that you agree that current LLMs hallucinate all the time.

I’m not sure that the ability to change in response to new data would necessarily be enough. They cannot form hypotheses and, even if they could, they have no way to test them.

UnpluggedFridge@lemmy.world · 5 months ago

My thesis is that we are asserting the lack of human-like qualities in AIs that we cannot define or measure. Assertions should be made on data, not uneasy feelings arising when an LLM falls into the uncanny valley.

mindlesscrollyparrot@discuss.tchncs.de · 5 months ago

But we do know how they operate. I saw a post a while back where somebody asked the LLM how it was calculating (incorrectly) the date of Easter. It answered with the formula for the date of Easter. The only problem is that that was a lie. It doesn’t calculate. You or I can perform long multiplication if asked to, but the LLM can’t (ironically, since the hardware it runs on is far better at multiplication than we are).
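For reference, the date of Easter really can be computed step by step with integer arithmetic. A minimal sketch of the anonymous Gregorian algorithm (also known as the Meeus/Jones/Butcher algorithm) follows; which formula the LLM in that post actually recited is not stated, so the choice of algorithm and the function name `gregorian_easter` are assumptions made for illustration.

```python
def gregorian_easter(year):
    """Return (month, day) of Western Easter Sunday for a Gregorian year."""
    a = year % 19                      # position in the 19-year lunar cycle
    b, c = divmod(year, 100)           # century and year within the century
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return month, day + 1


print(gregorian_easter(2024))  # (3, 31), i.e. 31 March 2024
```

Every line is an actual calculation a program (or a person with paper) carries out, which is the contrast being drawn with a model that only recites the formula.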

5gruel@lemmy.world · 5 months ago

I’m not convinced about the “a human can say ‘that’s a little outside my area of expertise’, but an LLM cannot.” I’m sure there are a lot of examples in the training data set that contains qualification of answers and expression of uncertainty, so why would the model not be able to generate that output? I don’t see why it would require an “understanding” for that specifically. I would suspect that better human reinforcement would make such answers possible.

dustyData@lemmy.world · 5 months ago

Because humans can do introspection and think and reflect about our own knowledge against the perceived expertise and knowledge of other humans. There’s nothing in LLMs capable of doing this. An LLM cannot assess its own state, and even if it could, it has nothing to contrast it to. You cannot develop the concept of ignorance without an other to interact and compare with.

dustyData@lemmy.world · 5 months ago

This right here is also the reason why AI fanboys get angry when they are told that LLMs are not intelligent or even thinking at all. They don’t understand that in order for rational intelligence to exist, the LLMs should be able to have an internal, referential inner world of symbols, to contrast external input (training data) against and that is also capable of changing and molding to reality and truth criteria. No, tokens are not what I’m talking about. I’m talking about an internally consistent and persistent representation of the world. An identity, which is currently antithetical with the information model used to train LLMs. Let me try to illustrate.

I don’t remember the details or technical terms but essentially, animal intelligence needs to experience a lot of things first hand in order to create an individualized model of the world which is used to direct behavior (language is just one form of behavior after all). This is very slow and labor intensive, but it means that animals are extremely good, when they get good, at adapting said skills to a messy reality. LLMs are transactional, they rely entirely on the correlation of patterns of input to itself. As a result they don’t need years of experience, like humans for example, to develop skilled intelligent responses. They can do it in hours of sensing training input instead. But at the same time, they can never be certain of their results, and when faced with reality, they crumble because it’s harder for it to adapt intelligently and effectively to the mess of reality.

LLMs are a solipsism experiment. A child is locked in a dark cave with nothing but a dim light and millions of pages of text (assume immortality and no need for food or water). As there is nothing else to do but look at the text, they eventually develop the ability to understand how the symbols marked on the text relate to each other, how they are usually and typically assembled one next to the other. One day, a slit on a wall opens and the person receives a piece of paper with a prompt, a pencil and a blank page. Out of boredom, the person looks at the prompt, recognizes the symbols and the pattern, and starts assembling the symbols on the blank page with the pencil. They are just trying to continue from the prompt what they think would typically follow or should follow afterwards. The slit in the wall opens again, and the person intuitively pushes the paper they just wrote into the slit.

For the people outside the cave, leaving prompts and receiving the novel piece of paper, it would look like an intelligent linguistic construction: it is grammatically correct, and the sentences are correctly punctuated and structured. The words even make sense and it says intelligent things in accordance with the training text left inside and the prompt given. But once in a while it seems to hallucinate weird passages. They miss the point that it is not hallucinating, it just has no sense of reality. Their reality is just the text. When the cave is opened and the person trapped inside is let out into the light of the world, they would still be profoundly ignorant about it. When given the word sun, written on a piece of paper, they would have no idea that the word refers to the bright burning ball of gas above them. They would know the word, they would know how it is usually used to assemble text next to other words. But they won’t know what it is.

LLMs are just like that, they just aren’t actually intelligent like the person in this thought experiment. Because there’s no way, currently, for these LLMs to actually sense and correlate the real world, or several sources of sensors, into a mentalese internal model. This is currently the crux and the biggest problem in the field of AI as I understand it.

UnpluggedFridge@lemmy.world · 5 months ago

How do hallucinations preclude an internal representation? Couldn’t hallucinations arise from a consistent internal representation that is not fully aligned with reality?

I think you are misunderstanding the role of tokens in LLMs and conflating them with internal representation. Tokens are used to generate a state, similar to external stimuli. The internal representation, assuming there is one, is the manner in which the tokens are processed. You could say the same thing about human minds, that the representation is not located anywhere like a piece of data; it is the manner in which we process stimuli.

dustyData@lemmy.world · edit-2 5 months ago

Not really. Reality is mostly a social construction. If there’s not an other to check and bring about meaning, there is no reality, and therefore no hallucinations. More precisely, everything is a hallucination. As we cannot cross reference reality with LLMs and it cannot correct itself to conform to our reality. It will always hallucinate and it will only coincide with our reality by chance.

I’m not conflating tokens with anything, I explicitly said they aren’t an internal representation. They’re state and nothing else. LLMs don’t have an internal representation of reality. And they probably can’t given their current way of working.

UnpluggedFridge@lemmy.world · edit-2 4 months ago

You seem pretty confident that LLMs cannot have an internal representation simply because you cannot imagine how that capability could emerge from their architecture. Yet we have the same fundamental problem with the human brain and have no problem asserting that humans are capable of internal representation. LLMs adhere to grammar rules, present information with a logical flow, express relationships between different concepts. Is this not evidence of, at the very least, an internal representation of grammar?

We take in external stimuli and perform billions of operations on them. This is internal representation. An LLM takes in external stimuli and performs billions of operations on them. But the latter is incapable of internal representation?

And I don’t buy the idea that hallucinations are evidence that there is no internal representation. We hallucinate. An internal representation does not need to be “correct” to exist.

dustyData@lemmy.world · edit-2 4 months ago

Yet we have the same fundamental problem with the human brain

And LLMs aren’t human brains, they don’t even work remotely similarly. An LLM has more in common with an Excel spreadsheet than with a neuron. Read on the learning models and pattern recognition theories behind LLMs, they are explicitly designed to not function like humans. So we cannot assume that the same emergent properties exist on an LLM.

UnpluggedFridge@lemmy.world · 4 months ago

Nor can we assume that they cannot have the same emergent properties.

dustyData@lemmy.world · 4 months ago

That’s not how science works. You are the one claiming it does, you have the burden of proof to prove they have the same properties. Thus far, assuming they don’t as they aren’t human is the sensible rational route.

Cyberflunk@lemmy.world · 5 months ago

Wtf are you even talking about.

UnsavoryMollusk@lemmy.world · edit-2 5 months ago

They are right though. LLM at their core are just about determining what is statistically the most probable to spit out.
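The "statistically most probable next thing" idea can be shown with a toy word-pair (bigram) model. This is a deliberately tiny sketch: real LLMs use neural networks over tokens and usually sample rather than always taking the top choice, and the corpus and function names here are made up.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def most_probable_next(model, word):
    """Return the most frequent follower of `word`, or None if never seen."""
    counts = model.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(model, start, length=5):
    """Greedily extend `start` by repeatedly picking the likeliest next word."""
    out = [start]
    for _ in range(length):
        nxt = most_probable_next(model, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(corpus)
print(generate(model, "on", length=2))  # on the cat
```

The model continues "on" with "the cat" purely because that sequence is frequent in its text; nothing in it checks whether the output is true.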

Cyberflunk@lemmy.world · 5 months ago

Your 1 sentence makes more sense than the slop above.

Aceticon@lemmy.world · 5 months ago

That’s an excellent metaphor for LLMs.

feedum_sneedson@lemmy.world · 5 months ago

It’s the Chinese room thought experiment.

Aceticon@lemmy.world · 5 months ago

Hadn’t heard about it before (or maybe I did but never looked into it), so I just went and found it in Wikipedia and will be reading all about it.

So thanks for the info!

feedum_sneedson@lemmy.world · 5 months ago

No worries. The person above did a good job explaining it although they kind of mashed it together with the imagery from Plato’s allegory of the cave.

Hello Hotel@lemmy.world · 5 months ago

Usually, what I see is that the REPL they are using is never introspective enough. The AI can’t on its own revert to a previous state or give notes to itself, because the response being fast and in linear time matters for a chatbot. ChatGPT can make really cool stuff when you ask it to break its thought process into steps, even for things it usually fails spectacularly at. It was like pulling teeth to get it to actually do the steps and not just give the bad answer anyway.

EatATaco@lemm.ee · 5 months ago

they do not understand why those things are true.

Some researchers compared the results of questions between chat gpt 3 and 4. One of the questions was about stacking items in a stable way. Chat gpt 3 just, in line with what you are saying about “without understanding”, listed the items saying to place them one on top of each other. No way it would have worked.

Chat gpt 4, however, said that you should put the book down first, put the eggs in a 3 x 3 grid on top of the book, trap them in a way with a laptop so they don’t roll around, and then put the bottle on top of the laptop standing up, and then balance the nail on the top of it…even noting you have to put the flat end of the nail down. This sounds a lot like understanding to me and not just rolling the dice hoping to be correct.

Yes, AI confidently gets stuff wrong. But let’s all note that there is a whole subreddit dedicated to people being confidently wrong. One doesn’t need to go any further than Lemmy to see people confidently claiming to know the truth about shit they should know is outside of their actual knowledge. We’re all guilty of this. Including refusing to learn when we are wrong. Additionally, the argument that they can’t learn doesn’t make sense because models have definitely become better.

Now I’m not saying ai is conscious, I really don’t know, but all of your shortcomings you’ve listed humans are guilty of too. So using them as examples of why its output is always just a hallucination, while our thoughts are not, doesn’t seem to hold much water to me.

insaan@leftopia.org · 5 months ago

the argument that they can’t learn doesn’t make sense because models have definitely become better.

They have to be either trained with new data or their internal structure has to be improved. It’s an offline process, meaning they don’t learn through chat sessions we have with them (if you open a new session it will have forgotten what you told it in a previous session), and they can’t learn through any kind of self-directed research process like a human can.

all of your shortcomings you’ve listed humans are guilty of too.

LLMs are sophisticated word generators. They don’t think or understand in any way, full stop. This is really important to understand about them.

EatATaco@lemm.ee · 5 months ago

They have to be either trained with new data or their internal structure has to be improved. It’s an offline process, meaning they don’t learn through chat sessions we have with them (if you open a new session it will have forgotten what you told it in a previous session), and they can’t learn through any kind of self-directed research process like a human can.

Most human training is done through the guidance of another. Additionally, most of this training is done through an automated process where some computer is just churning through data. And while you are correct that the context does not exist from one session to the next, you can in fact teach it something and it will maintain it during the session. It’s just that moving to a new session is like talking to a completely different person, and you’re basically arguing “well, I explained this one thing to another human, and this human doesn’t know it. . .so how can you claim it’s thinking?” And just imagine the disaster that would happen if you would just allow it to be trained by anyone on the web. It would be spitting out memes, racism, and right wing propaganda within days. lol

They don’t think or understand in any way, full stop.

I just gave you an example where this appears to be untrue. There is something that looks like understanding going on. Maybe it’s not, I’m not claiming to know, but I have not seen a convincing argument as to why. Saying “full stop” instead of an actual argument as to why just indicates to me that you are really saying “stop thinking.” And I apologize but that’s not how I roll.

insaan@leftopia.org · edit-2 5 months ago

Most human training is done through the guidance of another

Let’s take a step back and not talk about training at all, but about spontaneous learning. A baby learns about the world around it by experiencing things with its senses. They learn a language, for example, simply by hearing it and making connections - getting corrected when they’re wrong, yes, but they are not trained in language until they’ve already learned to speak it. And once they are taught how to read, they can then explore the world through signs, books, the internet, etc. in a way that is often self-directed. More than that, humans are learning at every moment as they interact with the world around them and with the written word.

An LLM is a static model created through exposure to lots and lots of text. It is trained and then used. To add to the model requires an offline training process, which produces a new version of the model that can then be interacted with.

you can in fact teach it something and it will maintain it during the session

It’s still not learning anything. LLMs have what’s known as a context window that is used to augment the model for a given session. It’s still just text that is used as part of the response process.
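A toy sketch of that distinction, under loud assumptions: `FrozenModel` and `Session` are made-up names, and the "model" here is a trivial string lookup rather than a real LLM. The only point is where the state lives: the session's text grows, the model does not change.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FrozenModel:
    """Stand-in for trained weights; nothing here changes during a chat."""
    version: str

    def reply(self, context: list[str], prompt: str) -> str:
        # A real model would generate text conditioned on context + prompt.
        # This toy only searches the text it was handed for a stated name.
        marker = "my name is "
        for line in reversed(context + [prompt]):
            idx = line.lower().find(marker)
            if idx != -1:
                name = line[idx + len(marker):].strip(" .")
                return f"Your name is {name}."
        return "I don't know your name."


@dataclass
class Session:
    """Per-chat state: just accumulated text, passed back in on every turn."""
    model: FrozenModel
    context: list[str] = field(default_factory=list)

    def ask(self, prompt: str) -> str:
        answer = self.model.reply(self.context, prompt)
        self.context += [prompt, answer]  # the session grows; the model does not
        return answer
```

Two `Session` objects sharing one `FrozenModel` behave the way a new chat does: the second has no access to what the first was told, because that information only ever existed in the first session's `context`.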

They don’t think or understand in any way, full stop.

I just gave you an example where this appears to be untrue. There is something that looks like understanding going on.

You seem to have ignored the preceding sentence: “LLMs are sophisticated word generators.” This is the crux of the matter. They simply do not think, much less understand. They are simply taking the text of your prompts (and the text from the context window) and generating more text that is likely to be relevant. Sentences are generated word-by-word using complex math (heavy on linear algebra and probability) where the generation of each new word takes into account everything that came before it, including the previous words in the sentence it’s a part of. There is no thinking or understanding whatsoever.
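For a feel of what word-by-word generation means, here is a deliberately tiny sketch. It is a big simplification and an assumption on my part: it only counts which word follows which (a bigram table) and looks at the single previous word, whereas a real LLM uses a trained neural network conditioned on everything that came before. The sampling loop has the same shape, though: pick the next word from a probability distribution, append it, repeat.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict[str, Counter]:
    """Count, for each word, which words follow it and how often."""
    counts: dict[str, Counter] = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts: dict[str, Counter], start: str, max_words: int,
             rng: random.Random) -> list[str]:
    """Extend `start` one word at a time, sampling in proportion to the counts."""
    out = [start]
    for _ in range(max_words - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # no word was ever seen after this one
        words = list(followers)
        weights = [followers[w] for w in words]
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return out


corpus = "the cat sat on the mat and the cat ate the fish"
print(" ".join(generate(train_bigrams(corpus), "the", 8, random.Random(0))))
```

Nothing in this loop checks whether the output is true; it only checks what tended to follow what in the training text.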

This is why Voroxpete@sh.itjust.works said in the original post to this thread, “They hallucinate all answers. Some of those answers will happen to be right.” LLMs have no way of knowing if any of the text they generate is accurate for the simple fact that they don’t know anything at all. They have no capacity for knowledge, understanding, thought, or reasoning. Their models are simply complex networks of words that are able to generate more words, usually in a way that is useful to us. But often, as the hallucination problem shows, in ways that are completely useless and even harmful.

EatATaco@lemm.ee · 5 months ago

An LLM is a static model created through exposure to lots and lots of text. It is trained and then used. To add to the model requires an offline training process, which produces a new version of the model that can then be interacted with.

But this is a deliberate decision, not an inherent limitation. The model could get feedback from the outside world; in fact this is how it’s trained (well, data is fed back into the model to update it). Of course we are limiting it to words, rather than the whole slew of inputs that a human gets. But keep in mind we have things like music and image generation AI as well. So it’s not like it can’t also be trained on these things. Again, deliberate decision rather than inherent limitation.

We both even agree it’s true that it can learn from interacting with the world, you just insist that because it isn’t persisting, that doesn’t actually count. But it does persist, just not the new inputs from users. And this is done deliberately to protect the models from what would inevitably happen. That being said, it’s also been fed arguably more input than a human would get in their whole life, just condensed into a much smaller period of time. So if it’s “total input” then the AI is going to win, hands down.

You seem to have ignored the preceding sentence: “LLMs are sophisticated word generators.”

I’m not ignoring this. I understand that it’s the whole argument, it gets repeated around here enough. Just saying it doesn’t make it true, however. It may be true, again I’m not sure, but simply stating and saying “full stop” doesn’t amount to a convincing argument.

They simply do not think, much less understand.

It’s not as open and shut as you wish it to be. If anyone is ignoring anything here, it’s you ignoring the fact that it went from basically just, as you said, randomly stacking objects it was told to stack stably, to actually doing so in a way that could work and describing why you would do it that way. Additionally there is another case where they asked Chat gpt 4 to draw a unicorn using an obscure programming language. And you know what? It did it. It was rudimentary, but it was clearly a unicorn. This is something that wasn’t trained on images at all. They even messed with the code, turning the unicorn around and removing the horn, fed it back in, and then asked it to replace the horn, and it put it back on correctly. It seemed to understand not only what a unicorn looked like, but which part was the horn and where it should go when it was removed.

So to say it just can “generate more words” is something you can accuse us of as well, or possibly even just overly reductive of what it’s capable of even now.

But often, as the hallucination problem shows, in ways that are completely useless and even harmful.

There are all kinds of problems with human memory, where we imagine things all of the time. Have you ever taken acid? If so, you would see how unreliable our brains are at always interpreting reality. And you want to really trip? Eyewitness testimony is basically garbage. I exaggerate a bit, but there are so many flaws with it, with people remembering things that didn’t happen, and it’s so easy to create false memories, that it’s not as convincing as it should be. Hell, it can even be harmful by convicting an innocent person.

Every shortcoming you’ve used to claim AI isn’t real thinking is something shared with us. It might just be inherent to intelligence to be wrong sometimes.

feedum_sneedson@lemmy.world · 5 months ago

It’s exciting either way. Maybe it’s equivalent to a certain lobe of the brain, and we’re judging it for not being integrated with all the other parts.

KeenFlame@feddit.nu · 5 months ago

You are just wrong

AstralPath@lemmy.ca · 5 months ago

A source link to what you’re referring to would be nice.

EatATaco@lemm.ee · 5 months ago

https://www.businessinsider.com/chatgpt-open-ai-balancing-task-convinced-microsoft-agi-closer-2023-5

el_bhm@lemm.ee · 5 months ago

They do not have any comprehension of the truth or untruth of what they are saying, and this means that when they say things that are true, they do not understand why those things are true.

Which can be beautifully exploited with sponsored content.

See Google I/O '24.

KeenFlame@feddit.nu · 5 months ago

Very long layman take. Why is your guesstimation so incredibly crucial to understand, then the next thing important to understand, then really, really important to understand, over and over, when you are not an expert?

JackGreenEarth@lemm.ee · 5 months ago

Don’t let perfect be the enemy of good.

androogee (they/she)@midwest.social · 5 months ago

“good”

Ibaudia@lemmy.world · 5 months ago

Generative AI is hardly “good” yet, either morally or as a product.

KeenFlame@feddit.nu · 5 months ago

Yes wow it’s so not an impressive discovery at all, obviously for sure 1000%

JackGreenEarth@lemm.ee · 5 months ago

It’s very useful as a product, for creating images, stories, poems, code. Just because you don’t use it doesn’t mean it isn’t good.

Ibaudia@lemmy.world · 5 months ago

I use it for sure, I even pay for Gemini for its creative writing capabilities, but most LLMs are bad at many tasks they’re advertised to be good at (coding being one of those things), plus they’re largely based on stolen work and/or copyright infringement. They don’t reliably do what they’re claiming, and they are unethically developed. Hence, they’re bad products, just objectively.

JackGreenEarth@lemm.ee · 5 months ago

Are you also against copyright infringement when it’s piracy? Because I’m not, so it would be inconsistent to judge open source generative AI models for doing something I don’t consider wrong.

Ibaudia@lemmy.world · 5 months ago

I consider piracy wrong when companies are stealing from creatives (like authors whose books are included with no credit or royalties) for the purposes of profit. I don’t believe all piracy is always good full stop. I believe piracy is ethical if it allows for preservation of content that may otherwise not be preserved or maintained.

Also that was just one of my points lol. Most LLMs are still just bad at what they are claiming to be able to do.