We have to stop ignoring AI’s hallucination problem

misk@sopuli.xyz · 6 months ago

CaptainSpaceman@lemmy.world · 6 months ago

It’s not just a calculator though.

Image generation requires no fact checking whatsoever, and some of the tools can do it well.

That said, LLMs will always have limitations and true AI is still a ways away.

elephantium@lemmy.world · 6 months ago

Image generation requires no fact checking whatsoever

Sure it does. Let’s say IKEA wants to use Midjourney to generate images for its furniture assembly instructions. The instructions are already written, so the prompt is something like “step 3 of assembling the BorkBork kitchen table”.

Would you just auto-insert whatever it generated and send it straight to the printer for 20000 copies?

Or would you look at the image and make sure that it didn’t show a couch instead?

If you choose the latter, that’s fact checking.

That said, LLMs will always have limitations and true AI is still a ways away.

I can’t agree more strongly with this point!

sudneo@lemm.ee · 6 months ago

It does require fact-checking. You might ask for a human and get someone with 10 fingers on one hand; you might ask for people in the background and get blobs merged into each other. Fact-checking images is absolutely necessary, and it consists of verifying that the generated image adheres to your prompt and that the objects in it match their intended real counterparts.

I do agree that it’s a different type of fact checking, but that’s because an image is not inherently correct or wrong; it is only one or the other when compared to your prompt and (where applicable) to reality.

pixel_prophet@lemm.ee · 6 months ago

The biggest disappointment in the image generation capabilities was the realisation that there is no object permanence in terms of the components making up an image. For any specificity you’re just playing whack-a-mole, with each iteration introducing other undesirable shit no matter how specific you make your prompts.

They are also now heavily nerfing the models to avoid lawsuits by just ignoring anything relating to specific styles that may be considered trademarks. The problem is that those are often industry jargon, so now you’re having to craft more convoluted prompts and get more mid results.