Lugh@futurology.today (M) to Futurology@futurology.today · English · 10 months ago

Two-faced AI language models learn to hide deception - ‘Sleeper agents’ seem benign during testing but behave differently once deployed. And methods to stop them aren’t working.



mateomaui@reddthat.com
English · 10 months ago
Alright, I’ll be out back digging the bomb shelter.
- Possibly linux@lemmy.zip
  English · edited · 10 months ago
  It's too late for that, honestly.
  - mateomaui@reddthat.com
    English · 10 months ago
    Alright, I’ll switch to digging holes for the family burial ground.