Background: This Nomic blog article from September 2023 promises better performance in GPT4All for AMD graphics card owners.

Run LLMs on Any GPU: GPT4All Universal GPU Support

Likewise on GPT4All’s GitHub page.

September 18th, 2023: Nomic Vulkan launches supporting local LLM inference on NVIDIA and AMD GPUs.

Problem: In GPT4All, under Settings > Application Settings > Device, I’ve selected my AMD graphics card, but I’m seeing no improvement over CPU performance. In both cases (AMD graphics card or CPU), it crawls along at about 4-5 tokens per second. The interaction in the screenshot below took 174 seconds to generate the response.

Question: Do I have to use a specific model to benefit from this advancement? Do I need to install a different AMD driver? What steps can I take to troubleshoot this?

Sorry if this is an obvious question. Sometimes I feel like the answer is right in front of me, but I'm unsure which keywords in the documentation should jump out at me.
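In case it's useful for troubleshooting, would something like this be a reasonable first sanity check that Vulkan can even see the card on the host? (This assumes the vulkan-tools package, and optionally radeontop, are installed; I haven't verified that either.)

```
# Confirm the Vulkan loader enumerates the Radeon card on the host
vulkaninfo --summary | grep -i devicename

# Optionally, watch GPU activity while a prompt is generating
radeontop
```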

My system info:

  • GPU: Radeon RX 6750 XT
  • CPU: Ryzen 7 5800X3D
  • RAM: 32 GB @ 3200 MHz
  • OS: Linux Bazzite
  • I’ve installed GPT4All as a flatpak
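Since I'm on the Flatpak build, I'm also wondering whether the sandbox exposes the GPU device at all. Here's my rough guess at how to check, assuming the Flathub app ID is io.gpt4all.gpt4all (I haven't confirmed that ID):

```
# Show what the sandbox is allowed to access (look for "devices: dri" or "devices: all")
flatpak info --show-permissions io.gpt4all.gpt4all

# If GPU device access is missing, grant it for this app only
flatpak override --user --device=dri io.gpt4all.gpt4all
```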
  • yo_scottie_oh@lemmy.ml (OP):

    Thanks for the info—maybe I’ll give this another whirl when I have some more time.

    Which card are you running on?

    • pebbles@sh.itjust.works:

      I use the 24GB 7900 XTX.

      I wonder why ROCm 6.4 doesn’t support your card but ROCm 6.3 does. Maybe there’s a way to downgrade. Also, that override_gfx environment variable may be enough to get 6.4 working for you. Not sure, though.
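      If the variable I’m thinking of is HSA_OVERRIDE_GFX_VERSION, a minimal sketch for your card would be something like the following (the launch command is just a placeholder):

      ```
      # Assumption: "override_gfx" refers to HSA_OVERRIDE_GFX_VERSION.
      # The RX 6750 XT identifies as gfx1031; ROCm's prebuilt kernels target
      # gfx1030, so reporting the card as 10.3.0 is the usual RDNA2 workaround.
      export HSA_OVERRIDE_GFX_VERSION=10.3.0

      # Then launch the ROCm-backed app from the same shell (placeholder filename)
      ./LM-Studio.AppImage
      ```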

      I’d say an easy route (if it works, lol) would be using dnf to install ROCm and then using LM Studio’s installer to get the rest.
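      Very rough sketch of that route; the Fedora package names are my guess, and since Bazzite is image-based you’d probably layer them with rpm-ostree (or use a distrobox) rather than plain dnf:

      ```
      # Regular Fedora: install the basic ROCm tools (package names assumed)
      sudo dnf install rocminfo rocm-smi

      # On Bazzite, layering with rpm-ostree is more likely (reboot afterwards):
      sudo rpm-ostree install rocminfo rocm-smi

      # Verify ROCm actually sees the card (should list a gfx1031 agent)
      rocminfo | grep -i gfx
      rocm-smi
      ```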