Background: This Nomic blog article from September 2023 promises better performance in GPT4All for AMD graphics card owners.
Run LLMs on Any GPU: GPT4All Universal GPU Support
Likewise on GPT4All’s GitHub page.
September 18th, 2023: Nomic Vulkan launches supporting local LLM inference on NVIDIA and AMD GPUs.
Problem: In GPT4All, under Settings > Application Settings > Device, I’ve selected my AMD graphics card, but I’m seeing no improvement over CPU performance. In both cases (AMD graphics card or CPU), it crawls along at about 4-5 tokens per second. The interaction in the screenshot below took 174 seconds to generate the response.
Question: Do I have to use a specific model to benefit from this advancement? Do I need to install a different AMD driver? What steps can I take to troubleshoot this?
Sorry if this is an obvious question. Sometimes I feel like the answer is right in front of me, but I'm not sure which keywords in the documentation should jump out at me.
My system info:
- GPU: Radeon RX 6750 XT
- CPU: Ryzen 7 5800X3D processor
- RAM: 32 GB @ 3200 MHz
- OS: Linux Bazzite
- I’ve installed GPT4All as a flatpak
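
In case it helps with troubleshooting, here's roughly how I could run the same comparison outside the GUI. This is a minimal sketch using the gpt4all Python bindings (`pip install gpt4all`); the model filename is just an example (substitute one you've already downloaded), and I'm assuming the bindings' `device` argument accepts `"amd"` and `"cpu"` as documented:

```python
import time

from gpt4all import GPT4All

PROMPT = "Write a short paragraph about llamas."

# "amd" asks the Nomic Vulkan backend for an AMD GPU; "cpu" forces CPU-only.
# The model filename below is an example; use one you've already downloaded.
for device in ("amd", "cpu"):
    model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf", device=device)
    start = time.perf_counter()
    text = model.generate(PROMPT, max_tokens=200)
    elapsed = time.perf_counter() - start
    # Word count is only a rough stand-in for token count.
    print(f"{device}: ~{len(text.split()) / elapsed:.1f} words/sec "
          f"({elapsed:.1f} s total)")
```

If the `"amd"` run errors out, or reports the same rate as `"cpu"`, that would suggest the backend isn't actually using the card.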
Okay, I rechecked, and it looks like ROCm 6.4 and 6.3 have similar compatibility/incompatibility with certain cards.
Here are the gfx versions of different AMD cards:
https://rocm.docs.amd.com/en/develop/reference/gpu-arch-specs.html
Here are the supported GPUs for ROCm 6.4:
https://rocm.docs.amd.com/en/docs-6.4.0/compatibility/compatibility-matrix.html
So given this extra bit of research, it looks like you may be able to run ROCm on a 6950 XT (gfx1030), but I'm not sure about a 6750 XT (gfx1031).
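
If you do have ROCm installed, you can check what gfx target the card actually reports rather than going by the table alone. A quick sketch (`rocminfo` ships with the ROCm runtime):

```python
import subprocess

# Each GPU agent in rocminfo's output has a "Name: gfx..." line giving its
# compiled target, e.g. gfx1031 for an RX 6750 XT.
out = subprocess.run(["rocminfo"], capture_output=True, text=True).stdout
for line in out.splitlines():
    if "gfx" in line:
        print(line.strip())
```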
In my experience, ROCm supports more cards than they say it does. They only list the cards they've tested, but others may still work; I was running ROCm on my 7900 XTX before it was officially supported.
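
If you want to try it anyway, the usual workaround for cards that aren't on the official list is the `HSA_OVERRIDE_GFX_VERSION` environment variable, which makes the ROCm runtime treat the card as a nearby supported target. A minimal sketch, assuming a ROCm build of PyTorch is installed just to test visibility:

```python
import os

# The RX 6750 XT reports gfx1031, which ROCm doesn't officially list.
# Overriding to 10.3.0 makes the runtime treat it as gfx1030 (the 6800/6900
# series target), which is supported. Set this before ROCm initializes.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0"

import torch  # assumes a ROCm build of PyTorch

print(torch.cuda.is_available())  # True if ROCm can use the card
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```

Note this only affects the ROCm stack; GPT4All's Nomic Vulkan backend goes through Vulkan rather than ROCm, so this mainly matters if you're testing ROCm-based tooling.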
I probably would not have noticed that. I’ll have to look into this some more. Thanks for all your help.