RX580 16G Used as AI Accelerator

This article discusses using an AMD RX580 16G GPU for AI acceleration, touches on SYCL/oneAPI, and compares the card to a Mac Studio. It provides step-by-step instructions for installing drivers, building Torch, and configuring the system for GPU use. It also covers ROCm 6 and the VRAM limitations on APUs.

Intel uses SYCL and oneAPI for acceleration. These also target NVIDIA GPUs and AMD GPUs.

Unlike AMD, Intel does not separate iGPU VRAM from system RAM, which means the iGPU can make full use of system memory.

The RX580 16G is still cheaper than a Mac Studio, though its overall memory is smaller.

Disable the power features below if a multi-GPU program does not work as expected. More info here.

sudo vim /etc/default/grub
# append "amdgpu.ppfeaturemask=0xffff3fff amdgpu.runpm=0x0" to GRUB_CMDLINE_LINUX_DEFAULT
sudo update-grub
sudo reboot
# after the reboot, check that the modification took effect
cat /proc/cmdline
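
The manual check above can be scripted. The following is a minimal sketch, not part of the original procedure: the helper name `has_kernel_param` is hypothetical, and it only looks for an exact, space-separated match of each parameter in the kernel command line.

```shell
# Hypothetical helper: return 0 if the kernel command line contains
# the exact parameter given as $1.
# $2 is optional command line text; it defaults to the running kernel's.
has_kernel_param() {
    cmdline=${2:-$(cat /proc/cmdline 2>/dev/null)}
    case " $cmdline " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the two parameters added to GRUB_CMDLINE_LINUX_DEFAULT.
for p in amdgpu.ppfeaturemask=0xffff3fff amdgpu.runpm=0x0; do
    if has_kernel_param "$p"; then
        echo "ok: $p"
    else
        echo "missing: $p"
    fi
done
```

If either parameter is reported missing after a reboot, the GRUB edit probably did not land in the variable that update-grub reads.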

The codename of the RX580 is gfx803.

You may have to build PyTorch yourself for this codename.

Set the environment variables and install the drivers:

# sudo does not apply to a ">>" redirect, so append with tee instead
echo ROC_ENABLE_PRE_VEGA=1 | sudo tee -a /etc/environment
echo HSA_OVERRIDE_GFX_VERSION=8.0.3 | sudo tee -a /etc/environment
# reboot
wget https://repo.radeon.com/amdgpu-install/22.40.3/ubuntu/focal/amdgpu-install_5.4.50403-1_all.deb
sudo apt install ./amdgpu-install_5.4.50403-1_all.deb
sudo amdgpu-install -y --usecase=rocm,hiplibsdk,mlsdk
sudo usermod -aG video $LOGNAME
sudo usermod -aG render $LOGNAME
# verify
rocminfo
clinfo
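
Before running rocminfo, it can help to confirm that the environment entries and group memberships are in place. This is a rough sketch under stated assumptions: the helper names `env_file_has` and `in_group` are hypothetical, and the group check reflects the current login session only, so it reports "not yet" until you log out and back in after usermod.

```shell
# Hypothetical helper: return 0 if file $2 contains the exact line $1.
env_file_has() {
    grep -qxF -- "$1" "$2" 2>/dev/null
}

# Hypothetical helper: return 0 if the current session's groups include $1.
in_group() {
    case " $(id -nG) " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# The two variables written to /etc/environment in the steps above.
for line in ROC_ENABLE_PRE_VEGA=1 HSA_OVERRIDE_GFX_VERSION=8.0.3; do
    if env_file_has "$line" /etc/environment; then
        echo "ok: $line"
    else
        echo "missing in /etc/environment: $line"
    fi
done

# The two groups added with usermod in the steps above.
for group in video render; do
    if in_group "$group"; then
        echo "ok: member of $group"
    else
        echo "not yet in $group (log out and back in after usermod)"
    fi
done
```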

Build torch:

git clone https://github.com/pytorch/pytorch.git -b v1.13.1
cd pytorch
export PATH=/opt/rocm/bin:$PATH ROCM_PATH=/opt/rocm HIP_PATH=/opt/rocm/hip
export PYTORCH_ROCM_ARCH=gfx803
export PYTORCH_BUILD_VERSION=1.13.1 PYTORCH_BUILD_NUMBER=1
python3 tools/amd_build/build_amd.py
USE_ROCM=1 USE_NINJA=1 python3 setup.py bdist_wheel
# the wheel name depends on your Python version (cp310 means Python 3.10)
pip3 install dist/torch-1.13.1-*.whl

If you want to use this beefy GPU for computation, either prepare a suitably ventilated desktop frame or connect it as an external GPU over OCuLink, which can be found on recent mini PCs and laptops.

An integrated GPU with the codename gfx90c can also be used for AI.

To run it without a container, build PyTorch with the codename gfx900.

Either way, you need to specify export HSA_OVERRIDE_GFX_VERSION=9.0.0.
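
The two overrides mentioned in this article can be collected in one place. The sketch below is illustrative only: the function name `gfx_override` is hypothetical, and the table covers just the two codenames discussed here (gfx803 and gfx90c), with values taken from the text.

```shell
# Hypothetical helper: map a GPU codename to the HSA_OVERRIDE_GFX_VERSION
# value used in this article. Unknown codenames return non-zero.
gfx_override() {
    case "$1" in
        gfx803) echo 8.0.3 ;;   # RX580
        gfx90c) echo 9.0.0 ;;   # integrated GPU, presented as gfx900
        *) return 1 ;;
    esac
}

codename=gfx90c   # replace with the codename that rocminfo reports
if version=$(gfx_override "$codename"); then
    export HSA_OVERRIDE_GFX_VERSION="$version"
    echo "HSA_OVERRIDE_GFX_VERSION=$HSA_OVERRIDE_GFX_VERSION"
else
    echo "no override known for $codename"
fi
```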

Run a container:

sudo docker run --rm -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G rocm/pytorch:latest

If you want to run ollama on AMD GPUs, you must install ROCm 6.

Additionally, if the card is gfx90c, you need to run export HSA_ENABLE_SDMA=0.

You can get current ROCm version by dpkg -l | grep -i rocm.
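
To turn that version lookup into a yes/no answer for the ROCm 6 requirement, something like the following may work. It is a sketch under assumptions: the helper names are hypothetical, and it assumes the installed version can be read from a package named `rocm-core`. If your installation uses a different package name, substitute one that appears in the dpkg -l output.

```shell
# Hypothetical helper: print the major component of a version string.
rocm_major() {
    printf '%s\n' "${1%%.*}"
}

# Hypothetical helper: return 0 if version string $1 has major version >= 6.
is_rocm6_or_newer() {
    major=$(rocm_major "$1")
    case "$major" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: treat as not found
    esac
    [ "$major" -ge 6 ]
}

installed=""
if command -v dpkg-query >/dev/null 2>&1; then
    # Assumption: rocm-core carries the ROCm release version.
    installed=$(dpkg-query -W -f='${Version}' rocm-core 2>/dev/null || true)
fi

if is_rocm6_or_newer "$installed"; then
    echo "ROCm $installed is new enough"
else
    echo "ROCm 6 or newer not detected (found: ${installed:-none})"
fi
```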

You can disable the GPU with export HSA_OVERRIDE_GFX_VERSION=1.

Since the latest ollama accesses ROCm, run it as root.

To circumvent the BIOS VRAM limitation on an APU, you can follow the instructions here.

Related repos:

torch-apu-helper
force-host-alloction-APU (by hooking VRAM allocators)
