Intel uses SYCL and oneAPI for acceleration. These also target NVIDIA GPUs and AMD GPUs.
Unlike AMD GPUs, Intel does not separate iGPU VRAM from system RAM, which means the iGPU can make full use of system memory.
Still cheaper than Mac Studio, though overall memory is smaller.
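As a quick sanity check, the oneAPI toolchain ships a `sycl-ls` utility that lists the backends and devices the SYCL runtime can see (assuming the oneAPI Base Toolkit is installed at its default path):

```shell
# load the oneAPI environment, then list visible SYCL devices
source /opt/intel/oneapi/setvars.sh
sycl-ls
```

If the iGPU does not appear, the Level Zero driver is usually the missing piece.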
Disable the power feature if a multi-GPU program does not work as expected. More info here.
```shell
sudo vim /etc/default/grub
```
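For reference, kernel parameters of this kind go on the `GRUB_CMDLINE_LINUX_DEFAULT` line, after which the GRUB configuration must be regenerated. The `amdgpu.ppfeaturemask` value below is a placeholder for whatever parameter the linked article recommends, not a recommendation in itself:

```shell
# In /etc/default/grub, append the parameter to the existing line, e.g.:
# GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amdgpu.ppfeaturemask=0xffffffff"
# Then regenerate the GRUB config and reboot for it to take effect:
sudo update-grub
sudo reboot
```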
The codename is gfx803. You may have to build it yourself.
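If you are unsure which codename your card has, `rocminfo` reports it once the driver is installed (a sketch; the binary path can vary between ROCm versions):

```shell
# print the first gfx target name the ROCm runtime reports (e.g. gfx803)
/opt/rocm/bin/rocminfo | grep -o "gfx[0-9a-f]*" | head -n 1
```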
Set up environment variables and install the drivers:

```shell
# `sudo echo ... >> file` does not work, because the redirection runs as the
# regular user; pipe through `sudo tee -a` instead
echo ROC_ENABLE_PRE_VEGA=1 | sudo tee -a /etc/environment
```
Build torch:
```shell
git clone https://github.com/pytorch/pytorch.git -b v1.13.1
```
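A sketch of the remaining steps, following PyTorch's usual ROCm build flow (the `gfx803` arch flag matches the codename above; exact flags may differ for your ROCm version):

```shell
cd pytorch
git submodule update --init --recursive
# translate the CUDA sources to HIP before building
python3 tools/amd_build/build_amd.py
# build and install for the gfx803 target
PYTORCH_ROCM_ARCH=gfx803 USE_ROCM=1 python3 setup.py install
```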
If you want to use this beefy GPU for computation, either prepare a well-ventilated desktop case or use an external GPU connected over OCuLink, which can be found on recent mini PCs and laptops.
Your integrated GPU gfx90c can be used for AI. To run it without a container, build with the codename gfx900. Either way, you need to set `export HSA_OVERRIDE_GFX_VERSION=9.0.0`.
Run a container:
```shell
sudo docker run --rm -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
  --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host \
  --shm-size 8G rocm/pytorch:latest
```
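Inside the container you can sanity-check that PyTorch actually sees the GPU; ROCm builds of PyTorch expose devices through the `torch.cuda` API:

```shell
# prints the HIP version the wheel was built against and whether a GPU is visible
python3 -c "import torch; print(torch.version.hip); print(torch.cuda.is_available())"
```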
If you want to run ollama on AMD GPUs, you must install ROCm 6. Additionally, if the card is gfx90c, you need to run `export HSA_ENABLE_SDMA=0`.
You can get the current ROCm version with `dpkg -l | grep -i rocm`.
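If you only need the numeric version, you can cut it out of the `dpkg -l` output. A minimal sketch, using an illustrative sample line in the format `dpkg -l` prints:

```shell
# sample line standing in for real `dpkg -l | grep -i rocm` output
line="ii  rocm-core  6.0.2.60002-115~22.04  amd64  ROCm Runtime software stack"
# the third column is the package version; keep only major.minor.patch
echo "$line" | awk '{print $3}' | cut -d. -f1-3
```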
You can disable the GPU with `export HSA_OVERRIDE_GFX_VERSION=1`.
Since the latest ollama accesses ROCm, run it with the root account.
To circumvent the BIOS VRAM limitation for APUs, you can follow the instructions here.
Related repos:
force-host-alloction-APU (by hooking VRAM allocators)
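Assuming that repo builds an `LD_PRELOAD` shim (the library path below is a guess; check the repo's README for the actual build artifact), usage would look roughly like:

```shell
# hypothetical library name: the preloaded hook intercepts VRAM allocations
# and redirects them to host (GTT) memory, bypassing the BIOS VRAM cap
LD_PRELOAD=/path/to/libforcegttalloc.so ollama serve
```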