this post was submitted on 26 May 2024
186 points (98.9% liked)

[–] wewbull@feddit.uk 4 points 5 months ago (1 children)

"set the magic environment variable because the tool chain will mis-detect the architecture of your unsupported card"

I don't think it's misdetecting it. Rather it detects it correctly, tries to use specific support for that device, but then finds that the support was switched off at compile time. The environment variable forces it to pretend to be a different (very similar) device.

Clunky, yes.

[–] PAPPP@lemmy.sdf.org 1 points 5 months ago

That's credible.

I find the hardware architecture and licensing situation with AMD much more appealing than Nvidia's and really want to like their cards for compute, but they sure make it challenging to recommend.

I had to do a little dead reckoning with the list of supported targets to find one that did the right thing with the 12CU RDNA2 680M.

I've been meaning to put my findings on the internet since they might be useful to someone else, and this is as good a place as any.

The setup: a fresh Xubuntu 22.04.4 LTS install, following the official ROCm 6.1 setup instructions, on a Minisforum UM690S (Ryzen 9 6900HX/64GB/1TB) box as the target, with the GPU memory set to 8GB in the EFI before boot so it doesn't OOM.

For OpenMP projects, you'll probably need to install libstdc++-12-dev in addition to the documented packages, because HIP won't see the cmath headers otherwise (bug). The CMakeLists.txt mods for adapting a project with accelerator directives to that target are then:

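# Make the ROCm 6.1 install visible to find_package() before calling it
list(APPEND CMAKE_PREFIX_PATH /opt/rocm-6.1.0)
find_package(hip REQUIRED)
# Compile and link with hipcc
set(CMAKE_CXX_COMPILER ${HIP_HIPCC_EXECUTABLE})
set(CMAKE_CXX_LINKER   ${HIP_HIPCC_EXECUTABLE})
# Offload OpenMP target regions to the GPU, built for gfx1035
target_compile_options(yourtargetname PUBLIC "-lm;-fopenmp;-fopenmp-targets=amdgcn-amd-amdhsa;-Xopenmp-target=amdgcn-amd-amdhsa;-march=gfx1035")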

As for torch, because I was curious how that would go: after watching the suggested Docker-based method download 30GB of trash and then fall over, I did the bare metal install instead. It seems to work with PYTORCH_TEST_WITH_ROCM=1 HSA_OVERRIDE_GFX_VERSION=10.3.0 python3 testtorch.py, which is the most confidence inspiring.
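
For anyone wondering where a value like 10.3.0 comes from: the override string appears to be the gfx target name split into major, minor, and stepping, with the last two characters read as hex digits. A minimal Python sketch under that assumption follows. The helper name gfx_to_override is made up for illustration and is not part of ROCm.

```python
def gfx_to_override(target: str) -> str:
    """Turn a gfx target name into an HSA_OVERRIDE_GFX_VERSION-style string.

    Assumption: names follow "gfx" + decimal major + one hex digit minor
    + one hex digit stepping (gfx900, gfx90a, gfx1030, ...).
    """
    if not target.startswith("gfx") or len(target) < 6:
        raise ValueError(f"unexpected target name: {target!r}")
    digits = target[3:]
    major = int(digits[:-2])        # "10" in gfx1030
    minor = int(digits[-2], 16)     # "3" in gfx1030
    stepping = int(digits[-1], 16)  # "0" in gfx1030
    return f"{major}.{minor}.{stepping}"


if __name__ == "__main__":
    # gfx1030 is the target that the 10.3.0 override impersonates
    print(gfx_to_override("gfx1030"))
    # the 680M's own target, for comparison
    print(gfx_to_override("gfx1035"))
```

Read that way, the override tells the runtime to treat the gfx1035 part as gfx1030, which differs only in the stepping digit.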

Also amdgpu_top is your friend for figuring out if you actually have something on the GPU compute pipes or if it's just lying and running on the CPU.
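
For a second opinion from the framework side, a small script can report what torch believes it is running on. This is a minimal sketch assuming a ROCm build of PyTorch. The function name report_device is illustrative, and this is not the testtorch.py mentioned above.

```python
import os

# Assumption: same override value as the command line above. It needs to be in
# the environment before the ROCm runtime starts, so set it before importing torch.
os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "10.3.0")


def report_device() -> str:
    """Run a small matmul and describe where torch says it ran."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"

    if not torch.cuda.is_available():
        return "no GPU visible to torch, running on CPU"

    # ROCm builds of PyTorch expose the GPU through the torch.cuda API
    device = torch.device("cuda")
    a = torch.rand(1024, 1024, device=device)
    b = torch.rand(1024, 1024, device=device)
    c = a @ b
    torch.cuda.synchronize()  # wait for the kernel to finish
    hip = getattr(torch.version, "hip", None)
    return f"ran on {c.device} ({torch.cuda.get_device_name(0)}), HIP {hip}"


if __name__ == "__main__":
    print(report_device())
```

This only reports what torch claims, so watching amdgpu_top while it runs is still the independent check.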