Source: https://developer.codeplay.com/products/oneapi/amd/guides/get-started-guide-amd

Install oneAPI for AMD GPUs

This guide contains information on using DPC++ to run SYCL™ applications on AMD GPUs via the DPC++ HIP plugin version 2025.2.0.

For general information about DPC++, refer to the DPC++ Resources section.

Supported Platforms

This release should work across a wide array of AMD GPUs, but Codeplay cannot guarantee correct operation on untested platforms.

This release has been tested on the following platform (using the upstream AMD GPU driver in the Linux kernel):

| Operating System    | Tested GPU                  | ROCm versions             |
|---------------------|-----------------------------|---------------------------|
| Linux (glibc 2.31+) | AMD Instinct MI210 (gfx90a) | 5.4.3, 6.0, 6.1, 6.2, 6.3 |

Prerequisites
  1. Install C++ development tools.

    You will need the following C++ development tools to build and run oneAPI applications: cmake, gcc, g++, make, and pkg-config.

    The following console commands will install the above tools on the most popular Linux distributions:

    Ubuntu

    sudo apt update
    sudo apt -y install cmake pkg-config build-essential
    
    

    Red Hat and Fedora

    sudo yum update
    sudo yum -y install cmake pkgconfig
    sudo yum groupinstall "Development Tools"
    
    

    SUSE

    sudo zypper update
    sudo zypper --non-interactive install cmake pkg-config
    sudo zypper --non-interactive install -t pattern devel_C_C++
    
    

    Verify that the tools are installed by running:

    which cmake pkg-config make gcc g++
    
    

    You should see output similar to:

    /usr/bin/cmake
    /usr/bin/pkg-config
    /usr/bin/make
    /usr/bin/gcc
    /usr/bin/g++
    
    
  2. Install an Intel® oneAPI Toolkit version 2025.2.0 that contains the DPC++/C++ Compiler.

  3. Install the GPU driver and ROCm™ software stack for the AMD GPU.

Installation
  1. Download the latest oneAPI for AMD GPUs installer from our website.

  2. Run the downloaded self-extracting installer:

    bash oneapi-for-amd-gpus-2025.2.0-linux.sh

Set Up Your Environment
  1. To set up your oneAPI environment in your current session, source the Intel-provided setvars.sh script.

    For system-wide installations:

    . /opt/intel/oneapi/setvars.sh --include-intel-llvm
    
    

    For private installations (in the default location):

    . ~/intel/oneapi/setvars.sh --include-intel-llvm
    
    
  2. Ensure that the HIP libraries and tools can be found in your environment:

    1. Run rocminfo. If it runs without any obvious errors in its output, your environment should be set up correctly.

    2. Otherwise, set your environment variables manually:

      export PATH=/PATH_TO_ROCM_ROOT/bin:$PATH
      export LD_LIBRARY_PATH=/PATH_TO_ROCM_ROOT/lib:$LD_LIBRARY_PATH
      
      

      ROCm is commonly installed in /opt/rocm-x.x.x/.

Verify Your Installation

To verify the DPC++ HIP plugin installation, use the DPC++ sycl-ls tool to check that SYCL now exposes the available AMD GPUs. If AMD GPUs are found, the sycl-ls output should contain something similar to:

[hip:gpu][hip:0] AMD HIP BACKEND, AMD Radeon PRO W6800 gfx1030 [HIP 60140.9]
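If devices from several backends are listed, sycl-ls can be restricted to just the HIP backend using the ONEAPI_DEVICE_SELECTOR environment variable, which is covered in more detail later in this guide:

```shell
# Show only devices exposed by the HIP backend
ONEAPI_DEVICE_SELECTOR="hip:*" sycl-ls
```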

Run a Sample Application
  1. Create a file simple-sycl-app.cpp with the following C++/SYCL code:

    #include <sycl/sycl.hpp>
    #include <iostream>
    
    int main() {
      // Creating buffer of 4 ints to be used inside the kernel code
      sycl::buffer<int, 1> Buffer{4};
    
      // Creating SYCL queue
      sycl::queue Queue{};
    
      // Size of index space for kernel
      sycl::range<1> NumOfWorkItems{Buffer.size()};
    
      // Submitting command group(work) to queue
      Queue.submit([&](sycl::handler &cgh) {
        // Getting write only access to the buffer on a device
        auto Accessor = Buffer.get_access<sycl::access::mode::write>(cgh);
        // Executing kernel
        cgh.parallel_for<class FillBuffer>(
            NumOfWorkItems, [=](sycl::id<1> WIid) {
              // Fill buffer with indexes
              Accessor[WIid] = static_cast<int>(WIid.get(0));
            });
      });
    
      // Getting read only access to the buffer on the host.
      // Implicit barrier waiting for queue to complete the work.
      auto HostAccessor = Buffer.get_host_access();
    
      // Check the results
      bool MismatchFound{false};
      for (size_t I{0}; I < Buffer.size(); ++I) {
        if (HostAccessor[I] != I) {
          std::cout << "The result is incorrect for element: " << I
                    << " , expected: " << I << " , got: " << HostAccessor[I]
                    << std::endl;
          MismatchFound = true;
        }
      }
    
      if (!MismatchFound) {
        std::cout << "The results are correct!" << std::endl;
      }
    
      return MismatchFound;
    }
    
    
  2. Compile the application with:

    icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
            -Xsycl-target-backend --offload-arch=<ARCH> \
            -o simple-sycl-app simple-sycl-app.cpp
    
    

    where <ARCH> is the GPU architecture, e.g. gfx1030, which you can check by running:

    rocminfo | grep 'Name: *gfx.*'
    
    

    You should see the GPU architecture in the output, for example:

      Name:                    gfx1030
    
    
  3. Run the application with:

    ONEAPI_DEVICE_SELECTOR="hip:*" SYCL_UR_TRACE=1 ./simple-sycl-app
    
    

    You should see output like:

    <LOADER>[INFO]: loaded adapter 0x0x43c050 (libur_adapter_hip.so.0)
    SYCL_UR_TRACE: Requested device_type: info::device_type::automatic
    SYCL_UR_TRACE: Selected device: -> final score = 1500
    SYCL_UR_TRACE:   platform: AMD HIP BACKEND
    SYCL_UR_TRACE:   device: AMD Instinct MI210
    The results are correct!
    
    

    If so, you have successfully set up and verified your oneAPI for AMD GPUs development environment, and you can begin developing oneAPI applications.

    The rest of this document provides general information on compiling and running oneAPI applications on AMD GPUs.

Use DPC++ to Target AMD GPUs

Compile for AMD GPUs

To compile a SYCL application for AMD GPUs, use the icpx compiler provided with DPC++. For example:

icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
        -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
        -o sycl-app sycl-app.cpp

The following flags are required:

  * -fsycl: enables SYCL compilation mode.
  * -fsycl-targets=amdgcn-amd-amdhsa: compiles device code for AMD GPUs via the HIP backend.
  * -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=<arch>: selects the specific AMD GPU architecture to compile for.

Note that when targeting an AMD GPU, the specific architecture of the GPU must be provided. It is not currently possible to build binaries that contain multiple architectures, though this might change in a future release.
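Because each binary targets a single AMD architecture, one binary must be built per architecture you want to support. For example, a sketch building a hypothetical sycl-app.cpp for the two architectures mentioned earlier in this guide:

```shell
# Build one binary per AMD GPU architecture (multi-arch binaries are not supported)
icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
        -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx90a \
        -o sycl-app-gfx90a sycl-app.cpp
icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
        -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
        -o sycl-app-gfx1030 sycl-app.cpp
```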

For more information on available SYCL compilation flags, see the DPC++ Compiler User’s Manual or for information on all DPC++ compiler options see the Compiler Options section of the Intel oneAPI DPC++/C++ Compiler Developer Guide and Reference.

Using the icpx compiler

By default, the icpx compiler optimizes much more aggressively than the regular clang++ driver: it enables both -O2 and -ffast-math. In many cases this leads to better performance, but it can also cause problems for certain applications. In such cases, you can disable -ffast-math with -fno-fast-math and change the optimization level by passing a different -O flag. Alternatively, use the clang++ driver directly, found in $releasedir/compiler/latest/linux/bin-llvm/clang++, to get regular clang++ behavior.
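For example, to keep using icpx while opting out of its aggressive defaults, a sketch reusing the AMD compile command from above:

```shell
# Disable fast-math and lower the optimization level for a sensitive application
icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
        -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
        -fno-fast-math -O1 -o sycl-app sycl-app.cpp
```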

Compile for Multiple Targets

In addition to targeting AMD GPUs, you can build SYCL applications that are compiled once and then run on a range of hardware. The following example shows how to output a single binary including device code that can run on AMD GPUs, NVIDIA GPUs, or any device that supports SPIR, e.g. Intel GPUs.

icpx -fsycl -fsycl-targets=spir64,amdgcn-amd-amdhsa,nvptx64-nvidia-cuda \
        -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
        -Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_80 \
        -o sycl-app sycl-app.cpp

The above command can be broken down into the following ingredients:

  * -fsycl-targets=spir64,amdgcn-amd-amdhsa,nvptx64-nvidia-cuda: compiles device code for the SPIR, AMD, and NVIDIA targets.
  * -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030: selects the AMD GPU architecture.
  * -Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_80: selects the NVIDIA GPU architecture.

A binary compiled in this way can run the SYCL kernels on:

  * AMD GPUs matching the specified architecture (here gfx1030),
  * NVIDIA GPUs compatible with the specified compute capability (here sm_80),
  * any device that supports SPIR, e.g. Intel GPUs.

AOT compilation for Intel hardware is also possible in combination with AMD and NVIDIA targets, and can be achieved by using the spir64_gen target and the corresponding architecture flags. For example, to compile the above application AOT for the Ponte Vecchio Intel graphics architecture, the following command can be used:

icpx -fsycl -fsycl-targets=spir64_gen,amdgcn-amd-amdhsa,nvptx64-nvidia-cuda \
        -Xsycl-target-backend=spir64_gen '-device pvc' \
        -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
        -Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_80 \
        -o sycl-app sycl-app.cpp

Note the different syntax, -device <arch> rather than --offload-arch=<arch>, which is required because a different compiler toolchain is used for the Intel targets.

The compiler driver also offers alias targets for each target+architecture pair, making the command line shorter and easier to read. With the aliases, the -Xsycl-target-backend flags no longer need to be specified. The above command is equivalent to:

icpx -fsycl -fsycl-targets=intel_gpu_pvc,amd_gpu_gfx1030,nvidia_gpu_sm_80 \
        -o sycl-app sycl-app.cpp

The full list of aliases is documented in the DPC++ Compiler User’s Manual.

Run SYCL Applications on AMD GPUs

After compiling your SYCL application for an AMD target, you should also ensure that the correct SYCL device representing the AMD GPU is selected at runtime.

In general, simply using the default device selector should select one of the available AMD GPUs. However, in some scenarios, users may want to change their SYCL application to use a more precise SYCL device selector, such as the GPU selector, or a custom selector.
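For instance, a custom device selector can prefer AMD GPUs explicitly. The following sketch uses the standard SYCL 2020 callable-selector API; the scoring scheme is illustrative, not prescribed by this guide:

```cpp
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
  // Custom selector: prefer devices exposed by the HIP backend,
  // fall back to any other GPU, and reject everything else.
  auto AMDGPUSelector = [](const sycl::device &Dev) {
    if (Dev.get_backend() == sycl::backend::ext_oneapi_hip)
      return 2;  // highest preference: AMD GPU via HIP
    if (Dev.is_gpu())
      return 1;  // any other GPU
    return -1;   // negative score: never selected
  };

  sycl::queue Queue{AMDGPUSelector};
  std::cout << "Selected device: "
            << Queue.get_device().get_info<sycl::info::device::name>()
            << std::endl;
}
```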

Controlling AMD devices exposed to DPC++

In certain circumstances it is desirable to restrict which GPUs are available to a SYCL implementation such as DPC++. This can be done with the environment variables described below, which also allow users to control the sharing of GPU resources within a shared GPU cluster.

Device selector environment variables

The environment variable ONEAPI_DEVICE_SELECTOR may be used to restrict the set of devices that can be used. For example, to only allow devices exposed by the DPC++ HIP plugin, set the following:

export ONEAPI_DEVICE_SELECTOR="hip:*"

To only allow a subset of devices from the hip backend, use a comma-separated list, e.g.:

export ONEAPI_DEVICE_SELECTOR="hip:1,3"

Then the following will populate Devs with the two AMD devices only:

    std::vector<sycl::device> Devs;
    for (const auto &plt : sycl::platform::get_platforms()) {
      if (plt.get_backend() == sycl::backend::ext_oneapi_hip) {
        Devs = plt.get_devices();
        break;
      }
    }

Devs[0] and Devs[1] will then correspond to the devices numbered 1 and 3, respectively, in the output of rocm-smi.

For more details covering ONEAPI_DEVICE_SELECTOR, see the Environment Variables section of the oneAPI DPC++ Compiler documentation.

If only AMD devices are exposed to DPC++, the above usage of ONEAPI_DEVICE_SELECTOR is equivalent to setting the HIP_VISIBLE_DEVICES environment variable:

export HIP_VISIBLE_DEVICES=1,3

When only AMD GPUs are available, the same list can be populated more simply:

    std::vector<sycl::device> Devs =
      sycl::device::get_devices(sycl::info::device_type::gpu);

Then Devs[0] and Devs[1] will correspond to the devices marked 1 and 3 by rocm-smi.

DPC++ Resources

SYCL Resources
