Entropy Space

The personal blog of ア熵增焓减ウ (Entropy).

Compatibility notes and troubleshooting guides for mainstream molecular dynamics simulation Apps on AMD’s consumer GPUs - Switch to AMD [Part Ⅱ]

2023-08-29

Aug-2023 by ア熵增焓减ウ | yult-entropy@qq.com | entropylt@163.com

It's everyone's duty to squash the green behemoth.

0  Introduction

Please be aware that this blog post is not comprehensive; it is a by-product of my recent MD performance benchmarks on AMD GPUs. With older versions of ROCm and specific GPUs, you may be able to compile and run the applications covered here without any trouble. It is also possible that some of the compatibility issues mentioned in this post will be resolved in future software updates.

It's important to note that the content of this blog post is not included in the official documentation, manual, readme, or wiki of the corresponding applications. Moreover, finding ready-made solutions on the Internet for the issues mentioned in this post may be extremely challenging. However, this does not mean that the official documentation is not valuable. On the contrary, carefully reading and understanding the official documentation, sentence by sentence, will greatly enhance your understanding of this blog post.

  • OS details: Ubuntu 22.04.3 LTS, Linux 6.2.0-26-generic x86_64, GNU 11.4.0
  • GPU architectures (codenames) covered: GCN 5.1 (gfx906), RDNA 2 (gfx1030), and RDNA 3 (gfx1100)
  • ROCm versions (refer to radeon.com) covered: 5.4.6, 5.5.3, and 5.6.0.

Please note that ROCm 5.4.6 is the final version bundled with LLVM 15, whereas subsequent versions of ROCm bundle LLVM 16, which is currently undergoing rapid updates. These updates may introduce compiler compatibility issues. However, if you can compile and run the applications with the latest version of ROCm, you may achieve better performance.

1  GROMACS 2023.2 – OpenSYCL (formerly known as hipSYCL)

With ROCm 5.4.6 and OpenSYCL-0.9.4, it is possible to compile and run GROMACS directly on gfx906 or gfx1030 GPUs. For detailed instructions, please refer to the GROMACS 2023.2 Manual, specifically pages 15-17.

When using ROCm 5.5.3 or 5.6.0 with OpenSYCL-develop 25Jul2023, you can also compile and run GROMACS. To successfully compile OpenSYCL-develop, simply add '-DWITH_SSCP_COMPILER=OFF' to the CMake command. It's worth noting that the OpenSYCL develop branch is currently undergoing changes from 'hipSYCL' to 'OpenSYCL' in the source code. As a result, when compiling GROMACS based on it, there may be some warnings in the CMake configuration logs. However, these warnings should not cause any problems.
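As a rough illustration of where that option goes, the following sketch assembles a CMake configure command for OpenSYCL-develop. Only '-DWITH_SSCP_COMPILER=OFF' comes from this post; the source directory, install prefix, and ROCm path are hypothetical placeholders, and the remaining options are the ones I would expect from the OpenSYCL installation notes, so check them against the documentation of the commit you build.

```python
import shlex


def opensycl_cmake_command(source_dir, install_prefix, rocm_path="/opt/rocm"):
    """Return a CMake configure command for OpenSYCL-develop as an argument list.

    All paths are placeholders (assumptions); adjust them to your machine.
    """
    return [
        "cmake",
        source_dir,
        f"-DCMAKE_INSTALL_PREFIX={install_prefix}",
        f"-DROCM_PATH={rocm_path}",
        "-DWITH_ROCM_BACKEND=ON",
        # The option named in this post: build without the SSCP compiler.
        "-DWITH_SSCP_COMPILER=OFF",
    ]


# Hypothetical locations, for illustration only.
cmd = opensycl_cmake_command("../OpenSYCL", "/opt/opensycl")
print(shlex.join(cmd))
```

The command is only printed, not executed, so it can be inspected and pasted into a build directory by hand.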

In terms of performance, GROMACS compiled with ROCm 5.6.0 and OpenSYCL-develop 25Jul2023 is significantly faster than builds based on the earlier versions, and no bugs were observed in testing on gfx906 and gfx1030 GPUs. On the gfx1100 (RDNA 3) GPU, however, stability is a concern with all three versions of ROCm: performance fluctuates, and mdrun has a high probability of getting stuck after running for a period of time (similar feedback was reported on the GROMACS forum in June 2023). Furthermore, rocm-smi cannot read GPU status information in this case.

2  Amber 22 – AmberTools 22 – Amber 22 HIP Patch 3Jan2023

With ROCm 5.4.6 or 5.5.3, it is possible to compile and run Amber 22 on gfx906, gfx1030, and gfx1100 GPUs, and no bugs were encountered during testing. However, the source code needs the following modifications:

1) Remove the 3rd TIP-TODO in src/pmemd/src/cuda/ptxmacros.h (lines 130-170).

2) When compiling for GPU architectures that do not exist on the local machine, it is necessary to add the 'AMDGPU_TARGETS' and 'GPU_TARGETS' variables to the CMake command in the 'compile_with_hip.sh' file. This will enable optimizations targeting those specific GPU architectures. Note that multiple targets can be set simultaneously.

3) For RDNA GPUs, add '-D HIP_WARP64=OFF' to the CMake command and check line 85 of src/pmemd/src/cuda/ptxmacros.h. Extend the condition there as needed, for example with '|| defined(__gfx1100__)' for the 7900XTX GPU.
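Steps 2) and 3) can be sketched as a small helper that produces the extra CMake flags for a given list of targets. This is a minimal sketch, not part of the Amber HIP patch: the function names are hypothetical, RDNA targets are recognized only by their gfx10/gfx11 prefix, and the helper refuses to mix GCN and RDNA targets in one build on the assumption that HIP_WARP64 is a single build-wide switch.

```python
import shlex


def is_rdna(target):
    """Assumption: the RDNA targets covered here are named gfx10xx or gfx11xx."""
    return target.startswith(("gfx10", "gfx11"))


def amber_hip_cmake_flags(targets):
    """Return the extra CMake flags for the given GPU targets."""
    if not targets:
        raise ValueError("at least one GPU target is required")
    rdna = [is_rdna(t) for t in targets]
    if any(rdna) and not all(rdna):
        # Conservative assumption of this sketch: one build, one wavefront size.
        raise ValueError("build GCN and RDNA targets separately")
    joined = ";".join(targets)  # CMake lists are separated by semicolons
    flags = [f"-DAMDGPU_TARGETS={joined}", f"-DGPU_TARGETS={joined}"]
    if all(rdna):
        # Same meaning as '-D HIP_WARP64=OFF'; CMake accepts both spellings.
        flags.append("-DHIP_WARP64=OFF")
    return flags


def extra_arch_guard(target):
    """Preprocessor clause to append to the condition in ptxmacros.h."""
    return f"|| defined(__{target}__)"


print(shlex.join(amber_hip_cmake_flags(["gfx1030", "gfx1100"])))
print(extra_arch_guard("gfx1100"))
```

The printed flags are meant to be added to the CMake command in 'compile_with_hip.sh' by hand; shlex.join quotes the semicolons so the shell does not split the target list.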

In terms of performance, Amber 22 compiled with ROCm 5.5.3 is significantly faster than with ROCm 5.4.6. However, I have not found a way to compile it successfully with ROCm 5.6.0.

3  OpenMM 8.0.0 – OpenMM HIP Plugin 8Mar2023

With ROCm 5.4.6, 5.5.3, or 5.6.0, it is possible to compile and run OpenMM directly on gfx906, gfx1030, and gfx1100 GPUs. During testing, occasional instances of GPU scheduling inactivity were observed on the gfx1100 GPU, while no bugs were encountered in the rest of the tests.

When compiling for GPU architectures that are not present on the local machine, it is necessary to include the 'AMDGPU_TARGETS' and 'GPU_TARGETS' variables in the CMake command. This will enable optimizations targeting those specific GPU architectures. Note that multiple targets can be set simultaneously.

In terms of performance, OpenMM gets faster with each successive ROCm version of the three (with the default VkFFT backend). The difference is more apparent for smaller systems.

4  LAMMPS 2Aug2023 – Kokkos

With ROCm 5.4.6, 5.5.3, or 5.6.0, it is possible to compile and run LAMMPS on gfx906, gfx1030, and gfx1100 GPUs, and no bugs were encountered during testing.

For RDNA GPUs, certain modifications are required. Specifically, you need to replace the bundled copy of Kokkos (lib/kokkos) with the latest version of Kokkos (4.1.0). Additionally, you need to change "14" to "17" in lines 146 and 147 of cmake/CMakeLists.txt.
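The "14" to "17" change is presumably needed because Kokkos 4 requires C++17. Below is a minimal sketch of making that edit programmatically; the helper name and the sample text are hypothetical and do not reproduce the real contents of cmake/CMakeLists.txt. The helper touches only the listed lines and stops if one of them does not contain "14", which guards against line numbers that have shifted in another LAMMPS release.

```python
def bump_cxx_standard(text, line_numbers=(146, 147), old="14", new="17"):
    """Replace the first `old` with `new` on the given 1-based lines only."""
    lines = text.splitlines(keepends=True)
    for n in line_numbers:
        idx = n - 1
        if idx >= len(lines) or old not in lines[idx]:
            raise ValueError(f"line {n} does not contain {old!r}; edit by hand")
        lines[idx] = lines[idx].replace(old, new, 1)
    return "".join(lines)


# Made-up three-line sample, for illustration only.
sample = "first line\nstandard 14\nminimum 14\n"
print(bump_cxx_standard(sample, line_numbers=(2, 3)), end="")

# For the real file, something along these lines (path taken from the post):
#   from pathlib import Path
#   path = Path("cmake/CMakeLists.txt")
#   path.write_text(bump_cxx_standard(path.read_text()))
```

Keeping a backup of the original file, or reviewing the result with a diff, is advisable before building.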

During the CMake configuration step, you need to specify the necessary packages in cmake/presets/basic.cmake. Furthermore, you should specify the GPU architecture code in cmake/presets/kokkos-hip.cmake, keeping in mind that only one GPU architecture can be specified. For a list of code mappings, refer to lib/kokkos/cmake/kokkos_arch.cmake.

In terms of performance, LAMMPS gets faster with each successive ROCm version of the three. The difference is more apparent for smaller systems.

5  Conclusion

The compatibility lists of the four Apps are as follows:

In terms of absolute performance, cost-effectiveness, and compatibility, both OpenMM and LAMMPS are currently suitable for ordinary users to fully "switch to AMD", and GROMACS and Amber users can start trying to do so as well. ROCm 5.4.6 or 5.5.3, coupled with a GCN 5.1 or RDNA 2 GPU, offers seamless compatibility with all four applications mentioned in this blog post, while ROCm 5.6.0 provides the best performance. In some cases, minor modifications to the source code of the applications may be necessary.

It is worth mentioning that as of the time of this blog post, the latest version of ROCm (5.6.0) does not officially support RDNA 3 GPUs. According to AMD's official notification, RDNA 3 GPUs will receive official support sometime this fall. Therefore, any specific issues related to RDNA 3 in GROMACS may only be temporary and are expected to be resolved in the future (I hope so).

This work is licensed under a Creative Commons Attribution 4.0 International License.
Tags: Benchmark, Molecular dynamics, Computer hardware
Last updated: 2023-08-29

COPYRIGHT © 2021-2023 enthalpy.space. ALL RIGHTS RESERVED.
