#!/usr/bin/python3

# Use GNU compilers:
#
# module load cudatoolkit-standalone PrgEnv-gnu cray-libsci
#
# Note cray-libsci provides BLAS etc. In summary, we have
#
# module load cudatoolkit-standalone/11.8.0 PrgEnv-gnu gcc/10.3.0 cray-libsci
#
# $ module list
# Currently Loaded Modules:
#   1) craype-x86-rome          5) craype-accel-nvidia80           9) cray-dsmml/0.2.2     13) PrgEnv-gnu/8.3.3
#   2) libfabric/1.15.2.0       6) cmake/3.23.2                   10) cray-pmi/6.1.10      14) cray-libsci/23.02.1.1
#   3) craype-network-ofi       7) cudatoolkit-standalone/11.8.0  11) cray-pals/1.2.11     15) gcc/10.3.0
#   4) perftools-base/23.03.0   8) craype/2.7.20                  12) cray-libpals/1.2.11  16) cray-mpich/8.1.25

if __name__ == '__main__':
  import sys
  import os
  sys.path.insert(0, os.path.abspath('config'))
  import configure
  configure_options = [
    '--with-cc=cc',
    '--with-cxx=CC',
    '--with-fc=ftn',
    '--with-debugging=0',
    '--with-cuda',
    '--with-cudac=nvcc',
    '--with-cuda-arch=80', # Since there is no easy way to auto-detect the cuda arch on the gpu-less Polaris login nodes, we explicitly set it.
    '--download-kokkos',
    '--download-kokkos-kernels',
  ]
  configure.petsc_configure(configure_options)

# Use NVHPC compilers
#
# Unset CRAY_ACCEL_TARGET so that Cray won't add -gpu to nvc even when craype-accel-nvidia80 is loaded
# unset CRAY_ACCEL_TARGET
# module load nvhpc/22.11 PrgEnv-nvhpc
#
# I ran into two problems with nvhpc and Kokkos (and Kokkos-Kernels) 4.2.0.
# 1) Kokkos-Kernels (KK) failed during configuration to find the TPLs cublas and cusparse from NVHPC.
#    As a workaround, I loaded cudatoolkit-standalone/11.8.0 so that KK uses cublas and cusparse from cudatoolkit-standalone.
# 2) KK failed at compilation with
# "/home/jczhang/petsc/arch-kokkos-dbg/externalpackages/git.kokkos-kernels/batched/dense/impl/KokkosBatched_Gemm_Serial_Internal.hpp", line 94: error: expression must have a constant value
#     constexpr int nbAlgo = Algo::Gemm::Blocked::mb();
#                            ^
# "/home/jczhang/petsc/arch-kokkos-dbg/externalpackages/git.kokkos-kernels/blas/impl/KokkosBlas_util.hpp", line 58: note: cannot call non-constexpr function "__builtin_is_device_code" (declared implicitly)
#           KOKKOS_IF_ON_HOST((return 4;))
#           ^
#           detected during:
#
# It is a KK problem and I have to wait for their fix.