#!/usr/bin/python3

# Use GNU compilers:
#
# Note: cray-libsci provides BLAS etc. In summary, we have
#   module use /soft/modulefiles
#   module unload darshan
#   module load cudatoolkit-standalone/12.4.1 PrgEnv-gnu cray-libsci
#
# $ module list
# Currently Loaded Modules:
#   1) libfabric/1.15.2.0       6) nghttp2/1.57.0-ciat5hu          11) cray-dsmml/0.2.2     16) craype-x86-milan
#   2) craype-network-ofi       7) curl/8.4.0-2ztev25              12) cray-mpich/8.1.28    17) PrgEnv-gnu/8.5.0
#   3) perftools-base/23.12.0   8) cmake/3.27.7                    13) cray-pmi/6.1.13      18) cray-libsci/23.12.5
#   4) gcc-native/12.3          9) cudatoolkit-standalone/12.4.1   14) cray-pals/1.3.4
#   5) spack-pe-base/0.6.1     10) craype/2.7.30                   15) cray-libpals/1.3.4

if __name__ == '__main__':
  import sys
  import os
  sys.path.insert(0, os.path.abspath('config'))
  import configure
  configure_options = [
    '--with-cc=cc',
    '--with-cxx=CC',
    '--with-fc=ftn',
    '--with-debugging=0',
    '--with-cuda',
    '--with-cudac=nvcc',
    # There is no easy way to auto-detect the CUDA arch on the GPU-less Polaris login nodes, so we set it explicitly.
    '--with-cuda-arch=80',
    '--download-kokkos',
    '--download-kokkos-kernels',
    '--download-hypre',
  ]
  configure.petsc_configure(configure_options)

# Use NVHPC compilers
#
# Unset so that cray won't add -gpu to nvc even when craype-accel-nvidia80 is loaded
#   unset CRAY_ACCEL_TARGET
#   module load nvhpc/22.11 PrgEnv-nvhpc
#
# I ran into two problems with nvhpc and Kokkos (and Kokkos-Kernels) 4.2.0.
# 1) Kokkos-Kernels failed at configuration to find TPL cublas and cusparse from NVHPC.
#    As a workaround, I just load cudatoolkit-standalone/11.8.0 to let KK use cublas and cusparse from cudatoolkit-standalone.
# 2) KK failed at compilation
#    "/home/jczhang/petsc/arch-kokkos-dbg/externalpackages/git.kokkos-kernels/batched/dense/impl/KokkosBatched_Gemm_Serial_Internal.hpp", line 94: error: expression must have a constant value
#      constexpr int nbAlgo = Algo::Gemm::Blocked::mb();
#                             ^
#    "/home/jczhang/petsc/arch-kokkos-dbg/externalpackages/git.kokkos-kernels/blas/impl/KokkosBlas_util.hpp", line 58: note: cannot call non-constexpr function "__builtin_is_device_code" (declared implicitly)
#      KOKKOS_IF_ON_HOST((return 4;))
#      ^
#    detected during:
#
# It is a KK problem and I have to wait for their fix.