Mcudalib=cublas static linking issue

rerhy
Posts: 14
Joined: Jul 25 2019


Post by rerhy » Wed Dec 04, 2019 6:57 am

Hi,

I use the cuBLAS library in my code, so I pass the -Mcudalib=cublas flag to link it. Linking against the shared version of the library works. However, if I link against the static version (-Bstatic --> libcublas_static.a), I get missing references, because libcublas also needs libcublasLt, which does not seem to be provided properly with the static flag. As far as I know, an additional -Mcudalib=cublasLt is not supported, so shouldn't libcublasLt also be provided automatically when using the -Mcudalib=cublas flag?

Many thanks.
Regards,
Reto

mkcolg
Posts: 8137
Joined: Jun 30 2004

Re: Mcudalib=cublas static linking issue

Post by mkcolg » Wed Dec 04, 2019 11:03 am

Hi Reto,

We actually do implicitly include cublasLt_static.a on the link line when linking statically. The problem here is that even the static versions of cuBLAS (as well as other CUDA libraries) include some dynamic loads, which then cause undefined reference errors (to things like dlopen, dlclose, dlsym, etc.). Hence you can't create a completely static executable.

Instead of "-Bstatic", try using the "-Bstatic_pgi" flag. In this case, we'll link the PGI runtime as well as the CUDA libraries statically, but link the system libraries dynamically.

For example (note that I'm using the verbose "-v" flag, to show what the actual link line looks like):

Code:

% pgfortran -Mcuda -Mcudalib=cublas -v test_cublas.o -Bstatic_pgi -V19.10
Export PGI_CURR_CUDA_HOME=/proj/pgi/linux86-64-llvm/2019/cuda/10.1
Export PGI=/proj/pgi

/proj/pgi/linux86-64-llvm/19.10/bin/pgacclnk -nvidia /proj/pgi/linux86-64-llvm/19.10/bin/pgnvd -cuda10010 -cudaroot /proj/pgi/linux86-64-llvm/2019/cuda/10.1 -cudalink -computecap=70 -v /usr/bin/ld /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o /proj/pgi/linux86-64-llvm/19.10/lib/trace_init.o /usr/lib/gcc/x86_64-linux-gnu/7/crtbegin.o /proj/pgi/linux86-64-llvm/19.10/lib/f90main.o --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /proj/pgi/linux86-64-llvm/19.10/lib/pgi.ld -L/proj/pgi/linux86-64-llvm/2019/cuda/10.1/lib64 -L/proj/pgi/linux86-64-llvm/19.10/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-linux-gnu/7 test_cublas.o -rpath /proj/pgi/linux86-64-llvm/19.10/lib -rpath /proj/pgi/linux86-64-llvm/2019/cuda/10.1/lib64 -rpath /usr/lib/gcc/x86_64-linux-gnu/7/../../../../lib64 -L/usr/lib/gcc/x86_64-linux-gnu/7/../../../../lib64 -Bstatic -Bdynamic -Bstatic -lcudafor101 -lcudafor -lcudaforblas /proj/pgi/linux86-64-llvm/19.10/lib/cuda_init_register_end.o -Bdynamic -Bstatic -lcublas_static -lcublasLt_static -lculibos -lcudaforblas -Bdynamic -L/proj/pgi/linux86-64-llvm/2019/cuda/10.1/lib64 -Bstatic -lcudadevrt -lcudart_static -Bdynamic -ldl -Bstatic -lcudafor2 -Bdynamic -Bstatic -lpgf90rtl -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lpgatm -lpgkomp -lomp -as-needed -lomptarget -no-as-needed -Bdynamic -Bstatic -Bdynamic -lpthread -Bstatic --start-group -lpgmath -lnspgc -lpgc --end-group -Bdynamic -lrt -lpthread -lm -lgcc -lc -lgcc -lgcc_s -lstdc++ /usr/lib/gcc/x86_64-linux-gnu/7/crtend.o /usr/lib/x86_64-linux-gnu/crtn.o
/proj/pgi/linux86-64-llvm/19.10/bin/pgnvd -dcuda /proj/pgi/linux86-64-llvm/2019/cuda/10.1 /proj/pgi/linux86-64-llvm/19.10/lib/trace_init.o /proj/pgi/linux86-64-llvm/19.10/lib/f90main.o -L/proj/pgi/linux86-64-llvm/2019/cuda/10.1/lib64 -L/proj/pgi/linux86-64-llvm/19.10/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-linux-gnu/7 test_cublas.o -L/usr/lib/gcc/x86_64-linux-gnu/7/../../../../lib64 -lcudafor101 -lcudafor -lcudaforblas /proj/pgi/linux86-64-llvm/19.10/lib/cuda_init_register_end.o -lcublas_static -lcublasLt_static -lculibos -L/proj/pgi/linux86-64-llvm/2019/cuda/10.1/lib64 -lcudadevrt -lcudart_static -ldl -lcudafor2 -lpgf90rtl -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgftnrtl -lpgatm -lpgkomp -lomp -lomptarget -lpthread -lpgmath -lnspgc -lpgc -lrt -lpthread -lm -lgcc -lc -lgcc_s -lstdc++ -dolink -cuda10010 -computecap 70 -o /tmp/pgcuda_Yec6HXaVL0G.cubin -regobj /tmp/pgcudaregoYecQ8Hv_1fl.o
/proj/pgi/linux86-64-llvm/19.10/bin/pgnvd -fatobj /tmp/pgcudafatUYeckX99qtzm.o -o /tmp/pgcudafatUYeckX99qtzm.o -cuda10010 -dcuda /proj/pgi/linux86-64-llvm/2019/cuda/10.1 -cudalink -sm 70 /tmp/pgcuda_Yec6HXaVL0G.cubin
/usr/bin/ld /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o /tmp/pgcudafatUYeckX99qtzm.o /tmp/pgcudaregoYecQ8Hv_1fl.o /proj/pgi/linux86-64-llvm/19.10/lib/trace_init.o /usr/lib/gcc/x86_64-linux-gnu/7/crtbegin.o /proj/pgi/linux86-64-llvm/19.10/lib/f90main.o --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /proj/pgi/linux86-64-llvm/19.10/lib/pgi.ld -L/proj/pgi/linux86-64-llvm/2019/cuda/10.1/lib64 -L/proj/pgi/linux86-64-llvm/19.10/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-linux-gnu/7 test_cublas.o -rpath /proj/pgi/linux86-64-llvm/19.10/lib -rpath /proj/pgi/linux86-64-llvm/2019/cuda/10.1/lib64 -rpath /usr/lib/gcc/x86_64-linux-gnu/7/../../../../lib64 -L/usr/lib/gcc/x86_64-linux-gnu/7/../../../../lib64 -Bstatic -Bdynamic -Bstatic -lcudafor101 -lcudafor -lcudaforblas /proj/pgi/linux86-64-llvm/19.10/lib/cuda_init_register_end.o -Bdynamic -Bstatic -lcublas_static -lcublasLt_static -lculibos -lcudaforblas -Bdynamic -L/proj/pgi/linux86-64-llvm/2019/cuda/10.1/lib64 -Bstatic -lcudadevrt -lcudart_static -Bdynamic -ldl -Bstatic -lcudafor2 -Bdynamic -Bstatic -lpgf90rtl -lpgf90 -lpgf90_rpm1 -lpgf902 -lpgf90rtl -lpgftnrtl -lpgatm -lpgkomp -lomp -as-needed -lomptarget -no-as-needed -Bdynamic -Bstatic -Bdynamic -lpthread -Bstatic --start-group -lpgmath -lnspgc -lpgc --end-group -Bdynamic -lrt -lpthread -lm -lgcc -lc -lgcc -lgcc_s -lstdc++ /usr/lib/gcc/x86_64-linux-gnu/7/crtend.o /usr/lib/x86_64-linux-gnu/crtn.o
Hope this helps,
Mat


Re: Mcudalib=cublas static linking issue

Post by rerhy » Fri Dec 06, 2019 1:28 am

Hi Mat,

Thanks for your reply. Actually, I just realized that we already use -Bstatic_pgi with the PGI compiler, and I still get tons of undefined references. I have attached a small selection below. Maybe it already rings a bell? However, if I add libcublas_static and libcublasLt_static by hand, it seems to work fine.

Regards,
Reto

Code:

/remote/tcadprod/depot/linux/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas_static.a(cublas.o): In function `__static_initialization_and_destruction_0(int, int)':
cublas.compute_75.cudafe1.cpp:(.text+0x1ac): undefined reference to `CublasGPVar::GPVar::GPVar(char const*, int)'
/remote/tcadprod/depot/linux/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas_static.a(cublas.o): In function `cublasSetBackdoor':
cublas.compute_75.cudafe1.cpp:(.text+0x257): undefined reference to `CublasGPVar::GPVar::SetValue(char const*, char, void*)'
/remote/tcadprod/depot/linux/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas_static.a(cublas.o): In function `cublasGetBackdoor':
cublas.compute_75.cudafe1.cpp:(.text+0x289): undefined reference to `CublasGPVar::GPVar::GetValue(char const*, char, void*)'
/remote/tcadprod/depot/linux/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas_static.a(cublas.o): In function `cublasCtxInit(cublasContext**)':
cublas.compute_75.cudafe1.cpp:(.text+0x33b): undefined reference to `cublasFixedSizePoolWithGraphSuppport::cublasFixedSizePoolWithGraphSuppport()'
cublas.compute_75.cudafe1.cpp:(.text+0x343): undefined reference to `cublasFixedSizePoolWithGraphSuppport::cublasFixedSizePoolWithGraphSuppport()'
cublas.compute_75.cudafe1.cpp:(.text+0x34b): undefined reference to `cublasLtCtxInit'
cublas.compute_75.cudafe1.cpp:(.text+0x39b): undefined reference to `cublasFixedSizePoolWithGraphSuppport::init(cublasContext*, int, int)'
cublas.compute_75.cudafe1.cpp:(.text+0x3bd): undefined reference to `cublasFixedSizePoolWithGraphSuppport::init(cublasContext*, int, int)'
cublas.compute_75.cudafe1.cpp:(.text+0x417): undefined reference to `init_gemm_select'
/remote/tcadprod/depot/linux/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas_static.a(cublas.o): In function `cublasGetProperty':
cublas.compute_75.cudafe1.cpp:(.text+0x2257): undefined reference to `cublasLtGetProperty'
/remote/tcadprod/depot/linux/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas_static.a(cublas.o): In function `cublasGetVersion_v2':
cublas.compute_75.cudafe1.cpp:(.text+0x484a): undefined reference to `cublasLtGetVersion'
/remote/tcadprod/depot/linux/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas_static.a(cublas.o): In function `cublasDestroy_v2':
cublas.compute_75.cudafe1.cpp:(.text+0x4fac): undefined reference to `cublasFixedSizePoolWithGraphSuppport::tearDown()'
cublas.compute_75.cudafe1.cpp:(.text+0x4fb4): undefined reference to `cublasFixedSizePoolWithGraphSuppport::tearDown()'
cublas.compute_75.cudafe1.cpp:(.text+0x4fc9): undefined reference to `cublasLtShutdownCtx'
cublas.compute_75.cudafe1.cpp:(.text+0x4fd1): undefined reference to `cublasFixedSizePoolWithGraphSuppport::~cublasFixedSizePoolWithGraphSuppport()'
cublas.compute_75.cudafe1.cpp:(.text+0x4fd9): undefined reference to `cublasFixedSizePoolWithGraphSuppport::~cublasFixedSizePoolWithGraphSuppport()'
cublas.compute_75.cudafe1.cpp:(.text+0x5064): undefined reference to `free_gemm_select'



Re: Mcudalib=cublas static linking issue

Post by mkcolg » Fri Dec 06, 2019 8:28 am

Hi Reto,

As you can see from the verbose output I posted above, "-lcublas_static -lcublasLt_static" are being included. What's happening in your case is unclear.

Can you do something similar and post your link command with the verbose flag (-v), so we can see which libraries are being included?
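
If the verbose output is long, a quick way to check it for the libraries in question is sketched below. This is only an illustration: check_cublas_libs is a made-up helper name (not a PGI tool), link.log is a hypothetical file name, and it assumes the -v output was captured to that file first. The three library names are the ones from the link line shown earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PGI toolchain): report which of the
# static cuBLAS libraries named in this thread appear in a saved link log.
#
# Assumed capture step (redirecting both streams, since we do not rely on
# knowing which one the driver writes the verbose output to):
#   pgfortran -Mcuda -Mcudalib=cublas -v test_cublas.o -Bstatic_pgi > link.log 2>&1
#
# Usage: check_cublas_libs link.log
check_cublas_libs() {
    for lib in cublas_static cublasLt_static culibos; do
        # Split the log into one token per line, then look for an exact
        # -l<lib> token so that -lcublas_static does not match a longer name.
        if tr ' ' '\n' < "$1" | grep -qxF -e "-l${lib}"; then
            echo "found:   -l${lib}"
        else
            echo "missing: -l${lib}"
        fi
    done
}
```

Any line reported as "missing" would point at a library that the driver did not put on the link line in that particular setup.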

Thanks,
Mat
