This code works correctly on a system with an Intel Core i7 and a GeForce GTX 650 Ti, on both the CPU and the GPU, when compiled with the PGI compiler flag "-Minline", with or without "-fast".
I tried to compile it on a Xeon Phi Knights Landing (KNL) system with an Nvidia TITAN V GPU attached to the KNL host via PCI-Express.
I used the compile line given in README.md of the repository.
CUDA Toolkit 9.2 is installed on the system; the OS is Debian, x86_64, and the compiler is PGI 19.4 pgc++.
But the compilation fails with an error (ERROR.txt in the repository):
...
pgc++-Error-CUDA Fortran is not supported on Knights Landing host systems
pgc++-Error-OpenACC for Tesla GPU targets is not supported on Knights Landing host systems
Previously I compiled a similar but simpler and smaller C++ code (using cudaMemcpy() and the compiler flags "-Mnollvm -Mcuda8.0") with OpenACC and the PGI 18.x pgc++ compiler, to run on the same Nvidia TITAN V GPU installed in the KNL system, and had no problems.
What is the reason for the error?
And is there a workaround in this situation?
This is a serious problem for me, because my task is to compare the performance of the CPU-optimized code with the GPU-optimized code that I am working on now.