Clarification on using OpenACC in a shared library

DavidGutzwiller
Posts: 10
Joined: Jan 30 2015

Post by DavidGutzwiller » Fri Sep 27, 2019 5:08 pm

Hello,

I am reworking an existing standalone C/C++/OpenACC CFD solver as a shared library for inclusion in a larger application. From a high level, it looks like this:
main application, compiled and linked with gcc or PGI.
-- geometry, mesh, graphics, etc libraries accessed with dlopen, all compiled with gcc.
-- solver library accessed with dlopen, compiled with PGI and OpenACC support.
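
As a rough illustration of the layout above, the parent application loads the solver along these lines. This is a minimal sketch, not the real code: the library path, the symbol name `solver_run`, and the `int (*)(void)` signature are placeholders.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Signature assumed for the hypothetical solver entry point. */
typedef int (*solver_entry_fn)(void);

/* Load lib_path, look up symbol, and call it.
 * Returns -1 if the library cannot be loaded, -2 if the symbol is missing,
 * otherwise whatever the entry point returns (assumed non-negative here). */
int call_solver_entry(const char *lib_path, const char *symbol)
{
    void *handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before dlsym */
    solver_entry_fn entry;
    *(void **)(&entry) = dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL || entry == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -2;
    }

    int rc = entry();
    dlclose(handle);
    return rc;
}
```

The parent would call something like `call_solver_entry("./libsolver.so", "solver_run")`. On some systems the parent also needs `-ldl` at link time.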

A first test with OpenACC disabled was successful: the solver library ran as expected. However, when I compile with OpenACC enabled I encounter some issues. The solver starts up as expected and nvidia-smi shows the expected number of MPI processes assigned to the targeted device (set with CUDA_VISIBLE_DEVICES). However, the application hangs at the first acc API call (acc_get_memory() in this case).
I have seen other postings indicating that the -ta=tesla:nordc flag must be used when preparing a shared library. Is this still the recommended solution for PGI 19.X? This is unfortunately a problem for my solver, which is a big C/C++ monster with extensive use of acc routine and global variables.
A fallback would be to link in the solver statically, but that would go against the larger design philosophy of this application. I also intend to try inlining, but it is going to be a lot of inlined code, which makes me a bit wary.

Thanks for your help,
-David

brentl
Posts: 256
Joined: Jul 20 2004

Re: Clarification on using OpenACC in a shared library

Post by brentl » Mon Sep 30, 2019 8:26 am

The issue requiring nordc in shared libraries was addressed in PGI 19.1, so if you use any 2019 PGI compiler, you should not have to worry about that.

Can you link with pgc++? In that case, we will put the proper init section in. If not, I think there is still a way. Mat is the expert in this area, but unfortunately he is out this week. I'll do a little checking.

DavidGutzwiller
Posts: 10
Joined: Jan 30 2015

Re: Clarification on using OpenACC in a shared library

Post by DavidGutzwiller » Mon Sep 30, 2019 10:38 am

Thanks for the response.

Here is where I currently stand:

If I compile the solver library without nordc and link with pgc++ there is a hang at the first acc_* API call. If I compile and link with pgc++ with nordc the acc_* API calls work. However, the solver library then crashes at the first parallel region. I had to strip out a lot of the code to make this compile at all with nordc, so it is possible I broke something on the way. I'll continue with this and return to this thread when I have more details.

Could you elaborate on what you mean by "We will put the proper init section in"? Is this somehow related to acc_init()? When you say "link with pgc++", are you referring to the solver library or the parent application?
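
For reference, explicit initialization from inside the library would look roughly like the sketch below. The function name `solver_acc_startup` is made up for illustration, and whether an explicit acc_init() avoids the hang is not established in this thread. The `_OPENACC` guards only let the same file build with a non-OpenACC compiler.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical startup hook for the solver library.
 * Returns the number of NVIDIA devices visible to the OpenACC runtime,
 * or 0 when the file is built without OpenACC support. */
int solver_acc_startup(void)
{
#ifdef _OPENACC
    int n_devices = acc_get_num_devices(acc_device_nvidia);
    if (n_devices > 0) {
        /* Initialize the runtime up front instead of at the first directive. */
        acc_init(acc_device_nvidia);
    }
    return n_devices;
#else
    return 0;
#endif
}
```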

-David

brentl
Posts: 256
Joined: Jul 20 2004

Re: Clarification on using OpenACC in a shared library

Post by brentl » Mon Sep 30, 2019 10:57 am

I mean the parent application. An init section is something that gets called when the program is loaded, before "main" is called.
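
To make the idea concrete, here is a generic sketch of code that runs at load time, using the GCC-style constructor attribute. It is only an analogy and does not show what pgc++ actually emits; the names `solver_runtime_startup` and `solver_runtime_is_ready` are invented for the example.

```c
#include <stdio.h>

/* Set by the constructor below so later code can check that startup ran. */
static int runtime_ready = 0;

/* GCC-compatible extension: this function runs when the executable
 * (or a shared library containing it) is loaded, before main() is entered. */
__attribute__((constructor))
static void solver_runtime_startup(void)
{
    runtime_ready = 1;
}

/* Returns 1 once the load-time hook has run, 0 otherwise. */
int solver_runtime_is_ready(void)
{
    return runtime_ready;
}
```

Calling `solver_runtime_is_ready()` as the first statement of `main` would already return 1, since the hook has run by then.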

DavidGutzwiller
Posts: 10
Joined: Jan 30 2015

Re: Clarification on using OpenACC in a shared library

Post by DavidGutzwiller » Tue Oct 01, 2019 9:15 am

Another update:

After some more trial and error I was able to get acc_* API calls and a few simple parallel loops working from the solver library if I do the following:

- Remove all pragma acc declare statements for global variables.
- Compile and link the solver library with pgc++ using the nordc option.
- Link the parent application with pgc++.

The nordc argument seems to be critical, at least for my application. I'm building with PGI 19.4; do you think it is worth upgrading to a newer version?

Moving on to pragma acc routine: is there a way to force pgc++ to inline specific functions via a preprocessing macro? I'm dealing with a complicated make system that makes it messy to force inlining through the -Minline command line arguments.
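
To show the kind of macro I have in mind, here is a minimal sketch. `SOLVER_DEVICE_FN`, `SOLVER_FORCE_INLINE`, `flux_average`, and `face_averages` are names made up for this example. `_Pragma` is standard C99 and `always_inline` is a GCC-style attribute; whether pgc++ honors that attribute for device code is an assumption I have not verified.

```c
#include <stddef.h>

/* One switch for all small device helpers:
 * - SOLVER_FORCE_INLINE defined: request inlining (GCC-style attribute;
 *   support in pgc++ is an assumption to verify).
 * - otherwise, with OpenACC enabled: mark the helper as an acc routine.
 * - otherwise: plain host build. */
#if defined(SOLVER_FORCE_INLINE)
#  define SOLVER_DEVICE_FN static inline __attribute__((always_inline))
#elif defined(_OPENACC)
#  define SOLVER_DEVICE_FN _Pragma("acc routine seq") static inline
#else
#  define SOLVER_DEVICE_FN static inline
#endif

/* Small helper called from inside the parallel loop. */
SOLVER_DEVICE_FN double flux_average(double left, double right)
{
    return 0.5 * (left + right);
}

/* u holds n_faces + 1 cell values; out receives n_faces face averages. */
void face_averages(const double *u, double *out, size_t n_faces)
{
    #pragma acc parallel loop copyin(u[0:n_faces + 1]) copyout(out[0:n_faces])
    for (size_t i = 0; i < n_faces; ++i) {
        out[i] = flux_average(u[i], u[i + 1]);
    }
}
```

With this, the switch can live in a single header as a `#define SOLVER_FORCE_INLINE` rather than in per-target compiler flags.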

Thanks,

-David
