PGI User Forum

OpenMP "Signalled ACCESS_VIOLATION"

 
PGI User Forum Forum Index -> Accelerator Programming
dcwarren



Joined: 18 Jun 2012
Posts: 29

Posted: Thu Feb 28, 2013 11:36 pm    Post subject: OpenMP "Signalled ACCESS_VIOLATION"

Hello again,

I'm still searching for a way to parallelize my code. CUDA and OpenACC both had issues, so I'm now looking at OpenMP (running version 13.2 of PGI Fortran on Windows 7). But the Universe isn't making it easy on me.

If I compile my code with the following command
Code:
pgfortran -o code.exe -mp -Minfo=mp code.f90

then the code crashes before it gets to the parallel region. In fact, it crashes during variable initialization for a called subroutine (inlining doesn't change this). If I run the code in PGI's debugger, I get the following error message:
Quote:

Signalled ACCESS_VIOLATION at 0x1400E4E4C, function _builtin_stinit

Which points me to the following line of assembly code (if this even matters):
Code:
  50                        pushq  %rax

Further observations:
  • If I compile without the -mp flag, everything works fine.
  • I can remove all code associated with OpenMP, including the !$omp lines themselves, and still get this behavior if I compile with the -mp flag.
  • I still get this error even after I've increased my stack size to 512MB using the (DOS) command "set OMP_STACKSIZE=512M".

What's going on here?

Edit: Is this even the right forum for this question? Should I have posted in "Programming and Compiling"?
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Fri Mar 01, 2013 10:27 am    Post subject:

Quote:
What's going on here?
It looks like a stack overflow to me, given that the segv occurs when pushing a value onto the stack.

On Windows, you need to set the stack size at link time using the "-stack" flag.
Code:
PGI$ pgf90 -help -stack
Reading rcfile C:\PROGRA~1\PGI\win64\13.2\bin\pgf90_rc
-stack=[no]check|<reserve>|<commit>
                    Set stack reserve and commit sizes at link time
    [no]check       Disable run-time stack check


You may need to experiment with the exact size to use, but this is the setting I usually start with:

Quote:
-stack=nocheck,39000000,39000000


The "nocheck" sub-option may improve your performance a bit at the cost of reserving the entire commit space upon load rather than incrementally adding it as needed during run time.
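
As an alternative to growing the stack, large local arrays can often be made allocatable so that they live on the heap instead. A minimal sketch, assuming the arrays in question are local to a subroutine (the program, subroutine, and array names here are hypothetical, not taken from the code under discussion):

```fortran
program heap_demo
  implicit none
  call work(100000)
contains
  subroutine work(n)
    integer, intent(in) :: n
    ! Hypothetical work array. Allocatable arrays are normally placed
    ! on the heap, so they should not count against the stack reserve.
    double precision, allocatable :: samples(:)

    allocate(samples(n))
    samples = 0.0d0
    print *, 'allocated elements:', size(samples)
    deallocate(samples)
  end subroutine work
end program heap_demo
```

Whether this helps depends on where the stack space is actually going, so treat it as something to try rather than a guaranteed fix.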

Hope this helps,
Mat
dcwarren



Joined: 18 Jun 2012
Posts: 29

Posted: Mon Mar 04, 2013 9:58 am    Post subject:

mkcolg wrote:

Quote:
-stack=nocheck,39000000,39000000

Hope this helps,
Mat


The local IT guy told me that he thought OpenMP did something weird with the stack, and after some testing I'm inclined to agree.

With the -mp flag, I need to reserve somewhere between 500MB and 600MB for the stack to avoid access violation errors at runtime. This is for a Monte Carlo code whose largest arrays have 100K elements each. Admittedly, there are 13 of them, but that doesn't add up to anywhere near 500MB. Do you have any wisdom you can pass on to me?
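
A rough sanity check of that estimate, assuming 8-byte (double precision) elements, since the element type isn't stated here. The program and parameter names are just for illustration:

```fortran
program footprint
  implicit none
  integer, parameter :: n_arrays = 13
  integer, parameter :: n_elems = 100000
  integer, parameter :: bytes_per_elem = 8   ! assumption: double precision
  real :: total_mb

  ! 13 arrays * 100,000 elements * 8 bytes = 10,400,000 bytes
  total_mb = real(n_arrays * n_elems * bytes_per_elem) / 1.0e6
  print '(a,f6.1,a)', 'Approximate array footprint: ', total_mb, ' MB'
end program footprint
```

Under that assumption the 13 arrays come to about 10.4 MB, roughly a fiftieth of the stack space being reserved.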

Even once I've reserved enough space, I'm also seeing weirdness with how common block variables are handled with the OpenMP flag. If I use "-mp", a particular common block variable loses its value between the main program and a called subroutine. Without the "-mp" flag, everything works as expected. Any thoughts on this also?
dcwarren



Joined: 18 Jun 2012
Posts: 29

Posted: Mon Mar 04, 2013 3:11 pm    Post subject:

dcwarren wrote:
Even once I've reserved enough space, I'm also seeing weirdness with how common block variables are handled with the OpenMP flag. If I use "-mp", a particular common block variable loses its value between the main program and a called subroutine. Without the "-mp" flag, everything works as expected. Any thoughts on this also?


Quoting myself because I may have found the issue. This variable was declared private at the start of the OpenMP block and set inside the block. However, a private copy is separate storage from the common block, so the instance of the variable in the common block is never updated. What the subroutine sees, then, is the uninitialized copy in the common block rather than the initialized private copy belonging to that thread.

Easy enough to test, but I don't have time right now.
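
A minimal sketch of what such a test might look like. The names `blk`, `val`, and `report` are hypothetical stand-ins, not the ones from my actual code:

```fortran
program private_common_demo
  implicit none
  integer :: val
  common /blk/ val

  val = -1                       ! value stored in the common block

  !$omp parallel private(val) num_threads(2)
  val = 42                       ! sets this thread's private copy only
  call report()                  ! reads val through the common block
  !$omp end parallel
end program private_common_demo

subroutine report()
  implicit none
  integer :: val
  common /blk/ val
  ! Sees the common block's storage, not the caller's private copy.
  print *, 'subroutine sees val =', val
end subroutine report
```

Compiled without -mp, the directives are treated as comments and the assignment goes straight to the common block, so the subroutine sees 42. If the test confirms the diagnosis, one possible fix is to drop `val` from the private clause and add `!$omp threadprivate(/blk/)` after the common block declaration in every program unit that declares it, so that each thread gets its own copy of the whole block.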
All times are GMT - 7 Hours