PGI User Forum


pgi 11.8 and openmpi 1.4.3
franzisko



Joined: 11 Jan 2011
Posts: 25

Posted: Wed Nov 09, 2011 8:45 am    Post subject: pgi 11.8 and openmpi 1.4.3

Hello,

I compiled OpenMPI 1.4.3 with PGI 11.8, trying different options (including the options suggested on the PGI website). I get a segfault when launching a hello world test program with any number of processors, although sometimes the run goes OK. It happens on Intel Westmere, while everything seems OK on AMD Barcelona cores.

Please help me, because this is the only way for me to use CUDA Fortran!

thanks, Francesco
mkcolg



Joined: 30 Jun 2004
Posts: 5815
Location: The Portland Group Inc.

Posted: Wed Nov 09, 2011 9:01 am    Post subject:

Hi Francesco,

Which system did you compile it on? Is the error actually an illegal instruction (sig 4)?

Can you run your program in the PGI debugger, pgdbg, and determine where the error occurs?

- Mat
franzisko



Joined: 11 Jan 2011
Posts: 25

Posted: Wed Nov 09, 2011 11:01 am    Post subject:

Hi Mat,

Thanks for the fast reply.

The error is:
Code:
Signal: Segmentation fault (11). Signal code: Address not mapped (1). Failing at address (nil)
.

The operating system is Scientific Linux SL release 5.7 (the same as on the other node, which works fine).

Launching the debugger like this (if this is the correct way):

Code:
mpirun -np 4 pgdbg a.out


and starting the four processes gives:

Code:
Signalled SIGSEGV at 0x2B694A82B6EF, function ___vsnprintf_chk, file interp.c line 1217


Maybe it occurs in libnuma.so.1.

thanks,
Francesco
mkcolg



Joined: 30 Jun 2004
Posts: 5815
Location: The Portland Group Inc.

Posted: Wed Nov 09, 2011 2:23 pm    Post subject:

Hi Francesco,

Did you build OpenMPI on the Westmere system? If not, give that a try and see if there is some issue between the two systems.

I did have a similar issue with OpenMPI due to inconsistent compiler versions. The OpenMPI library I was using was built with PGI 10.9 while the application was built with 11.8. The differing runtime library version caused all applications to segv. Could a similar issue be occurring here? Run the "ldd" command on OpenMPI's mpiexec and your test program. Are the runtime libraries consistent?

- Mat
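
The `ldd` check suggested above could be scripted roughly as follows. This is a minimal sketch and not something posted in the thread: the helper names `parse_ldd` and `compare_libs` are made up for illustration, and `./a.out` stands in for the test program.

```shell
#!/bin/sh
# Sketch only: parse_ldd and compare_libs are hypothetical helper names.

# Reduce `ldd` output (read from stdin) to "library resolved-path" pairs.
# Lines without "=>" (e.g. the vDSO or the dynamic loader) are skipped.
parse_ldd() {
    awk '$2 == "=>" { print $1, $3 }'
}

# Given two parsed listings, print each library that appears in both
# but resolves to a different path in each.
compare_libs() {
    printf '%s\n---\n%s\n' "$1" "$2" | awk '
        $0 == "---" { in_second = 1; next }
        !in_second  { path[$1] = $2; next }
        ($1 in path) && path[$1] != $2 { print $1 ": " path[$1] " vs " $2 }
    '
}

# Example (assumes mpiexec is on PATH and ./a.out is the test program):
#   compare_libs "$(ldd "$(command -v mpiexec)" | parse_ldd)" \
#                "$(ldd ./a.out | parse_ldd)"
```

Empty output means every shared library common to both binaries resolves to the same file. Any line printed names a library that the two binaries pick up from different locations, which is the kind of runtime mismatch described above.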
franzisko



Joined: 11 Jan 2011
Posts: 25

Posted: Thu Nov 10, 2011 5:09 am    Post subject:

Hello Mat,

Unfortunately, even when compiling on the same platform, the segfault is still there. Everything (or at least the hello world example) seems fine using PGI 11.3, and also when compiling with PGI 11.8 and then artificially using the dynamic library from the 11.3 release.

thanks, Francesco
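
One way to run an 11.8-built binary against the 11.3 runtime, as described above, is to put the older library directory first in `LD_LIBRARY_PATH`. This is only a sketch under that assumption; the post does not say how it was actually done. `with_lib_dir` is a hypothetical helper, and the library path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Sketch only: with_lib_dir is a hypothetical helper name.

# Run a command with an extra directory searched first for shared libraries.
# $1 is the library directory; the remaining arguments are the command.
with_lib_dir() {
    lib_dir=$1
    shift
    # Prepend lib_dir, keeping any existing LD_LIBRARY_PATH after it.
    env LD_LIBRARY_PATH="$lib_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$@"
}

# Example (placeholder path; Open MPI's -x option exports the variable
# to the launched processes):
#   with_lib_dir /path/to/pgi-11.3/libso mpirun -x LD_LIBRARY_PATH -np 4 ./a.out
```

The change applies to that one command only, so the rest of the shell session keeps using the default runtime libraries.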
All times are GMT - 7 Hours
Page 1 of 3

 