PGI User Forum

Coredump in __hpf_dealloc (compiled with pgf95 6.0)

 
TobiasB



Joined: 09 Mar 2006
Posts: 5

Posted: Sat Mar 11, 2006 12:10 pm    Post subject: Coredump in __hpf_dealloc (compiled with pgf95 6.0)

Hello,

Using pgf95 6.0-5 64-bit target on x86-64 Linux, I get a coredump in my program. The debugger only spits out:

Loaded: /home/t/fleur-bin/fleur_r_debug.x core.31420
Signalled SIGSEGV at 0x2A9633CDD6, function __hpf_dealloc
2A9633CDD6: 48 39 6F 8 cmpq %rbp,8(%rdi)

Any idea how to debug? (Compiled with -g -r8 -Mbounds.)
The problem is that the program crashes only after about 90 minutes of number crunching, which does not make debugging any easier.

Tobias

PS: Side question: How compatible is PGI 6.0 with PGI 6.1?
The reason is that I'd like to use the AMD Math Core Library 3.1 (compiled with 6.1 and available next week), and our admins have only 6.0 installed. (If it only works with 6.1, I need to push them to upgrade ;-)
TobiasB



Joined: 09 Mar 2006
Posts: 5

Posted: Sat Mar 11, 2006 2:08 pm

Googling around a bit, I found something similar in this forum post:
http://www.pgroup.com/userforum/viewtopic.php?t=234&sid=d52a6486349afee02a7e888486135bb3

That was with 6.0-4 (I have 6.0-5) and had the tracking number TPR #3551.
I tried that program with 6.0-5 and it still crashes. What is the current status of that issue?

* * *

I tried to run my program "fleur_r_debug.x" in pgdbg, but "where" does not give any further information after the crash.
However, "gdb" spits out more information (with "where"/"bt"):

#0 0x0000002a9633cdd6 in __hpf_dealloc () from /opt/testing/pgi-6/linux86-64/6.0/libso/libpgf90.so
#1 0x0000002a9633cf5c in pgf90_dealloc () from /opt/testing/pgi-6/linux86-64/6.0/libso/libpgf90.so
#2 0x000000000068cc23 in m_tlmplm_tlmplm_ () at tlmplm.F:324
#3 0x00000000008a2c25 in m_eigen_eigen_ () at eigen.F:434
#4 0x0000000000926360 in fleur () at fleur.F:842
#5 0x00000000004031aa in main ()

Line 324 is "END SUBROUTINE tlmplm", and the file contains only arrays passed as INTENT(IN)/INTENT(OUT) and local arrays whose size depends on the passed arguments. No explicit ALLOCATE/DEALLOCATE is present.
Something must have gone wrong with the automatic deallocation of the local arrays, I presume. But I have no idea how to (a) work around this problem or (b) investigate it.

* * *

How come gdb supports the nice "where" whereas "pgdbg" does not?

Regards,

Tobias
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Mon Mar 13, 2006 2:49 pm

Hi Tobias,

Quote:
That was with 6.0-4 (I have 6.0-5) and had the tracking number TPR #3551.
I tried that program with 6.0-5 and it still crashes. What is the current status of that issue?


TPR #3551 was fixed as of the 6.1-1 release. The problem was in how the compiler generated temporary arrays with a negative stride, which may or may not be the cause of your error. Please either install the demo version of 6.1 (found here) or send a report to trs@pgroup.com.
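
For readers unfamiliar with the term: an array section with a negative stride, such as a(n:1:-1), is not contiguous in memory, so a compiler will typically copy it into a temporary before passing it to a routine that expects a contiguous array, and release that temporary after the call. The sketch below only illustrates this pattern, together with an automatic local array of the kind described earlier in the thread. It is not the TPR #3551 test case, the names stride_demo, scale_in_place and scale_reversed are made up for illustration, and it is not known whether this code triggers the crash with 6.0.

```fortran
module stride_demo
  implicit none
contains

  ! Explicit-shape dummy argument: the callee expects n contiguous elements.
  subroutine scale_in_place(n, x, factor)
    integer, intent(in)             :: n
    double precision, intent(inout) :: x(n)
    double precision, intent(in)    :: factor
    ! Automatic array: sized from the argument, released on return.
    double precision :: work(n)

    work = x * factor
    x = work
  end subroutine scale_in_place

  ! Passes a reversed (negative-stride) section. The compiler may create a
  ! temporary copy of the section, call the routine, copy the result back,
  ! and free the temporary afterwards.
  subroutine scale_reversed(a, factor)
    double precision, intent(inout) :: a(:)
    double precision, intent(in)    :: factor

    call scale_in_place(size(a), a(size(a):1:-1), factor)
  end subroutine scale_reversed

end module stride_demo
```

Calling scale_reversed(a, 2.0d0) should simply double every element of a. If a small case like this behaves differently under 6.0 and 6.1, that would be useful to include in a report.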

Quote:
How come gdb supports the nice "where" whereas "pgdbg" does not?

pgdbg has a problem with the stacktrace from within code not compiled with "-g". Our tools group is aware of the issue and should have it fixed later this year.

Quote:
PS: Side question: How compatible is PGI 6.0 with PGI 6.1?
The reason is that I'd like to use the AMD Math Core Library 3.1 (compiled with 6.1 and available next week), and our admins have only 6.0 installed. (If it only works with 6.1, I need to push them to upgrade ;-)


Not every major release has compatibility issues, and when they do occur they are generally confined to pgf90-compiled code, since optimizations may change how modules are constructed. We did also change the C++ STL between 5.2 and 6.0, but major changes of this kind rarely occur. While I'm not 100% sure, I believe that ACML is built with pgf77, so it should not have compatibility problems.

- Mat
All times are GMT - 7 Hours