PGI User Forum
Error with 12.3, not with 12.2. Bug?
fspiga



Joined: 21 Feb 2012
Posts: 16

Posted: Fri Apr 13, 2012 5:57 am    Post subject: Error with 12.3, not with 12.2. Bug?

Dear PGI support,

My company recently bought a PGI license after some internal evaluations. I am happy with the compiler and the overall suite (especially OpenACC!!!). I ran some tests myself, compiling my application with PGI 12.2, and had no problems. However, the version we installed after getting the license is 12.3. With it, I get a run-time error during an I/O operation. GDB reports this:


Quote:
[fspiga@gemini1 PW-AUSURF112]$ gdb ../espresso/bin/pw.x core.44246
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-50.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http>...
Reading symbols from /ichec/home/staff/fspiga/QE/espresso/bin/pw.x...done.
[New Thread 44246]
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/15/aeeb89cdee58e81ee8e0ccc5f7c79dac280dcf
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `../espresso/bin/pw.x -input ausurf_gamma.in'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000351347a7cd in realloc () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6_2.5.x86_64
(gdb) debuginfo-install glibc-2.12-1.47.el6_2.5.x86_64
Undefined command: "debuginfo-install". Try "help".
(gdb) bt
#0 0x000000351347a7cd in realloc () from /lib64/libc.so.6
#1 0x0000000001d8571a in fr_init ()
#2 0x0000000001d81919 in pgf90io_fmtr_init2003 ()
#3 0x0000000000825b26 in iotk_getline_x (unit=4,
line=' ' <repeats>, '\000' <repeats>, '?\004\000\000\000\000\000\000????????', '\000' <repeats>, '?\017\000\000\000\000\000\000<PP_INFO>', ' ' <repeats>, '\000' <repeats>, '?=\016??\177\000\000?\a?\001', '\000' <repeats>, ' E\016??\177\000\000\004J\020??\177\000\000?\"n\002\000\000\000\0000?\016??\177\000\000???\001', '\000' <repeats>, ' E\016??\177\000\000TA\016??\177\000\000?\"n\002', '\000' <repeats>, 'tE\016??\177\000\0000C\016??\177\000\000i??'..., length=2305, ierr=0) at ./iotk_scan.F90:1016
#4 0x0000000000824562 in iotk_scan_tag_x (unit=4, direction=1, control=0, tag=' ' <repeats>, binary=.FALSE., stream=.FALSE., ierr=0) at ./iotk_scan.F90:670
#5 0x0000000000825466 in iotk_scan_x (unit=4, direction=1, control=2, name='PP_INFO\000', ' ' <repeats>, '\000',
attr='\000' <repeats>, 'v\222?\0225', '\000' <repeats>, '?R\017??\177\000\000\000\000\000\000\000\000\000\000v\222?\0225\000\000\000\006\000\000\000\000\000\000\000\020S\017??\177\000\000fII\"\000\000\000\000\020S\017??\177\000\000\006\000\000\000\000\000\000\000H?\022\026?*\000\000o<\224|\000\000\000\000\207\233?\0225', '\000' <repeats>, '<\eD\0235\000\000\000/\000\000\0005\000\000\00003@\0235', '\000' <repeats>, '\220T\017??\177\000\000H<@\0235\000\000\000??@\0235', '\000' <repeats>, '\210\021?\0225\000\000\000H?U\0235\000\000\000\b?'..., binary=.FALSE., stream=.FALSE., found=.FALSE., ierr=0) at ./iotk_scan.F90:898
#6 0x0000000000822ff5 in iotk_scan_end_x (unit=4, name='PP_INFO\000', ' ' <repeats>, dummy=Cannot access memory at address 0x0
) at ./iotk_scan.F90:331
#7 0x000000000081cb6c in iotk_close_read_x (unit=4, dummy=Cannot access memory at address 0x0
) at ./iotk_files.F90:832
#8 0x000000000065b385 in read_upf_v2_module::read_upf_v2 (u=4, upf=Asked for position 0 of stack, stack only has 0 elements on it.
) at ./read_upf_v2.F90:56
#9 0x00000000006930f7 in upf_module::read_upf (upf=Asked for position 0 of stack, stack only has 0 elements on it.
) at ./upf.F90:64
#10 0x000000000064cf0a in read_pseudo_mod::readpp (input_dft='none', ' ' <repeats>, printout=Cannot access memory at address 0x0
) at ./read_pseudo.F90:150
#11 0x0000000000435018 in iosys () at ./input.F90:1267
#12 0x000000000040331a in pwscf () at ./pwscf.F90:53
#13 0x00000000004031f4 in main ()
#14 0x000000351341ecdd in __libc_start_main () from /lib64/libc.so.6
#15 0x00000000004030e9 in _start ()


In detail, frame #3:
Quote:
(gdb) frame 3
#3 0x0000000000825b26 in iotk_getline_x (unit=4,
line=' ' <repeats>, '\000' <repeats>, '?\004\000\000\000\000\000\000????????', '\000' <repeats>, '?\017\000\000\000\000\000\000<PP_INFO>', ' ' <repeats>, '\000' <repeats>, '?=\016??\177\000\000?\a?\001', '\000' <repeats>, ' E\016??\177\000\000\004J\020??\177\000\000?\"n\002\000\000\000\0000?\016??\177\000\000???\001', '\000' <repeats>, ' E\016??\177\000\000TA\016??\177\000\000?\"n\002', '\000' <repeats>, 'tE\016??\177\000\0000C\016??\177\000\000i??'..., length=2305, ierr=0) at ./iotk_scan.F90:1016
1016 read(unit,"(a)",iostat=iostat,eor=1,size=buflen,advance="no") buffer
(gdb) list
1011 logical :: eor
1012 pos = 0
1013 ierrl=0
1014 do
1015 eor = .true.
1016 read(unit,"(a)",iostat=iostat,eor=1,size=buflen,advance="no") buffer
1017 3 continue
1018 eor = .false.
1019 if(iostat/=0) then
1020 call iotk_error_issue(ierrl,"iotk_getline","iotk_scan.f90",964)


My first attempt to understand the problem was to check the unit number:
Quote:
(gdb) print unit
$2 = 4


and the PGI Fortran Reference Guide, on page 336, states:
Quote:
Logical units 5 (stdin) and 6 (stdout) are line buffered. Logical unit 0 (stderr) is unbuffered. Disk files are fully buffered.


so perhaps the 4 should be a 5? I am trying to figure out where the unit number "4" comes from.

Using other compilers (like Intel) or, as I said, PGI compiler versions below 12.3, this problem does not appear.
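
To isolate the failure from the rest of PWscf, the statement at line 1016 can be exercised on its own. The following is only a minimal sketch of the same non-advancing read pattern, not the iotk source: the program name, the scratch file name `getline_test.txt`, the 16-character buffer, and the loop structure are assumptions made for illustration. Only the form of the `read` statement and the unit number 4 are taken from the listing above.

```fortran
! Minimal sketch of a non-advancing read with size= and eor=.
! Hypothetical test program; file name and buffer length are assumptions.
program getline_test
  implicit none
  integer, parameter :: unit = 4
  character(len=16)  :: buffer
  character(len=256) :: line
  integer :: iostat, buflen, pos

  ! Write a small test file, then reopen it for reading.
  open(unit, file="getline_test.txt", status="replace", action="write")
  write(unit, "(a)") "<PP_INFO> a line longer than the buffer </PP_INFO>"
  close(unit)
  open(unit, file="getline_test.txt", status="old", action="read")

  line = " "
  pos = 0
  do
    ! advance="no" reads at most len(buffer) characters per call;
    ! size= returns how many were read, eor= jumps to label 1
    ! when the end of the record is reached.
    read(unit, "(a)", iostat=iostat, eor=1, size=buflen, advance="no") buffer
    ! Any other nonzero status (error or end of file) ends the loop.
    if (iostat /= 0) exit
    ! No end of record yet: keep this chunk and read the next one.
    line(pos+1:pos+buflen) = buffer(1:buflen)
    pos = pos + buflen
    cycle
1   continue
    ! End of record: keep the final partial chunk and stop.
    line(pos+1:pos+buflen) = buffer(1:buflen)
    pos = pos + buflen
    exit
  end do

  close(unit, status="delete")
  print "(a,i0,a)", "read ", pos, " characters: " // trim(line)
end program getline_test
```

If a small program like this also fails when built with 12.3 but runs with 12.2, that would point toward the formatted-read runtime rather than the application, although a clean run would not rule out a compiler problem that only appears in the full code.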

Do you have any suggestion to solve it?
mkcolg



Joined: 30 Jun 2004
Posts: 6208
Location: The Portland Group Inc.

Posted: Fri Apr 13, 2012 8:01 am

Hi fspiga,

Quote:
Using other compilers (like Intel) or, as I said, PGI compiler versions below 12.3, this problem does not appear.
It's very possible that it's a compiler error in 12.3, but it could also be a problem with the program. I can't really tell which from the GDB output.

Quote:
Do you have any suggestion to solve it?
First, I'd run the code using the PGI debugger, PGDBG. GDB doesn't understand Fortran so some of the information presented may be misleading.

Also, it looks like you're running PWSCF? Do you know the version? Which workload are you running? If I can recreate the problem here, I'll be able to determine if the problem is with the program or with the compiler.

- Mat
fspiga



Joined: 21 Feb 2012
Posts: 16

Posted: Fri Apr 13, 2012 8:15 am

Hi mkcolg,


Yes, it is PWscf (repository version). I am using a very short test input (AUSURF54).

In the code I also tried replacing "unit=4" with "unit=5" or "unit=1234", but the problem persists. I am going to use PGDBG, as you suggested, to track down the error in more detail.

Many thanks in advance for your reply.

Cheers,
F.
mkcolg



Joined: 30 Jun 2004
Posts: 6208
Location: The Portland Group Inc.

Posted: Fri Apr 13, 2012 12:51 pm

Hi F,

I downloaded espresso 4.3.2 from http://qe-forge.org/frs/?group_id=10 along with the corresponding examples. I then built PW and ran it against the examples. The only failures were due to missing data files.

However, I don't see an input file called "AUSURF54". Can you point me to this input?

Also, which repository version are you using? I see both a GPU enabled branch (espresso-PRACE) and the main trunk.

- Mat
fspiga



Joined: 21 Feb 2012
Posts: 16

Posted: Sat Apr 14, 2012 10:50 am

Hi Mat,

mkcolg wrote:
Hi F,

I downloaded espresso 4.3.2 from http://qe-forge.org/frs/?group_id=10 along with the corresponding examples. I then built PW and ran it against the examples. The only failures were due to missing data files.

However, I don't see an input file called "AUSURF54". Can you point me to this input?


The input file is here:
http://www.fislab.disco.unimib.it/~filippo/PW-AUSURF54.tar.gz

I am using the code in the repository. You can download it with
$ svn checkout svn://scm.qe-forge.org/scmrepos/svn/q-e/trunk/espresso

I tried PGDBG. I am not an expert with this debugger, but I think it reports the same error at the same level of detail as GDB. Here is the output:

Quote:
pgdbg> debug ../espresso/bin/pw.x -input ausurf_gamma.in
Loaded: /ichec/home/staff/fspiga/QE/PW-AUSURF54/../espresso/bin/pw.x
MAIN_
pgdbg> run
libnuma.so.1 loaded by ld-linux-x86-64.so.2.
libpthread.so.0 loaded by ld-linux-x86-64.so.2.
librt.so.1 loaded by ld-linux-x86-64.so.2.
libm.so.6 loaded by ld-linux-x86-64.so.2.
libc.so.6 loaded by ld-linux-x86-64.so.2.

Program PWSCF v.4.99 starts on 14Apr2012 at 17:26:39

This program is part of the open-source Quantum ESPRESSO suite
for quantum simulation of materials; please cite
"P. Giannozzi et al., J. Phys.:Condens. Matter 21 395502 (2009);
URL http://www.quantum-espresso.org",
in publications or presentations arising from this work. More details at
http://www.quantum-espresso.org/quote.php

Serial multi-threaded version, running on 8 processor cores

Current dimensions of program PWSCF are:
Max number of different atomic species (ntypx) = 10
Max number of k-points (npk) = 40000
Max angular momentum in pseudopotentials (lmaxx) = 3
Reading input from ausurf_gamma.in
Warning: card &IONS ignored
Warning: card ION_DYNAMICS = 'NONE' ignored
Warning: card / ignored
Warning: card &CELL ignored
Warning: card CELL_DYNAMICS = 'NONE' ignored
Warning: card / ignored
Message from routine iosys:
minimal I/O required, wf_collect reset to FALSE
Signalled SIGSEGV at 0x351347A7CD, function __GI___libc_realloc, file interp.c
0x351347A7CD: 48 8B 47 F8 movq -8(%rdi),%rax

pgdbg> stacktrace

STACK TRACE:
#12 pwscf line: "pwscf.F90"@44 address: 0x403483
#11 iosys line: "input.F90"@1255 address: 0x42E72C
#10 readpp line: "read_pseudo.F90"@148 address: 0x5E9143
input_dft = 0x1E3CAF0, ERROR: Cannot read value at address 0x0.
printout = <unavailable>
#9 read_upf line: "upf.F90"@62 address: 0x62969E
upf = 0x356D2B0, grid = 0x356CFF0, ierr = 0, unit = 4, filename = 0x0
#8 read_upf_v2 line: "read_upf_v2.F90"@54 address: 0x5F62E1
u = 4, upf = 0x356D2B0, grid = 0x356CFF0, ierr = 0
#7 iotk_close_read_x line: "iotk_files.F90"@832 address: 0x798A50
unit = 4, dummy = 0x0, ierr = 0
#6 iotk_scan_end_x line: "iotk_scan.F90"@331 address: 0x79ECBB
unit = 4, name = 0x7FFFFE060FA0, dummy = 0x0, ierr = 0
#5 iotk_scan_x line: "iotk_scan.F90"@897 address: 0x7A1455
unit = 4, direction = 1, control = 2, name = 0x7FFFFE050F00, attr = 0x7FFFFE051000, binary = .FALSE., stream = .FALSE., found = .FALSE., ierr = 0
#4 iotk_scan_tag_x line: "iotk_scan.F90"@669 address: 0x7A02EE
unit = 4, direction = 1, control = 538976288, tag = 0x7FFFFE040E60, binary = .FALSE., stream = .FALSE., ierr = 0
#3 iotk_getline_x line: "iotk_scan.F90"@1014 address: 0x7A1B9B
unit = 4, line = 0x7FFFFE03FC80, length = 0, ierr = 0
#2 pgf90io_fmtr_init2003 address: 0x1CFEE39
*** Stack frames number 2 and higher may be incorrect ***
#1 fr_init file: fmtread.c address: 0x1D02C3A
***FP, local variables, and args, for frame numbers 1 and higher may be incorrect ***
=> #0 __GI___libc_realloc file: interp.c address: 0x351347A7CD


(I ran the debugger in text mode because the system seems to be missing a library/program called "xrefresh".)


mkcolg wrote:
Also, which repository version are you using? I see both a GPU enabled branch (espresso-PRACE) and the main trunk.


I am working on the GPU port of the code in that branch. The branch is 100% compatible with version 4.3.2 but not with the current trunk (they are not aligned and there are some differences; we are going to merge them as soon as a new version is finalized). The same problem appears with the code in the GPU branch: older PGI compilers work well, but 12.3 has the same issue...

Many thanks in advance for your support!

Cheers,
F.
All times are GMT - 7 Hours