PGI User Forum

strange error

 
zbeekman



Joined: 04 Apr 2012
Posts: 5

Posted: Wed Apr 04, 2012 12:01 pm    Post subject: strange error

Is it possible that the compiler,

Code:
$ pgf90 -V

pgf90 10.9-0 64-bit target on x86-64 Linux -tp istanbul-64
Copyright 1989-2000, The Portland Group, Inc.  All Rights Reserved.
Copyright 2000-2010, STMicroelectronics, Inc.  All Rights Reserved.


is performing asynchronous IO or auto-parallelization without being asked to? I am seeing a non-deterministic insufficient memory error without any backtrace or line number info, which seems to go away when I add some print statements.

Code:
0: ALLOCATE: 18446744073709551615 bytes requested; not enough memory


I read an input file from stdin and then perform some memory allocation, and this is the step where I think I get the error. The system has plenty of memory to handle what I request in the input file, and I am definitely not requesting that much memory in my code. Sometimes the error appears and other times it doesn't. The array bounds are read from the input file, and I'm wondering if maybe somehow they are not being flushed to memory before the allocation is performed. I show the compile command below.
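For reference, 18446744073709551615 is 2^64 - 1, which is what a size of -1 looks like when printed as an unsigned 64-bit integer, so a bad size rather than a genuinely huge request seems likely. One way to narrow this kind of thing down is to validate the extents read from the input and to pass stat= to ALLOCATE so a failure can be reported by the program itself. The following is only a minimal sketch: the module name checked_alloc, the routine alloc_2d, and the max_extent limit are hypothetical and not taken from the code discussed here.

```fortran
module checked_alloc
  implicit none
  private
  public :: alloc_2d
contains
  ! Allocate a(nx, ny) only if both extents look plausible.
  ! ierr =  0 on success,
  ! ierr = -1 if an extent is outside 1..max_extent,
  ! otherwise ierr is the nonzero STAT value returned by ALLOCATE.
  subroutine alloc_2d(a, nx, ny, max_extent, ierr)
    real, allocatable, intent(out) :: a(:,:)
    integer, intent(in)  :: nx, ny, max_extent
    integer, intent(out) :: ierr

    if (nx < 1 .or. ny < 1 .or. nx > max_extent .or. ny > max_extent) then
      ierr = -1          ! reject the bounds before they reach ALLOCATE
      return
    end if
    allocate(a(nx, ny), stat=ierr)
  end subroutine alloc_2d
end module checked_alloc
```

Calling alloc_2d right after the bounds are read and printing nx, ny, and ierr when ierr is nonzero would show whether the values read from the input are already wrong at that point.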

Also, is there a list of supported F2003 features for this compiler release? I think the only ones I am using are ALLOCATABLE dummy arguments, STOP statements with INTEGER PARAMETER constants, USE ISO_FORTRAN_ENV, ONLY: output_unit and POINTER, vector, derived TYPE components. I want to make sure that this is not the source of the problem.

The code was compiled thusly (I also tried to throw every warning or runtime check I could think of at it to figure out what was going on):

Code:
pgf90  -Minfo=all -Mrecursive -Ktrap=align,divz -C -Mchkfpstk -Mchkptr -Mchkstk -Mstandard -Mallocatable=95 -Minform=inform -g -traceback -I include  -c modinput.f90
nelems:
    258, any reduction inlined
    259, any reduction inlined
pgf90  -Minfo=all -Mrecursive -Ktrap=align,divz -C -Mchkfpstk -Mchkptr -Mchkstk -Mstandard -Mallocatable=95 -Minform=inform -g -traceback -I include  -c extractplanes.f90
PGF90-I-0035-Predefined intrinsic time loses intrinsic property (extractplanes.f90: 41)
pgf90  -Minfo=all -Mrecursive -Ktrap=align,divz -C -Mchkfpstk -Mchkptr -Mchkstk -Mstandard -Mallocatable=95 -Minform=inform -g -traceback -I include   -o extractplanes extractplanes.o modinput.o


My code for reading the input file is somewhat complicated, and I don't want to post it, and I have not yet been able to create a smaller example to reproduce the issue. Despite this, any insight anyone has would be much appreciated.
mkcolg



Joined: 30 Jun 2004
Posts: 5815
Location: The Portland Group Inc.

Posted: Wed Apr 04, 2012 2:04 pm

Hi zbeekman,

Quote:
0: ALLOCATE: 18446744073709551615 bytes requested; not enough memory
In looking through our problem reports I do see one similar open report (TPR#17711) where the compiler generates a bad size value for the second dimension of an automatic array when the size is calculated using the "SIZE" intrinsic on a dummy argument. In that case, though, the error only occurs with optimisation (-O2). When compiled with -g, the program still fails, but with a segv rather than the "not enough memory" error. So it's unclear if your issue is related.
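To illustrate the kind of construct that report describes, here is a minimal sketch of an automatic array whose extents come from SIZE applied to a dummy argument. It is not the reproducer for TPR#17711, which is not shown in this thread; the module name auto_array_demo and the function sum_via_copy are hypothetical.

```fortran
module auto_array_demo
  implicit none
contains
  ! Sums a copy of `a`. The copy is held in an automatic array whose
  ! extents are taken from the dummy argument with the SIZE intrinsic.
  function sum_via_copy(a) result(total)
    real, intent(in) :: a(:,:)
    real :: total
    real :: tmp(size(a, 1), size(a, 2))   ! automatic array; second extent is SIZE(a, 2)

    tmp = a
    total = sum(tmp)
  end function sum_via_copy
end module auto_array_demo
```

If your code contains declarations of this shape, that would be one reason to suspect the open report.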

If you can, please send the code to PGI Customer Service (trs@pgroup.com) and ask them to forward it to me. I should be able to tell if it's the same as TPR#17711, a new compiler error, or a problem in your code.

Quote:

Also, is there a list of supported F2003 features for this compiler release?
They are listed in Chapter 2 of the release notes. They are listed by the release in which they were added, so you'll need to look in a couple of spots. 10.9 basically supported ~75% of the standard, with 11.0 filling out most of the rest of the features. The exceptions to this were Sourced allocation (11.6), Final Procedures (11.7), and recursive I/O (12.1).

- Mat
zbeekman



Joined: 04 Apr 2012
Posts: 5

Posted: Thu Apr 05, 2012 8:55 am

Yeah, the strange thing is that it's not deterministic whether it hits the error. Adding some WRITE(output_unit,*) statements seems to always fix it, but WRITE(error_unit,*) does not. Also, under TotalView it stops running at a strange place.

I'll send along the code, but it's still under development. The KVP input library passes all the unit tests I have written (under ifort though). The other code is fairly straightforward. I need to clean up the name space a bit, and possibly split it into multiple libraries, say a dictionary object module, a parsing module and a kvp put/get module... but anyway I'll send it along.
mkcolg



Joined: 30 Jun 2004
Posts: 5815
Location: The Portland Group Inc.

Posted: Thu Apr 05, 2012 2:23 pm

Quote:
Yeah, the strange thing is that it's not deterministic whether it hits the error. Adding some WRITE(output_unit,*) statements seems to always fix it, but WRITE(error_unit,*) does not. Also, under TotalView it stops running at a strange place.
Sounds like a UMR (uninitialized memory read). Once I get the code, I'll try running it under Valgrind.

Quote:
I'll send along the code, but it's still under development.
No worries. As long as I can reproduce the error.

- Mat
zbeekman



Joined: 04 Apr 2012
Posts: 5

Posted: Fri Apr 06, 2012 9:15 am

I localized the error (using valgrind) and it turns out that
Code:
INQUIRE(input_unit, recl=rlen)
will give different results between compilers/machines. Under ifort on our in-house cluster it returns the length (in characters) of the input stream on stdin. I'm not sure what the standard specifies about this (probably not much), and clearly using it in this manner is a bad idea. This resulted in indexing into a character variable like this:
Code:
foo(:-1)
which then caused all sorts of crazy things to happen.
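For anyone who lands here with the same problem, one way to get the length of an input line without relying on INQUIRE(recl=) is a non-advancing READ with size=, which reports how many characters were actually transferred. The following is only a minimal sketch, assuming a compiler with working deferred-length character support; the module name line_input, the routine read_line, and the 256-character chunk size are hypothetical choices, not what my code does.

```fortran
module line_input
  implicit none
  private
  public :: read_line
contains
  ! Read one complete record from `unit` into `line`, whatever its length.
  ! ios is 0 on success; at end of file or on error it holds the nonzero
  ! IOSTAT value and `line` holds whatever was read so far.
  subroutine read_line(unit, line, ios)
    integer, intent(in) :: unit
    character(len=:), allocatable, intent(out) :: line
    integer, intent(out) :: ios
    character(len=256) :: chunk
    integer :: nread

    line = ''
    do
      ! Non-advancing read: nread receives the number of characters transferred.
      read(unit, '(A)', advance='no', size=nread, iostat=ios) chunk
      if (ios /= 0 .and. .not. is_iostat_eor(ios)) return   ! end of file or error
      line = line // chunk(:nread)      ! nread is never negative here
      if (is_iostat_eor(ios)) then      ! end of record: the line is complete
        ios = 0
        return
      end if
    end do
  end subroutine read_line
end module line_input
```

After call read_line(input_unit, line, ios), len(line) is the number of characters on that line, so any substring bound derived from it cannot go below zero.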

At one point I was optimistic and thought that it would be nice to use allocatable scalar character components in my dictionary data structure. After discovering bugs in F03/08 features in the non-PGI compilers I was using, and looking at the compilers on the off-site HPC resources we use, I decided a more portable solution was to stick with static-length character variables for the time being. However, this segment of code was not updated to reflect the static character variable allocation and was probably a hack/workaround for a compiler bug I was seeing (again, in a non-PGI compiler).

Thanks for offering to take a look at this and sorry I cried wolf.
All times are GMT - 7 Hours