PGI User Forum

PGI Accelerated and WRF 3.3 - weird lockup

 
lgrzegorek



Joined: 05 Jun 2011
Posts: 2

Posted: Mon Jun 20, 2011 10:43 am    Post subject: PGI Accelerated and WRF 3.3 - weird lockup

Hi all,

I have a problem running the latest WRF v3.3 code compiled with the trial versions of the PGI 11.5 and 11.6 compilers for an on-board Tesla C1060 card. I'm trying to run the tutorial Jan2000 case (and other real-data cases) with no luck. Everything was done step by step following the WRF guide (netCDF was compiled with PGI too). When I run the host-optimized wrf.exe binary (with an empty ACC_DEVICE environment variable), it finishes with "SUCCESS COMPLETE WRF". The problem arises when running the accelerated one. It hangs just after this output message:

.......
Timing for processing lateral boundary for domain 1: 0.39840 elapsed seconds.
WRF TILE 1 IS 1 IE 20 JS 1 JE 20
WRF NUMBER OF TILES = 1

Here I can see a PGI message (triggered by setting ACC_NOTIFY to 1) saying that an accelerator kernel has been entered (wsm32D function, line 211). Then the wrf.exe process consumes 100% CPU and GPU time (as reported by the nvidia-smi utility) and nothing more happens. Running wrf.exe under strace shows a sequence of ioctl() calls. After a bit of debugging in the NVIDIA kernel module, it turned out that they were related to rm_ioctl().
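For anyone who wants to check for the same hang without watching the terminal, here is a minimal sketch in Python. It is not part of WRF or the PGI tools: the helper name `run_with_notify`, its parameters, and the timeout value are all assumptions made for illustration. It only relies on the two environment variables mentioned above (ACC_NOTIFY set to 1, and an empty or unset ACC_DEVICE for the host run).

```python
import os
import subprocess
import sys


def run_with_notify(cmd, timeout_s, clear_acc_device=False):
    """Run cmd with ACC_NOTIFY=1 and report whether it finished in time.

    Hypothetical helper for illustration only.
    Returns (status, output), where status is "ok", "failed" or "timeout".
    """
    env = dict(os.environ)
    env["ACC_NOTIFY"] = "1"  # ask the PGI runtime to report kernel launches
    if clear_acc_device:
        # Host run as described in the post: no ACC_DEVICE in the environment.
        env.pop("ACC_DEVICE", None)
    try:
        proc = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        # The child is killed on timeout; keep whatever output was captured.
        out = exc.stdout or ""
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return "timeout", out
    return ("ok" if proc.returncode == 0 else "failed"), proc.stdout
```

Assuming wrf.exe is in the current directory, something like `run_with_notify(["./wrf.exe"], timeout_s=600)` would return "timeout" for the lockup described here, and the captured output should end with the last ACC_NOTIFY message printed before the hang. The 600-second limit is an arbitrary choice and should be well above the normal runtime of the case.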

It's worth mentioning that the PGI Fortran and C examples work fine.

All of the above has been tested on two GNU/Linux distros, Debian 6.0.1a and Fedora 13 (both x86_64), with many NVIDIA kernel driver versions from 190.53 upward.

Have you ever encountered similar problems, or do you have any idea how to deal with this? If needed, I can post more info, command output, etc.; just drop me a note.

Best regards
mkcolg



Joined: 30 Jun 2004
Posts: 6211
Location: The Portland Group Inc.

Posted: Tue Jun 21, 2011 4:22 pm

Hi lgrzegorek,

You're encountering a known bug in the accelerator compiler, logged as TPR #17932. We apologize for this error and are currently working on a fix, which we expect to be available in the 11.7 release.

Best Regards,
Mat
lgrzegorek



Joined: 05 Jun 2011
Posts: 2

Posted: Wed Jun 22, 2011 12:07 am

Thank you Mat. Good to hear that.

Best regards
All times are GMT - 7 Hours