PGI User Forum

Wrong results: 12.5 vs 12.6
PaulPa



Joined: 02 Aug 2012
Posts: 35

Posted: Wed Aug 15, 2012 1:48 am

Hi guys,

Sorry for the late response.

Mat, would it be possible to send you the source code via mail?
As I said, it is exactly the same source code, and I can't find any errors in it.

Best,
Paul
mkcolg



Joined: 30 Jun 2004
Posts: 5871
Location: The Portland Group Inc.

Posted: Wed Aug 15, 2012 8:44 am

Hi Paul,

Quote:
Mat, would it be possible to send you the source code via mail?
Yes. Please send it to PGI Customer Service (trs@pgroup.com) and ask them to forward it to me. If it's a compiler bug, I'll triage it, submit a report, and hopefully find a workaround.

- Mat
PaulPa



Joined: 02 Aug 2012
Posts: 35

Posted: Sun Sep 30, 2012 10:50 am

Hi guys,

I just wanted to let you know that the exact same source code works again with
PGI compiler 12.9. (12.8 did not work either.)

I noticed that 12.9 schedules the work as follows:

Code:

    121, Loop is parallelizable
         Accelerator kernel generated
        121, #pragma acc loop gang /* blockIdx.x */
             CC 2.0 : 27 registers; 32 shared, 136 constant, 0 local memory bytes
        131, #pragma acc loop vector(256) /* threadIdx.x */


while 12.8 does the following:
Code:

        121, #pragma acc loop gang /* blockIdx.x threadIdx.x */
             Cached references to size [(x)] block of 'bhat'
             CC 2.0 : 27 registers; 32 shared, 136 constant, 0 local memory bytes
        131, #pragma acc loop vector(32) /* threadIdx.y */


These two look very similar, but 12.8 also reports threadIdx.x for line 121. That seems strange, since the feedback for that line does not mention vector at all (unlike what I would expect from the 12.9 output).

What does the /* ... */ part stand for anyway?

@Mat: This is the same version I filed a bug report for earlier.

Thank you.

Best,
Paul
mkcolg



Joined: 30 Jun 2004
Posts: 5871
Location: The Portland Group Inc.

Posted: Mon Oct 01, 2012 9:03 am

Hi Paul,

Yes, TPR#18913 was listed as fixed in 12.9.

Quote:
What does the /* ... */ part stand for anyway?
It's informational: it shows how the OpenACC schedule corresponds to the target device schedule. For NVIDIA CUDA, a "gang" corresponds to a "block" and a "vector" to a "thread". The ".x", ".y", and ".z" are the dimensions.
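As a rough illustration of that correspondence, here is a minimal sketch of a loop nest with explicit gang and vector clauses. The function name scale_rows and its parameters are made up for this example and are not from Paul's code; the comments only describe the kind of mapping the feedback would be expected to report, and the actual compiler output may differ between releases.

```c
#include <stddef.h>

/* Hypothetical example: multiply every element of an n x m matrix
   (stored row by row in a) by factor. Names are illustrative only. */
void scale_rows(float *a, int n, int m, float factor)
{
    /* Without an OpenACC compiler the pragmas are ignored and the
       loops simply run serially on the host. */
    #pragma acc kernels copy(a[0:n*m])
    {
        /* gang: iterations of i are spread across CUDA thread blocks,
           which the feedback annotates as blockIdx.x */
        #pragma acc loop gang
        for (int i = 0; i < n; ++i) {
            /* vector(256): iterations of j are spread across up to 256
               threads of a block, annotated as threadIdx.x */
            #pragma acc loop vector(256)
            for (int j = 0; j < m; ++j) {
                a[(size_t)i * m + j] *= factor;
            }
        }
    }
}
```

Building something like this with accelerator feedback enabled (for example -Minfo=accel with the PGI compilers) should print one schedule line per loop, each with a /* ... */ comment naming the CUDA index it was mapped to.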

- Mat
PaulPa



Joined: 02 Aug 2012
Posts: 35

Posted: Tue Oct 02, 2012 2:13 am

Hi Mat,

So why does it say:
Code:
#pragma acc loop gang /* blockIdx.x threadIdx.x */


I figured that this should be blockIdx only.

Best,
Paul
All times are GMT - 7 Hours