PGI User Forum

Parallel construct reductions

 
nickaj



Joined: 06 Sep 2011
Posts: 21

Posted: Tue Aug 28, 2012 3:28 am    Post subject: Parallel construct reductions

Here's an interesting one.
First, the code. It's an excerpt from a bigger program, so the line numbers in the output below won't match the snippet; line 460 corresponds to the data copyin line.
I'm using PGCC v12.8.

Code:

#pragma acc data copyin(a[0:n])
{

#pragma acc parallel loop reduction(+:z)
    for (i=0;i<n;i++){
        z += a[i];
    }
}


And the compilation output:
Code:

    460, Generating copyin(a[0:n])
    464, Accelerator kernel generated
        464, CC 1.0 : 7 registers; 48 shared, 32 constant, 0 local memory bytes
             CC 2.0 : 12 registers; 0 shared, 72 constant, 0 local memory bytes
        465, #pragma acc loop gang, vector(256) /* blockIdx.x threadIdx.x */
    464, Generating copyin(a[0:n])
         Generating compute capability 1.0 binary
         Generating compute capability 2.0 binary


So, some questions:
1. Could the compiler output be changed so that it tells me it has really done the reduction? If I omit the reduction clause, the compiler realises I wanted a reduction, inserts one for me AND tells me it has done so. I presume the explicit clause is working, as I get the following from the runtime diagnostic:

Code:

    464: region entered 10 times
        time(us): total=4,798 init=3 region=4,795
                  kernels=1,525
        w/o init: total=4,795 max=1,253 min=391 avg=479
        464: kernel launched 10 times
            grid: [4096]  block: [256]
            time(us): total=1,395 max=143 min=138 avg=139
        465: kernel launched 10 times
            grid: [1]  block: [256]
            time(us): total=130 max=13 min=13 avg=13


I presume this means that the kernel launched at 465 is really a reduction function/kernel inserted by the compiler.

2. Can I disable the automatic reduction detection? Sometimes, when I want to show someone the reduction clause working, it'd be nice to be able to show it not working and how this might cause the wrong answer due to overwrites of z etc (or z being automatically privatized).

3. Looking at the runtime output, I can see only one copyin is done:
Code:

    460: region entered 10 times
        time(us): total=13,067 init=3 region=13,064
                  data=6,335
        w/o init: total=13,064 max=3,933 min=1,009 avg=1,306


But this is confusing as the compiler says it does a copyin at 464 as well as at 460. Is this just a compiler output bug?

Cheers,
-Nick.
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Tue Aug 28, 2012 8:38 am    Post subject:

Hi Nick,

Quote:
I presume this means that the kernel launched at 465 is really a reduction function/kernel inserted by the compiler.
That seems correct. Though, I think you're right in that an informational message should be emitted when a reduction is generated from a reduction clause, not just when the compiler auto-detects one. I added TPR#18894 to track this.
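For anyone reading along, the two launches in the profile (a 4096-block kernel followed by a single-block kernel) look like the common two-stage reduction pattern. The following is only a serial C sketch of that pattern, not the code PGI actually generates; the names `BLOCK`, `partial_sums`, and `final_sum` are made up for illustration, and the `double` element type is an assumption since the original post does not show the declaration of `a`.

```c
#include <stddef.h>

/* Hypothetical block size, chosen to match "block: [256]" in the profile. */
#define BLOCK 256

/* Stage 1: each "block" b sums its own slice of up to BLOCK elements of a[]
 * and stores the result in partial[b]. On a GPU, the blocks run concurrently. */
void partial_sums(const double *a, size_t n, double *partial, size_t nblocks)
{
    for (size_t b = 0; b < nblocks; b++) {
        size_t start = b * BLOCK;
        size_t end = (start + BLOCK < n) ? start + BLOCK : n;
        double s = 0.0;
        for (size_t i = start; i < end; i++)
            s += a[i];
        partial[b] = s;
    }
}

/* Stage 2: a single "block" combines the per-block partial sums into z. */
double final_sum(const double *partial, size_t nblocks)
{
    double z = 0.0;
    for (size_t b = 0; b < nblocks; b++)
        z += partial[b];
    return z;
}
```

Under this reading, the kernel reported at 464 would correspond to stage 1 and the short kernel at 465 to stage 2.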

Quote:
Can I disable the automatic reduction detection?
No, because without the reduction code the loop is not parallel, and either no kernel or a sequential kernel would be generated.
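To make the dependence concrete for a demo, here is a small deterministic C model of what would go wrong if two iterations of `z += a[i]` ran at the same time without a reduction. It is a hand-written simulation with hypothetical function names (`racy_pair`, `serial_pair`), not something the compiler produces.

```c
/* Model of two iterations running "at once": both read the same old z,
 * so the second write overwrites the first and one contribution is lost. */
double racy_pair(double z, double x0, double x1)
{
    double seen0 = z;   /* iteration 0 reads z */
    double seen1 = z;   /* iteration 1 reads z before iteration 0 writes */
    z = seen0 + x0;     /* iteration 0 writes its update */
    z = seen1 + x1;     /* iteration 1 overwrites it, dropping x0 */
    return z;
}

/* The sequential order the original loop relies on: each update sees the last. */
double serial_pair(double z, double x0, double x1)
{
    z += x0;
    z += x1;
    return z;
}
```

Starting from `z = 0` with contributions 1 and 2, the serial version returns 3 while the racy model returns 2, which is the kind of overwrite the reduction clause is there to prevent.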

Quote:
But this is confusing as the compiler says it does a copyin at 464 as well as at 460. Is this just a compiler output bug?
The second one is actually a "present". The compiler will do a runtime check to make sure the "a" in the kernel is the same as the "a" in the data copy. By doing this, we can support pointer swapping. Though, yes, the output is confusing and I have an open issue (TPR#18858) requesting that our engineers make this more clear.
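If it helps, the intent can be written out explicitly with a `present` clause on the compute construct. This is a minimal sketch assuming `a` is an array of `double`; the wrapper function `sum_present` is a made-up name for illustration. Without OpenACC enabled, the pragmas are ignored and the loop simply runs on the host.

```c
/* Sum a[0..n-1]; the data region copies a[] in once, and the parallel loop
 * states that a[] is expected to be on the device already. */
double sum_present(const double *a, int n)
{
    double z = 0.0;
    int i;

    #pragma acc data copyin(a[0:n])
    {
        /* present(...) asks the runtime to check for a[], not to copy it again */
        #pragma acc parallel loop present(a[0:n]) reduction(+:z)
        for (i = 0; i < n; i++) {
            z += a[i];
        }
    }
    return z;
}
```

Written this way, there is only one copyin in the source, which matches the single data transfer seen in the runtime output.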

Best Regards,
Mat


Last edited by mkcolg on Tue Aug 28, 2012 8:58 am; edited 1 time in total
nickaj



Joined: 06 Sep 2011
Posts: 21

Posted: Tue Aug 28, 2012 8:57 am    Post subject:

Thanks Mat.

-Nick.
jtull



Joined: 30 Jun 2004
Posts: 395

Posted: Fri Jan 24, 2014 6:06 pm    Post subject: TPR 18894 - ACC User would like Minfo message when reduction

Nick,

We have added more information to -Minfo, including when reductions are performed in ACC codes. See our 14.1 release.

thanks,
dave
All times are GMT - 7 Hours