| Author |
Message |
xray
Joined: 21 Jan 2010 Posts: 71
|
Posted: Thu Feb 21, 2013 2:02 am Post subject: Same worksharing type in nested loops - parallel construct |
|
|
Hi,
I can specify a "gang vector" loop schedule for both loops of a loop nest when using the kernels construct:
| Code: | #pragma acc kernels
#pragma acc loop gang vector
for( int j = 0; j < n; j++ )
{
    #pragma acc loop gang vector
    for( int i = 0; i < m; i++ ) {...}
} |
Then the compiler uses a two-dimensional grid and two-dimensional blocks, which is exactly what I want:
| Code: | 67, #pragma acc loop gang, vector(2) /* blockIdx.y threadIdx.y */
70, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */ |
However, if I use the parallel construct instead of kernels, I get an error message and the inner loop schedule is ignored:
| Code: | PGC-S-0155-Nested loops cannot have the same worksharing type (file.c: 67)
[..]
67, #pragma acc loop gang, vector(256) /* blockIdx.x threadIdx.x */ |
Why do I get this error when it apparently worked nicely (and as expected) with the kernels construct?
How can I get two-dimensional grids and two-dimensional blocks with the parallel construct?
Bye, Sandra |
|
|
 |
xray
Joined: 21 Jan 2010 Posts: 71
|
Posted: Thu Feb 28, 2013 4:14 am Post subject: |
|
|
| Any news? |
|
|
 |
Michael Wolfe
Joined: 19 Jan 2010 Posts: 36
|
Posted: Thu Feb 28, 2013 3:57 pm Post subject: |
|
|
| Sandra: This is defined behavior for the parallel construct, which works more like the OpenMP loop construct (omp for or omp do): a given worksharing level (gang or vector) can be applied to only one loop in a nest. The kernels construct essentially allows tiling. For the parallel construct, we're adding an explicit tile clause for nested loops in the next OpenACC version, which should give you the behavior you want. |
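For anyone landing on this thread later, here is a minimal sketch of the two options under the parallel construct. It assumes the tile clause as it was published in OpenACC 2.0; the function names, the scaling loop body, and the tile sizes are hypothetical and only stand in for the elided loop body from the question. As far as I read the spec, the first tile size applies to the innermost loop, so tile(128, 2) is meant to mirror the vector(128)/vector(2) split from the kernels output above. The actual grid and block mapping is up to the compiler, so check the compiler feedback messages rather than taking this as guaranteed.

```c
#include <stddef.h>

/* Hypothetical example: scale every element of an n-by-m row-major matrix.
 * A compiler without OpenACC support ignores the pragmas and runs the
 * loops serially, so the result is the same either way. */

/* Option 1 (assumes OpenACC 2.0 or later): tile the tightly nested loops.
 * tile(128, 2): 128 applies to the inner i loop, 2 to the outer j loop. */
void scale_tiled(float *a, int n, int m, float s)
{
    #pragma acc parallel loop tile(128, 2) copy(a[0:n*m])
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            a[(size_t)j * m + i] *= s;
        }
    }
}

/* Option 2: give each loop a different worksharing level, which avoids
 * the "same worksharing type" error but does not tile the iteration space. */
void scale_gang_vector(float *a, int n, int m, float s)
{
    #pragma acc parallel loop gang copy(a[0:n*m])
    for (int j = 0; j < n; j++) {
        #pragma acc loop vector
        for (int i = 0; i < m; i++) {
            a[(size_t)j * m + i] *= s;
        }
    }
}
```

Both functions compute the same result; they differ only in how the iterations are handed out to gangs and vector lanes. Option 2 is accepted by the parallel construct because the outer and inner loops use different levels.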
|
|
 |
|