PGI User Forum
 SearchSearch   MemberlistMemberlist     RegisterRegister   ProfileProfile    Log inLog in 

Free OpenACC Webinar

kernel not generated

 
Post new topic   Reply to topic    PGI User Forum Forum Index -> Accelerator Programming
View previous topic :: View next topic  
Author Message
AdamSimpson



Joined: 08 Feb 2011
Posts: 12

PostPosted: Sat Mar 05, 2011 1:22 pm    Post subject: kernel not generated Reply with quote

Hi, I am using 10.9 but unable to generate a kernel for the following code.

Code:
#pragma acc region copyout(grid[0:pixelsX*pixelsY-1])
{
    for(pixelY=0; pixelY<pixelsY; pixelY++) {
        for(pixelX=0; pixelX<pixelsX; pixelX++) {
             y = maxY - pixelY*spacing;
             x = minX + pixelX*spacing;
             iterations = 0;

             x0 = x;
             y0 = y;

           while(x*x + y*y <= 4 && iterations < maxIterations) {
                xtmp = x*x - y*y + x0;
                y = 2*x*y + y0;
                x = xtmp;

                iterations++;
            }

            //Write number of iterations required to array
            grid[pixelY*pixelsX + pixelX] = iterations;
        }
    }
}


When I try and compile I get the following:

Code:
   
    39, Accelerator region ignored
     50, Accelerator restriction: induction variable live-out from loop: iterations
         Accelerator restriction: loop has multiple exits
         Accelerator restriction: loop contains induction variable live after the loop
     55, Accelerator restriction: induction variable live-out from loop: iterations


I want everything inside of the 2 outermost for loops to be the kernel but the compiler doesn't seem to like that. I have tried using #pragma acc for kernel but that hasn't helped either. Can anyone shed some light on how to get this kernel generated? Thanks!

the entire code is located below:


Code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char *argv[])
{
    int pixelX, pixelY;

    //Dimensions of grid
    const double minX = -1.8;
    const double maxX = 0.7;
    const double minY = -1.2;
    const double maxY = 1.2;

    //spacing between grid points
    const double spacing = 0.005;

    //Number of grid points in x and y direction
    const int pixelsX = (int)ceil(fabs(maxX-minX)/spacing);
    const int pixelsY = (int)ceil(fabs(maxY-minY)/spacing);

    printf("%d x %d grid size\n", pixelsX, pixelsY);

    size_t bytes = pixelsX*pixelsY*sizeof(int);

    //Allocate grid to hold number of itereations for escape
    int *grid = (int*)malloc(bytes);

    //Maximum iterations to test for escape
    const int maxIterations = 255;

    int iterations = 0;
    double x = 0;
    double y = 0;
    double xtmp = 0;
    double x0;
    double y0;

#pragma acc region copyout(grid[0:pixelsX*pixelsY-1])
{
    for(pixelY=0; pixelY<pixelsY; pixelY++) {
        for(pixelX=0; pixelX<pixelsX; pixelX++) {

             y = maxY - pixelY*spacing;
             x = minX + pixelX*spacing;
             iterations = 0;

             x0 = x;
             y0 = y;
         
           while(x*x + y*y <= 4 && iterations < maxIterations) {
                xtmp = x*x - y*y + x0;
                y = 2*x*y + y0;
                x = xtmp;

                iterations++;
            }

            //Write number of iterations required to array
            grid[pixelY*pixelsX + pixelX] = iterations;
        }
    }
}

return 0;
}
Back to top
View user's profile
mkcolg



Joined: 30 Jun 2004
Posts: 6208
Location: The Portland Group Inc.

PostPosted: Mon Mar 07, 2011 11:54 am    Post subject: Reply with quote

Hi AdamSimpson,

It's the real comparison in the while loop that's causing the problem. I've modified the code below to get it to accelerate. Note that the 'independent' clause are necessary for grid's computed index.

Hope this helps,
Mat

Code:
#pragma acc region copyout(grid[0:pixelsX*pixelsY-1])
{
#pragma acc for independent
    for(pixelY=0; pixelY<pixelsY; pixelY++) {
#pragma acc for independent, private(iterations)
        for(pixelX=0; pixelX<pixelsX; pixelX++) {

             y = maxY - pixelY*spacing;
             x = minX + pixelX*spacing;
             iterations = 0;

             x0 = x;
             y0 = y;
           //while(x*x + y*y <= 4 && iterations < maxIterations) {
           while(iterations < maxIterations) {
              if (x*x + y*y <= 4) {
                break;
              } else {
                xtmp = x*x - y*y + x0;
                y = 2*x*y + y0;
                x = xtmp;

                iterations++;
              }
             }
            //Write number of iterations required to array
            grid[pixelY*pixelsX + pixelX] = iterations;
        }
    }
}
Back to top
View user's profile
AdamSimpson



Joined: 08 Feb 2011
Posts: 12

PostPosted: Mon Mar 07, 2011 6:19 pm    Post subject: Reply with quote

Thanks Mat,
That does compile but does not produce the correct result. The below is the smallest example case I could make and when compiled with -ta=host I get 9 for all of the values in grid as expected, when compiled for -ta=nvidia I get 0 for all the values. any ideas? Is it a problem with version 10.9? Thanks

Code:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char *argv[])
{
    //Number of grid points in x and y direction
    const int pixelsX = 10;
    const int pixelsY = 10;

    size_t bytes = pixelsX*pixelsY*sizeof(int);

    //Allocate grid to hold number of itereations for escape
    int *grid = (int*)malloc(bytes);

    int iterations;
    int pixelY, pixelX;

#pragma acc region copyout(grid[0:pixelsX*pixelsY-1])
{
    #pragma acc for independent
    for(pixelY=0; pixelY<pixelsY; pixelY++) {
        #pragma acc for independent, private(iterations)
        for(pixelX=0; pixelX<pixelsX; pixelX++) {

             iterations = 0;
             while(iterations < 100)
             {
                 if(pixelX > 8) {
                     break;
                 }
                 else {
                     iterations++;
                 }
             }

            //Write number of iterations required to array
            grid[pixelY*pixelsX + pixelX] = iterations;
        }
    }
}
    for(pixelY=0; pixelY<pixelsY; pixelY++){
        for(pixelX=0; pixelX<pixelsX; pixelX++) {
            printf("%d\n",grid[pixelY*pixelsX + pixelX]);
       }
    }

    return 0;
}
Back to top
View user's profile
mkcolg



Joined: 30 Jun 2004
Posts: 6208
Location: The Portland Group Inc.

PostPosted: Tue Mar 08, 2011 11:57 am    Post subject: Reply with quote

Quote:
Is it a problem with version 10.9?
Looks like it. The good news that it was resolved in PGI 2011. Most likely related to a problem that I found internally in a similar Fortran code.

- Mat

Code:
AdamSimpson% pgcc -ta=nvidia,keepgpu -fast testACC2.c -Minfo -V11.2 ; a.out
main:
     19, Generating copyout(grid[:pixelsY*pixelsX-1])
         Generating compute capability 1.0 binary
         Generating compute capability 1.3 binary
         Generating compute capability 2.0 binary
     22, Loop is parallelizable
     24, Loop is parallelizable
         Accelerator kernel generated
         22, #pragma acc for parallel, vector(10) /* blockIdx.y threadIdx.y */
         24, #pragma acc for parallel, vector(10) /* blockIdx.x threadIdx.x */
             CC 1.0 : 7 registers; 48 shared, 12 constant, 0 local memory bytes; 100% occupancy
             CC 1.3 : 7 registers; 48 shared, 12 constant, 0 local memory bytes; 100% occupancy
             CC 2.0 : 11 registers; 8 shared, 56 constant, 0 local memory bytes; 66% occupancy
     43, Loop not vectorized/parallelized: contains call
0 0 100
0 1 100
0 2 100
0 3 100
0 4 100
0 5 100
0 6 100
0 7 100
0 8 100
0 9 0
1 0 100
1 1 100
1 2 100
1 3 100
1 4 100
1 5 100
1 6 100
1 7 100
1 8 100
1 9 0
2 0 100
2 1 100
2 2 100
2 3 100
2 4 100
2 5 100
2 6 100
2 7 100
2 8 100
2 9 0
3 0 100
3 1 100
3 2 100
3 3 100
3 4 100
3 5 100
3 6 100
3 7 100
3 8 100
3 9 0
4 0 100
4 1 100
4 2 100
4 3 100
4 4 100
4 5 100
4 6 100
4 7 100
4 8 100
4 9 0
5 0 100
5 1 100
5 2 100
5 3 100
5 4 100
5 5 100
5 6 100
5 7 100
5 8 100
5 9 0
6 0 100
6 1 100
6 2 100
6 3 100
6 4 100
6 5 100
6 6 100
6 7 100
6 8 100
6 9 0
7 0 100
7 1 100
7 2 100
7 3 100
7 4 100
7 5 100
7 6 100
7 7 100
7 8 100
7 9 0
8 0 100
8 1 100
8 2 100
8 3 100
8 4 100
8 5 100
8 6 100
8 7 100
8 8 100
8 9 0
9 0 100
9 1 100
9 2 100
9 3 100
9 4 100
9 5 100
9 6 100
9 7 100
9 8 100
9 9 0
Back to top
View user's profile
Display posts from previous:   
Post new topic   Reply to topic    PGI User Forum Forum Index -> Accelerator Programming All times are GMT - 7 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © phpBB Group