PGI User Forum

OpenACC nested c structs

 
AdamSimpson



Joined: 08 Feb 2011
Posts: 9

Posted: Mon Mar 18, 2013 5:17 am    Post subject: OpenACC nested c structs

Hi,
I have run into a problem with OpenACC in pgi/12.10.0: it does not seem to allow C structs that contain more than one struct member. For example, the following code will not parallelize and gives an invalid loop error in the -Minfo output. However, if I have only one double3 struct inside the st struct, it seems to parallelize. Is this something that has been fixed in a newer version?

Thanks,
Adam

Code:

typedef struct ST st;
typedef struct TUPLE double3;

#include <stdio.h>
#include <stdlib.h>

struct TUPLE {
    double x;
    double y;
    double z;
};

struct ST {
    double3 pos;
    double3 vel;
};

int main(int argc, char *argv[])
{
    st *mystructs = (st*) malloc(sizeof(st)*1000);

    int i;
    st tmpST;
    #pragma acc kernels copy(mystructs[0:1000])
    for(i=0; i<1000; i++){
        tmpST = mystructs[i];
    }

    return 0;
}
mkcolg



Joined: 30 Jun 2004
Posts: 4996
Location: The Portland Group Inc.

Posted: Mon Mar 18, 2013 9:59 am

Hi Adam,

The problem is with the implicit deep copy between the two structs (the whole-struct assignment tmpST = mystructs[i]). If you change this to an explicit member-by-member copy, then you should be fine.

Code:
% cat struct.c
typedef struct ST st;
typedef struct TUPLE double3;

#include <stdio.h>
#include <stdlib.h>

struct TUPLE {
    double x;
    double y;
    double z;
};

struct ST {
    double3 pos;
    double3 vel;
};

int main(int argc, char *argv[])
{
    st *mystructs = (st*) malloc(sizeof(st)*1000);

    int i;
    st tmpST;
    #pragma acc kernels copy(mystructs[0:1000])
    for(i=0; i<1000; i++){
        tmpST.pos.x = mystructs[i].pos.x;
        tmpST.pos.y = mystructs[i].pos.y;
        tmpST.pos.z = mystructs[i].pos.z;
        tmpST.vel.x = mystructs[i].vel.x;
        tmpST.vel.y = mystructs[i].vel.y;
        tmpST.vel.z = mystructs[i].vel.z;
    }

    return 0;
}
% pgcc -acc -Minfo=accel struct.c
main:
     24, Generating present_or_copy(mystructs[0:1000])
         Generating NVIDIA code
         Generating compute capability 1.0 binary
         Generating compute capability 2.0 binary
         Generating compute capability 3.0 binary
     25, Loop is parallelizable
         Accelerator kernel generated
         25, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */


Hope this helps,
Mat
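
Editor's sketch: the same member-by-member idea can be written so that the copied values land in a second array, which makes the result easy to check on the host. This is only an illustration under stated assumptions, not code from the thread: the helper name copy_structs, the dst array, and the use of restrict are hypothetical additions, and it has not been checked against pgi/12.10.0. The copyin and copyout clauses are standard OpenACC data clauses.

```c
/* Same layout as the structs in the thread, written with anonymous typedefs. */
typedef struct { double x, y, z; } double3;
typedef struct { double3 pos; double3 vel; } st;

/* Hypothetical helper (not from the thread): copies n structs from src to dst
   one scalar member at a time, so no whole-struct assignment appears inside
   the accelerator region. restrict tells the compiler the arrays do not alias. */
void copy_structs(const st *restrict src, st *restrict dst, int n)
{
    int i;
    /* A compiler without OpenACC support ignores this pragma and runs the
       loop on the host, so the function behaves the same either way. */
    #pragma acc kernels copyin(src[0:n]) copyout(dst[0:n])
    for (i = 0; i < n; i++) {
        dst[i].pos.x = src[i].pos.x;
        dst[i].pos.y = src[i].pos.y;
        dst[i].pos.z = src[i].pos.z;
        dst[i].vel.x = src[i].vel.x;
        dst[i].vel.y = src[i].vel.y;
        dst[i].vel.z = src[i].vel.z;
    }
}
```

To try it, fill an array of st on the host, call copy_structs(src, dst, n), and compare the members of dst against src. Whether a given compiler release parallelizes the loop can be checked in the -Minfo=accel output, as in the post above.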
AdamSimpson



Joined: 08 Feb 2011
Posts: 9

Posted: Mon Mar 18, 2013 12:07 pm

Thanks Mat, that works.
AdamSimpson



Joined: 08 Feb 2011
Posts: 9

Posted: Tue Mar 19, 2013 5:49 am

Mat,
Just so I'm clear, is this considered to be a bug that may be fixed in the future?
mkcolg



Joined: 30 Jun 2004
Posts: 4996
Location: The Portland Group Inc.

Posted: Wed Mar 20, 2013 9:29 am

No, it's more of a limitation. Though, I put in a feature request (TPR#19218) and we'll see what can be done. Your case is simpler than others I've seen, so it may be possible.

- Mat
All times are GMT - 7 Hours