PGI User Forum

Parallelize without manual inlining
rztim



Joined: 12 Jul 2010
Posts: 12

Posted: Wed Aug 11, 2010 9:16 am    Post subject: Parallelize without manual inlining

Hello

I wonder if there is a possibility of accelerating a function without inlining its nested function calls. I know function calls within a parallel compute region are not allowed, but I think they are allowed within a data region. For example (pseudocode):

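/* n and M are assumed to be defined elsewhere */
void foo(double *x, double *y){
    double *a, *b, *c;
    int i;
    a = (double*)malloc(n*sizeof(double));
    b = (double*)malloc(n*sizeof(double));
    c = (double*)malloc(n*sizeof(double));  /* c must be allocated before use */

    parFoo1(x, y, c);
    for (i = 0; i < M; i++){
        parFoo2(a, x);
        parFoo2(b, x);
        parFoo1(a, b, y);
    }
}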

Here the parFoo* functions contain the corresponding acc pragmas. The problem is that the data for the arrays a, b, and x is copied for every call in each iteration, although it only needs to stay on the device for the whole loop. So my idea was to define a data region like this in foo():

void foo(){
#pragma acc data region copyin(x), copyout(y), local(a,b)
    {
        ... the code ...
    }
}

Unfortunately, my approach did not work: the compiler still copies the arrays within the accelerated functions parFoo*(). I get feedback like this:

parFoo1:
60, Generating copyin(x[0:n-1])
Generating compute capability 1.3 kernel
62, Loop is parallelizable
Accelerator kernel generated

My question is whether there is a way to define a data region and avoid copying the needed arrays again within the nested function calls. I guess the parFoo*() functions need to know that the parameter is already a "device pointer". Thanks for your hints. Could I use the "local" clause to achieve this somehow?
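
To put a rough number on the cost described above, here is a back-of-the-envelope sketch in plain C. It is only a counting model, not a measurement: it assumes that, without an enclosing data region, every accelerated call transfers each of its array arguments exactly once, and it looks only at the loop in foo(). The function names transfers_per_call and transfers_hoisted are made up for this illustration.

```c
#include <assert.h>

/* Assumption (not from the compiler docs): each accelerated call moves
   every one of its array arguments across the bus once. */
enum { PARFOO1_ARGS = 3, PARFOO2_ARGS = 2 };

/* Transfers for M iterations of:
   parFoo2(a, x); parFoo2(b, x); parFoo1(a, b, y); */
long transfers_per_call(long M) {
    assert(M >= 0);
    return M * (2 * PARFOO2_ARGS + PARFOO1_ARGS);
}

/* Transfers if one data region wraps the loop: x is copied in once,
   y is copied out once, a and b stay on the device (local). */
long transfers_hoisted(void) {
    return 2;
}
```

Under these assumptions, M = 100 gives 700 transfers with per-call copies against 2 with a single data region around the loop, which is why keeping the arrays resident on the device matters here.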

Kind regards,

Tim
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Wed Aug 11, 2010 4:40 pm

Hi Tim,

If this were Fortran, I'd say to take a look at the 'reflected' directive. While it won't be available until the 11.0 release, 'reflected' allows you to do exactly this.

Unfortunately, you're using C, where there isn't a way to pass information about 'x' from one routine to another (specifically, whether it's been malloc'd on the GPU). So you'd need to manually inline parFoo1 and parFoo2.
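
For readers following along, manual inlining could look roughly like the sketch below. The bodies of parFoo1 and parFoo2 are not shown in this thread, so the loop bodies here are hypothetical stand-ins (a scaling and an element-wise sum), and foo_inlined with its n and M parameters is an illustrative name and signature, not Tim's code. The directives follow the spelling used earlier in the thread and have not been checked against a particular PGI release; a compiler without PGI Accelerator support should simply ignore the pragmas and run the loops serially.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the thread's kernels:
     parFoo2(out, in)   : out[j] = 2 * in[j]
     parFoo1(p, q, out) : out[j] = p[j] + q[j]   */
void foo_inlined(const double *x, double *y, int n, int M) {
    int i, j;
    double *a, *b;

    assert(n > 0);
    a = malloc(n * sizeof *a);
    b = malloc(n * sizeof *b);
    if (!a || !b) { free(a); free(b); return; }

    /* One data region around the whole loop: x goes to the device once,
       y comes back once, a and b are scratch arrays on the device. */
    #pragma acc data region copyin(x[0:n-1]), copyout(y[0:n-1]), local(a[0:n-1], b[0:n-1])
    {
        for (i = 0; i < M; i++) {
            /* inlined parFoo2(a, x) */
            #pragma acc region for
            for (j = 0; j < n; j++) a[j] = 2.0 * x[j];

            /* inlined parFoo2(b, x) */
            #pragma acc region for
            for (j = 0; j < n; j++) b[j] = 2.0 * x[j];

            /* inlined parFoo1(a, b, y) */
            #pragma acc region for
            for (j = 0; j < n; j++) y[j] = a[j] + b[j];
        }
    }

    free(a);
    free(b);
}
```

The point of the layout is that the compute regions are lexically inside the data region, so the compiler can see that the arrays are already on the device. The price is the duplicated loop for each former parFoo2 call.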

Adding 'reflected' to C is a long-term goal. However, given the limits of the language, it may be a while.

Sorry,
Mat
rztim



Joined: 12 Jul 2010
Posts: 12

Posted: Thu Aug 12, 2010 3:19 am

Hi Mat,

Thanks again for your help. From my point of view, the missing possibility of sharing data within a nested function call is a big limitation of the programming model. For bigger codes it can be a lot of work to inline the functions, and it makes the code less readable, especially if you try to implement many different numerical techniques or algorithms that all use the same kernel functions (e.g. a matrix-vector product).

Would it not be possible to use the 'inline' keyword to avoid the additional copy of the array? The compiler could try to inline the function and could recognize that the pointer passed as a parameter is already shared in the data region. The size of the array is specified in the data region pragma, so no additional information needs to be passed to the function.

Cheers,
Tim
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Thu Aug 12, 2010 3:35 pm

Quote:
From my point of view the missing possibility of sharing data within a nested function call is a big limitation of the programming model.
We understand and agree. All hope is not lost, since we can usually find solutions to difficult problems; this one is just particularly difficult given the confines of the language.

Quote:
Could it not be possible to use the 'inline' keyword for avoiding the additional copy of the array?
It's possible and one that we're investigating.

- Mat
rztim



Joined: 12 Jul 2010
Posts: 12

Posted: Mon Jul 11, 2011 9:20 am

mkcolg wrote:
It's possible and one that we're investigating.


OK. It's an old thread, but I am still trying this method. As far as I understand, the reflected clause is implemented for C as of 11.4. But for me it does not work for code like this:

Code:

void foo(double* restrict x, double* restrict y){
#pragma acc reflected(x,y)
#pragma acc region for
    for(...)
        ...do some work...

}

int main(){
    a=(double*)malloc(n*sizeof(double));
    b=(double*)malloc(n*sizeof(double));
    ... assign data...
#pragma acc data region copyin(a, b)
    {
        foo(a,b);
    }
}


Can I avoid the data being copied again in foo() with this method? Should it work with 11.4? I get an error like this:

Code:

pgcc -fastsse -DDEBUG -ta=nvidia,cc20 -Minfo -g -c solver.c
PGC-S-0035-Syntax error: Recovery attempted by replacing identifier reflected by keyword cache (solver.c: 10)
PGC-S-0036-Syntax error: Recovery attempted by inserting <nl> before acc (solver.c: 11)
PGC-S-0037-Syntax error: Recovery attempted by deleting identifier region (solver.c: 11)
PGC-S-0036-Syntax error: Recovery attempted by inserting <nl> before keyword for (solver.c: 13)
PGC-W-0155-Long value is passed to a nonprototyped function - argument #3 (solver.c: 162)
PGC/x86-64 Linux 11.4-0: compilation completed with severe errors


It sounds as if the reflected keyword is not known. Otherwise, why does the compiler try to replace it with cache?
All times are GMT - 7 Hours