PGI User Forum
Avoid reallocating memory on the GPU.
JPMN



Joined: 23 Oct 2012
Posts: 8

Posted: Sun Nov 04, 2012 7:33 am    Post subject: Avoid reallocating memory on the GPU.

Hello.

I'm trying to parallelize a program using OpenACC (pgcc).

My program calls a certain function (let's call it function1()) many times. This function always uses the same amount of memory each time it's called.

Assume that function1() is something like:

Code:
void function1(){

#pragma acc data copyin(ArraysIn[sizeOfArrays]), copyout(ArrayOut[sizeOfArray])

#pragma acc kernels
for(){
    #pragma acc kernels
    for(){
        #pragma acc kernels
        for(){
            /*Do some work here*/
        }
    }
}
}


Since this function is called several times, each time I call it I'm allocating the arrays I need on the GPU. I want to allocate these arrays only once and reuse them so that I don't need to keep spending time allocating again.

How can I do this, bearing in mind that the compiler can't determine the size of the arrays automatically?

Will something like this work?
Code:
main{
#pragma acc data create(ArraysIn[sizeOfArrays], ArrayOut[sizeOfArray])

for(){
function1();
}
}


And then change function1() to:

Code:
void function1(){

#pragma acc update device(ArraysIn[sizeOfArrays])

#pragma acc kernels
for(){
    #pragma acc kernels
    for(){
        #pragma acc kernels
        for(){
            /*Do some work here*/
        }
    }
}

#pragma acc update host(copyout(ArrayOut[sizeOfArray]))
}


Will this work? Will it work even if this call (#pragma acc data create(ArraysIn[sizeOfArrays], ArrayOut[sizeOfArray])) is made from some other function call before function1()?
mkcolg



Joined: 30 Jun 2004
Posts: 6215
Location: The Portland Group Inc.

Posted: Mon Nov 05, 2012 4:34 pm    Post subject:

Quote:
Will this work?
This is the right idea. You just need to add a data region in function1 telling the runtime to look up where the array is on the device (via the "present" clause).

You can use the "update" in combination with the "present" clause. (Note that I made a few other small corrections)
Code:
void function1(ArraysIn, ArraysOut){

#pragma acc data present(ArraysIn[0:sizeOfArrays],ArraysOut[0:sizeOfArrays])
{
#pragma acc update device(ArraysIn[0:sizeOfArrays])

#pragma acc kernels
for(){
    for(){
        for(){
            /*Do some work here*/
        }
    }
}

#pragma acc update host(ArraysOut[0:sizeOfArrays])
}
}


Use "pcreate" (short for "present_or_create") instead of "present" if you don't know whether this routine will be called from within another data region. With just "present", you get a runtime error if the data isn't already on the device. With "pcreate", the arrays are created if they are not found.
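
To make the difference concrete, here is a minimal sketch rather than the poster's actual code. The routine name scale_by_two, the driver demo_sum, the float element type, and the array length N are all made up for illustration. The same routine is called once from inside an enclosing data region (the device arrays are reused) and once with no enclosing region (the arrays are allocated on demand by "pcreate"). Built without OpenACC support, the pragmas are ignored and the code runs on the host with the same result.

```c
#include <assert.h>

enum { N = 8 };   /* hypothetical array length */

/* Hypothetical worker: doubles each input element.
   pcreate reuses the device copies if a caller already created them,
   otherwise it allocates them for the lifetime of this data region. */
void scale_by_two(const float *in, float *out, int n)
{
    assert(n > 0);

    #pragma acc data pcreate(in[0:n], out[0:n])
    {
        /* Refresh the device copy of the input for this call. */
        #pragma acc update device(in[0:n])

        #pragma acc kernels loop independent
        for (int i = 0; i < n; ++i)
            out[i] = 2.0f * in[i];

        /* Bring the result back to the host. */
        #pragma acc update host(out[0:n])
    }
}

/* Hypothetical driver: calls the worker with and without an
   enclosing data region and returns the sum of both results. */
float demo_sum(void)
{
    float in[N], out[N];
    float sum = 0.0f;

    for (int i = 0; i < N; ++i)
        in[i] = (float)i;

    /* Case 1: the caller owns the allocation. create reserves device
       memory but does not copy anything to it. */
    #pragma acc data create(in[0:N], out[0:N])
    {
        scale_by_two(in, out, N);
    }
    for (int i = 0; i < N; ++i)
        sum += out[i];

    /* Case 2: no enclosing data region, so pcreate allocates inside
       the call. With plain "present" this call would fail at run time. */
    scale_by_two(in, out, N);
    for (int i = 0; i < N; ++i)
        sum += out[i];

    return sum;
}
```

Calling demo_sum() from main exercises both cases. If the arrays are always created by a caller, plain "present" in the worker is the stricter choice, because a missing outer data region then shows up as an error instead of a silent per-call allocation.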

Hope this helps,
Mat
JPMN



Joined: 23 Oct 2012
Posts: 8

Posted: Mon Jan 21, 2013 6:15 am    Post subject:

I'm sorry to bother you again with the same topic, but it took me some time to be able to actually test this and this doesn't seem to work at all.

This compiles correctly but when I try to run it, I get the following:
Code:
FATAL ERROR: data in PRESENT clause was not found: name=clP


clP is one of the arrays I need to allocate on the GPU.

The problem here seems to be that the main function is in one .c file and function1 is in another .c file. Also, I don't know if this has anything to do with it, but function1 is not called directly from main: main calls functionA, functionA calls functionB, and functionB calls function1.

This looks something like this:
Code:

void functionB(){function1();}

void functionA(){functionB();}

int main(int argc, char* argv[]){
for(){functionA();}
}


And function1 is what I mentioned previously.

What am I doing wrong here?

If needed I can post the actual code for these functions.
mkcolg



Joined: 30 Jun 2004
Posts: 6215
Location: The Portland Group Inc.

Posted: Mon Jan 21, 2013 8:04 am    Post subject:

Hi JPMN,

You just need to add back in the outer data region surrounding the for loop in the main routine to have the arrays copied over to the device before "present" looks them up.
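
As a rough sketch of that layout (not the poster's real code): the arrays are passed down the call chain as pointer parameters, which is an assumption, since the thread does not show how function1 receives them. The element type, the length LEN, the "+ 1" loop body, and the driver run_calls (standing in for the loop in main) are likewise invented for illustration. The outer "create" allocates once without copying, "present" in function1 only looks the arrays up, and "update device" sends whatever new host data each call needs.

```c
#include <assert.h>

enum { LEN = 4 };   /* hypothetical array length */

/* Innermost routine: the arrays must already exist on the device. */
static void function1(const float *in, float *out, int n)
{
    /* present copies nothing; it only finds the existing device arrays. */
    #pragma acc data present(in[0:n], out[0:n])
    {
        /* Send this call's input values to the device. */
        #pragma acc update device(in[0:n])

        #pragma acc kernels loop independent
        for (int i = 0; i < n; ++i)
            out[i] = in[i] + 1.0f;

        /* Fetch this call's results. */
        #pragma acc update host(out[0:n])
    }
}

/* Intermediate callers need no directives of their own. */
static void functionB(const float *in, float *out, int n) { function1(in, out, n); }
static void functionA(const float *in, float *out, int n) { functionB(in, out, n); }

/* Stand-in for the loop in main: returns the sum of all results. */
float run_calls(int calls)
{
    float in[LEN], out[LEN];
    float total = 0.0f;

    assert(calls > 0);

    /* Allocate the device arrays once, around the whole loop. */
    #pragma acc data create(in[0:LEN], out[0:LEN])
    {
        for (int c = 0; c < calls; ++c) {
            /* The host data differs from call to call. */
            for (int i = 0; i < LEN; ++i)
                in[i] = (float)(c * LEN + i);

            functionA(in, out, LEN);

            for (int i = 0; i < LEN; ++i)
                total += out[i];
        }
    }
    return total;
}
```

The point of the sketch is that the outer region only has to allocate. The per-call contents still travel through the "update" directives inside function1, so the data can change on every call.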

- Mat
JPMN



Joined: 23 Oct 2012
Posts: 8

Posted: Mon Jan 21, 2013 8:23 am    Post subject:

Hi.

First and foremost thank you for the quick reply :).

I'm not sure I understand your answer, though. There was a typo in the main function I posted last time; the actual function is:
Code:
int main(int argc, char* argv[]){

#pragma acc data create(tiPL[16*numGammaCats], clL[4*numChars*numGammaCats], tiPR[16*numGammaCats], clR[4*numChars*numGammaCats], clP[4*numChars*numGammaCats], tiPA[16*numGammaCats], clA[4*numChars*numGammaCats], clP[4*numChars*numGammaCats])
{
for(){functionA();}
}
}


And then, in function1, I have the present clause followed by an update device directive to transfer the data into these arrays.

Are you saying that I need to have the data copied to the device before I can use the present clause? If that's the case, it would ruin this code, because function1 always uses arrays of the same type and size, but the data in the arrays differs from one function call to the next.