PGI User Forum

PGI accelerator model with OpenMP/MPI
tannguyen



Joined: 26 Jul 2010
Posts: 11

Posted: Wed Sep 08, 2010 12:52 pm    Post subject: PGI accelerator model with OpenMP/MPI

Hi, I am testing how the PGI accelerator model works with OpenMP and MPI. I realize that we have to specify the number of threads/processes statically in the code. Here is an example with OpenMP:

Num_GPUs = 2;
#pragma omp parallel num_threads(2)
{
    acc_set_device_num(omp_get_thread_num() % Num_GPUs, acc_device_default);
    low = omp_get_thread_num() * N / 2;
    high = low + N / 2;
    #pragma acc region
    {
        for (i = low; i < high; i++) {...}
    }
}

Could someone tell me how to do something similar with MPI? The following code does not work.

MPI_Comm_rank(MPI_COMM_WORLD, &rank);
int Numprocs = 2;
Num_GPUs = 2;
#pragma acc region
{
    acc_set_device_num(rank % Num_GPUs, acc_device_default);
    low = rank * (N / Numprocs);
    high = low + N / Numprocs;
    #pragma acc region
    {
        for (i = low; i < high; i++) {...}
    }
}

The message is:

93, Accelerator restriction: size of the GPU copy of an array depends on values computed in this loop
Accelerator region ignored


Thank you.
Tan.
mkcolg



Joined: 30 Jun 2004
Posts: 5815
Location: The Portland Group Inc.

Posted: Wed Sep 08, 2010 3:40 pm

Hi Tan,

Take out the outer acc region and it should work.
Code:
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
int Numprocs = 2;
Num_GPUs = 2;
#pragma acc region   <<<< Take this out
{
    acc_set_device_num(rank % Num_GPUs, acc_device_default);
    low = rank * (N / Numprocs);
    high = low + N / Numprocs;
    #pragma acc region
    {
        for (i = low; i < high; i++) {...}
    }
}


Hope this helps,
Mat
tannguyen



Joined: 26 Jul 2010
Posts: 11

Posted: Wed Sep 08, 2010 4:13 pm

Sorry for the mistake. In the actual code I use only one directive.

#pragma acc region
acc_set_device_num(rank%2, acc_device_default);
low = rank*(N / 2);
high = low + N/2;
for(i=low; i< high; i++)
vector[i]= vector1[i] * vector2[i];
}

As you suggested, I took out the "acc_set_device_num" call, but the error is still there.

94, Accelerator restriction: size of the GPU copy of an array depends on values computed in this loop
Accelerator region ignored


Tan.
tannguyen



Joined: 26 Jul 2010
Posts: 11

Posted: Wed Sep 08, 2010 4:20 pm

I just want to correct the code from my previous post:

#pragma acc region
{
    acc_set_device_num(rank % 2, acc_device_default);
    low = rank * (N / 2);
    high = low + N / 2;
    for (i = low; i < high; i++)
        vector[i] = vector1[i] * vector2[i];
}
mkcolg



Joined: 30 Jun 2004
Posts: 5815
Location: The Portland Group Inc.

Posted: Wed Sep 08, 2010 4:24 pm

Hi Tan,

You accidentally removed the inner pragma, not the outer one.

The problem is that the compiler implicitly copies your array to the GPU at the start of an accelerator region. However, how much of the array to copy is determined from the loop bound variables 'low' and 'high', which are themselves computed inside the accelerator region. Hence, at the point where the region starts, the compiler doesn't know how much of the array to copy over. Compute 'low' and 'high' (and select the device) before the region:

Code:
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
int Numprocs = 2;
Num_GPUs = 2;
acc_set_device_num(rank % Num_GPUs, acc_device_default);
low = rank * (N / Numprocs);
high = low + N / Numprocs;
#pragma acc region   // implicitly copy the array once low and high are known.
{
    for (i = low; i < high; i++) {...}
}
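
The bounds computed above assume that N divides evenly by Numprocs; otherwise the last N % Numprocs elements are not assigned to any rank. A minimal sketch of one way to handle the remainder, in plain C with no MPI or accelerator calls. `block_bounds` is a hypothetical helper introduced here for illustration, not part of the PGI or MPI APIs:

```c
#include <assert.h>

/* Hypothetical helper (not from the thread): computes the half-open range
   [*low, *high) owned by `rank` when n elements are split across nprocs
   ranks. The first (n % nprocs) ranks each take one extra element, so every
   element is assigned to exactly one rank. */
void block_bounds(int rank, int nprocs, int n, int *low, int *high)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs && n >= 0);

    int base  = n / nprocs;   /* elements every rank gets */
    int extra = n % nprocs;   /* leftover elements to spread out */

    *low  = rank * base + (rank < extra ? rank : extra);
    *high = *low + base + (rank < extra ? 1 : 0);
}
```

In the MPI code this would be called as block_bounds(rank, Numprocs, N, &low, &high) before the accelerator region, so that both bounds are still known when the region starts.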


Note that you can also use the copy clauses ('copy', 'copyin', 'copyout') to define how much of the array to copy over.

Code:
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
int Numprocs = 2;
Num_GPUs = 2;
acc_set_device_num(rank % Num_GPUs, acc_device_default);
low = rank * (N / Numprocs);
high = low + N / Numprocs;
#pragma acc region  copyin(myarrayname[0:N-1])
{
    for (i = low; i < high; i++) {...}
}
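
The copy clauses could also be limited to the slice that each rank actually touches, rather than the whole array. The following is only a sketch under assumptions: `multiply_slice` is a hypothetical function name, the sub-array bounds use the same inclusive [lower:upper] form as the example above, and whether a non-zero lower bound is accepted here should be checked against the PGI documentation for your compiler version. A compiler without PGI accelerator support ignores the pragma and runs the loop on the host.

```c
#include <assert.h>

/* Hypothetical helper: out[i] = a[i] * b[i] for i in [low, high).
   low and high are function arguments, so they are known before the
   accelerator region begins and the copied sections have a computable size. */
void multiply_slice(float *restrict out, const float *restrict a,
                    const float *restrict b, int low, int high)
{
    int i;

    assert(low <= high);

    /* Assumed PGI accelerator syntax: inclusive [lower:upper] sections.
       Inputs are copied to the device, the result slice is copied back. */
    #pragma acc region copyin(a[low:high-1], b[low:high-1]) copyout(out[low:high-1])
    {
        for (i = low; i < high; i++)
            out[i] = a[i] * b[i];
    }
}
```

With the variables from the thread, each rank would call multiply_slice(vector, vector1, vector2, low, high) after acc_set_device_num and after computing its own low and high.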


- Mat
All times are GMT - 7 Hours