Neldan
Joined: 12 Feb 2013 Posts: 11
Posted: Tue Mar 05, 2013 11:30 am
I just updated to the newest version of pgcc and it now seems to work, but I'm still having a problem at run time.
During execution the program prints an "Invalid handle" error.
My code is this:
| Code: |
int sizeR = numRows1*numRows2;
#pragma omp parallel num_threads(2) private(result)
{
    int th= omp_get_thread_num();
#if _OPENACC
    acc_init(acc_device_nvidia);
    acc_set_device_num(th+1,acc_device_nvidia);
#endif
    fprintf(stdout,"THREAD(%d) - Launched thread.\n",th);
    fprintf(stdout,"THREAD(%d) - Selected device: %d\n",th,acc_get_device_num(acc_device_nvidia));
    int bI = th*(numRows1/2);
    int eI = numRows1/((!th)+1);
    fprintf(stdout,"THREAD(%d) - begin I: %d, end I: %d\n",th,bI,eI);
    int bR = th*(sizeR/2);
    int eR = (sizeR/((!th)+1));
    fprintf(stdout,"THREAD(%d) - size R: %d, begin R: %d, end R: %d\n",th,sizeR,bR,eR);
    result = &result[bR];
    #pragma acc kernels copyin(m1[0:numRows1*numColumns1],m2[0:numRows2*numColumns2]), copyout(result[0:eR-bR])
    {
        int i = bI;
        #pragma acc loop gang vector(256), independent
        for (i=0;i<eI;i++)
        {
            int j;
            #pragma acc loop gang vector(2) independent
            for(j=0;j<numRows2;j++)
            {
                real_t acum = 0;
                int k;
                for(k=0;k<numColumns1;k++) {
                    acum += m1[i+k*numColumns1] * m2[j*numColumns2+k];
                }
                result[(i-bI)*numRows1+j] = acum;
            }
        }
    }
}
I use a matrix size of 5000x5000, and the output is this:
| Quote: |
THREAD(0) - Launched thread.
THREAD(0) - Selected device: 1
THREAD(0) - begin I: 0, end I: 50
THREAD(0) - size R: 10000, begin R: 0, end R: 5000
THREAD(1) - Launched thread.
THREAD(1) - Selected device: 2
THREAD(1) - begin I: 50, end I: 100
THREAD(1) - size R: 10000, begin R: 5000, end R: 10000
call to cuLaunchKernel returned error 400: Invalid handle
call to cuMemFree returned error 700: Launch failed
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
Posted: Tue Mar 05, 2013 12:09 pm
Hi Neldan,
Unfortunately, all this tells me is that the kernel failed for some reason. To narrow down the issue, can you try running with a single OpenMP thread? Also, try removing the schedule clauses, i.e., the gang and vector clauses, and let the compiler schedule the loops.
- Mat
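As a minimal sketch of the suggestion above (dropping the gang and vector clauses so the compiler chooses the schedule), the loop nest could look like the following. The function name multiply_rows, the float definition of real_t, and the row-major layout of both inputs are assumptions for illustration; they are not taken from Neldan's code.

```c
#include <assert.h>

typedef float real_t; /* assumption: the thread never shows how real_t is defined */

/* Hypothetical helper: result[i*rows2 + j] = dot(row i of m1, row j of m2).
 * Both inputs are assumed to be row-major with `cols` columns each. */
void multiply_rows(const real_t *m1, const real_t *m2, real_t *result,
                   int rows1, int rows2, int cols)
{
    assert(rows1 > 0 && rows2 > 0 && cols > 0);

    /* No gang/vector clauses here: the compiler is left to pick the schedule. */
    #pragma acc kernels copyin(m1[0:rows1*cols], m2[0:rows2*cols]) copyout(result[0:rows1*rows2])
    {
        #pragma acc loop independent
        for (int i = 0; i < rows1; i++) {
            #pragma acc loop independent
            for (int j = 0; j < rows2; j++) {
                real_t acum = 0;
                for (int k = 0; k < cols; k++) {
                    acum += m1[i * cols + k] * m2[j * cols + k];
                }
                result[i * rows2 + j] = acum;
            }
        }
    }
}
```

Built without OpenACC support, the pragmas should simply be ignored and the loops run on the host, which gives a reference result to compare against the accelerated run.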
Neldan
Joined: 12 Feb 2013 Posts: 11
Posted: Tue Mar 05, 2013 12:22 pm
| mkcolg wrote: |
Hi Neldan,
Unfortunately, all this tells me is that the kernel failed for some reason. To narrow down the issue, can you try running with a single OpenMP thread? Also, try removing the schedule clauses, i.e., the gang and vector clauses, and let the compiler schedule the loops.
- Mat

With a single OpenMP thread, the kernel works fine.
Neldan
Joined: 12 Feb 2013 Posts: 11
Posted: Wed Mar 06, 2013 10:38 am
I have been doing some tests using 'fork' instead of OpenMP, and it works fine. So I think the problem is with calling the kernel from OpenMP.
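A process-per-device layout like the one described above might be sketched as follows. This is not Neldan's test program, which was not posted. The names run_workers and split_rows are hypothetical, the device index passed to acc_set_device_num is an assumption (the code earlier in the thread passes th+1), and each child would still need a pipe or shared memory to hand its results back to the parent.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: rows [*begin, *end) owned by `worker` out of `nworkers`. */
void split_rows(int worker, int nworkers, int total, int *begin, int *end)
{
    assert(nworkers > 0 && worker >= 0 && worker < nworkers && total >= 0);
    *begin = (int)((long)worker * total / nworkers);
    *end   = (int)((long)(worker + 1) * total / nworkers);
}

/* Hypothetical driver: fork one child per device.
 * Returns 0 if every child was launched and exited with status 0, else -1. */
int run_workers(int nworkers, int total_rows)
{
    int launched = 0;
    int failures = 0;

    for (int w = 0; w < nworkers; w++) {
        pid_t pid = fork();
        if (pid < 0) {              /* fork failed: stop launching more children */
            failures++;
            break;
        }
        if (pid == 0) {             /* child: owns one device and one slice of rows */
            int begin, end;
#ifdef _OPENACC
            /* Assumption: device index w; the numbering base depends on the runtime. */
            acc_set_device_num(w, acc_device_nvidia);
#endif
            split_rows(w, nworkers, total_rows, &begin, &end);
            /* The accelerated kernel for rows [begin, end) would run here. */
            _exit(begin <= end ? 0 : 1);
        }
        launched++;
    }

    /* Parent: reap every child that was started and count abnormal exits. */
    for (int n = 0; n < launched; n++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failures++;
    }
    return failures == 0 ? 0 : -1;
}
```

Because each child is a separate process with its own address space, no accelerator runtime state is shared between the two workers, unlike the two OpenMP threads in the original code.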
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
Posted: Wed Mar 06, 2013 12:01 pm
| Quote: |
I have been doing some tests using 'fork' instead of OpenMP, and it works fine. So I think the problem is with calling the kernel from OpenMP.

OK. Can you send a reproducible example to PGI Customer Service (trs@pgroup.com) so we can determine the issue?
Thanks,
Mat