pilot117
Joined: 25 Jun 2012 Posts: 8
|
Posted: Wed Aug 08, 2012 2:32 pm Post subject: thread private common block |
|
|
If a common block is declared threadprivate, can I use OpenACC to update the arrays defined in that common block?
I ran some tests, and it seems that whenever the common block is threadprivate, any update by an OpenACC kernel results in a runtime error:
| Code: |
line xxxx: cudaEventSynchronize returned status 4: unspecified launch failure
|
thanks |
|
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
|
Posted: Wed Aug 08, 2012 3:41 pm Post subject: |
|
|
Hi pilot117,
There are a couple of features not yet available. Threadprivate is one, host_data and device resident the others.
- Mat |
|
pilot117
Joined: 25 Jun 2012 Posts: 8
|
Posted: Wed Aug 08, 2012 4:08 pm Post subject: |
|
|
Hi Mat, many thanks for the answer. What do you mean by "host_data and device resident the others"?
By the way, could you explain the reason in a bit more detail?
My understanding is that, for a threadprivate common block, each thread has its own copy. When I do something like this:
| Code: |
      subroutine mysub()
      integer N,i
      parameter (N=1048576)
      common/blk/t1(N),t2(N),t3(N)
c$omp threadprivate (/blk/)
!$acc kernels
!$acc loop
      do i=1,N
         t2(i)=t1(i)*t1(i)+t2(i)*t3(N)
      enddo
!$acc end kernels
      return
      end
|
Will OpenACC see different copies of t2, or do the multiple copies cause the problem? Reading t1, t2, and t3 seems to be fine; it is the write to t2 that causes the problem. Even when I use only 1 thread, I still get the same runtime error, which suggests that the cause lies somewhere deep inside the OpenACC implementation. If you could explain it or direct me to a reference, that would be great!
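To make the "one copy per thread" point concrete, here is a minimal host-only sketch with no OpenACC involved. The program name tpdemo and the tiny array size are made up for illustration, and it assumes a compiler with OpenMP enabled. Every thread except the master writes its thread number into its own copy of the common block:

```fortran
      program tpdemo
      use omp_lib
      implicit none
      integer N
      parameter (N=4)
      real t1(N)
      integer tid
      common/blk/t1
c$omp threadprivate (/blk/)
c     The initial (master) thread fills its own copy of /blk/.
      t1 = -1.0
c$omp parallel private(tid)
      tid = omp_get_thread_num()
c     Every thread except the master writes to its private copy.
      if (tid .ne. 0) t1 = real(tid)
c$omp critical
      print *, 'thread', tid, 'sees t1(1) =', t1(1)
c$omp end critical
c$omp end parallel
c     The master copy is untouched by the other threads' writes.
      print *, 'master after region, t1(1) =', t1(1)
      end program tpdemo
```

If threadprivate behaves as I understand it, the master thread should still see -1.0 after the parallel region, while every other thread sees its own thread number. This only shows the host-side behavior; it says nothing about which copy an OpenACC region would pick up.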
|
|
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
|
Posted: Fri Aug 10, 2012 9:08 am Post subject: |
|
|
Hi pilot117,
I didn't realize you meant the OpenMP threadprivate. OpenACC also has a "threadprivate" directive but it's still in development.
I'm away at a conference, so I will ask one of our other application engineers to investigate using an OpenMP threadprivate variable within an OpenACC compute region. I've not tried it before.
- Mat |
|
pilot117
Joined: 25 Jun 2012 Posts: 8
|
Posted: Fri Aug 10, 2012 9:19 am Post subject: |
|
|
Many thanks! Here are my example and compile command, in case you need a quick test case:
| Code: |
      subroutine mysub()
      integer N,i
      parameter (N=1048576)
      common/blk/t1(N),t2(N),t3(N)
c$omp threadprivate (/blk/)
!$acc update device(t1,t2,t3)
!$acc kernels
!$acc loop
      do i=1,N
         t2(i)=t1(i)*t1(i)+t2(i)*t3(N)
      enddo
!$acc end kernels
!$acc update host(t2)
      return
      end
      program mainTest
      integer N,i
      parameter (N=1048576)
      real t1(1:N),t2(1:N),t3(1:N)
      common/blk/t1,t2,t3
!$acc mirror(t1,t2,t3)
      do i=1,N
         CALL RANDOM_NUMBER(HARVEST=X)
         t1(i)=X
         CALL RANDOM_NUMBER(HARVEST=X)
         t2(i)=X
         CALL RANDOM_NUMBER(HARVEST=X)
         t3(i)=X
      enddo
      do j=1,10
         call mysub()
      end do
      end program mainTest
|
The compile command:
| Code: |
pgf90 -o test -acc -mp -ta=nvidia:cc2.0,time -Minfo=accel -Mcuda -Mvect simpleTest.f
|
Here is my output:
| Code: |
pilot@mars:~/Codes/test$ ./test
line 9: cudaEventSynchronize returned status 4: unspecified launch failure
Accelerator Kernel Timing data
/home/pilot/Codes/test/simpleTest.f
mysub
7: region entered 1 time
time(us): init=0
data=12
9: kernel launched 1 times
grid: [8192] block: [128]
time(us): total=0 max=0 min=0 avg=0
/home/pilot/Codes/test/simpleTest.f
mysub
6: region entered 1 time
time(us): init=0
data=3,441
/home/pilot/Codes/test/simpleTest.f
maintest
23: region entered 1 time
time(us): init=135,083
|
|
|