PGI User Forum
Using multiple GPUs
PGI User Forum Forum Index -> Accelerator Programming
KarlW



Joined: 12 Jan 2009
Posts: 23

Posted: Thu Jul 30, 2009 1:03 am    Post subject: Using multiple GPUs

Hi,
I'm trying to run different code simultaneously on two GPUs. I'm under the impression that this requires OpenMP, but I can't seem to get the code to work.
pgaccelinfo picks up both devices, and I am using the most recent CUDA drivers, etc.

The code at the bottom of the post results in the following messages at run time and then freezes:

call gpu code
number of threads: 2
Section 1, thread: 0
test 1
number of threads: 2
Section 2, thread: 1
no devices found, exiting
launch kernel file=gpu_xyzint_1_openmptest.f90 function=gpu_xyzint_1 line=969 grid=1 block=15

!$OMP PARALLEL SHARED(pint,qint,rint)
   tid = OMP_GET_THREAD_NUM()
   if (tid.eq.0) then
      nthreads = OMP_GET_NUM_THREADS()
   end if
   print *, 'number of threads:', nthreads
!$OMP SECTIONS
!$OMP SECTION
   print *, 'Section 1, thread:', OMP_GET_THREAD_NUM()
   print *, 'test 1'
   call acc_set_device_num(0,acc_device_default)
!$acc region
!$acc do
   do i=1,15
      pint(i) = 0
      qint(i) = 0
      rint(i) = 0
   end do
!$acc end region
!  call gpucode(ngpu,lgpu)
!$OMP SECTION
   print *, 'Section 2, thread:', OMP_GET_THREAD_NUM()
   call acc_set_device_num(1,acc_device_default)
!$acc region
!$acc do
   do i=16,31
      pint(i) = 0
      qint(i) = 0
      rint(i) = 0
   end do
!$acc end region
!  call gpucode(ngpu,lgpu)
!$OMP END SECTIONS NOWAIT
!$OMP END PARALLEL

As you can see, I have replaced the call to a separate accelerated subroutine with some simple code. Would the call work when used in this way?


Many thanks,

Karl
mkcolg



Joined: 30 Jun 2004
Posts: 6208
Location: The Portland Group Inc.

Posted: Thu Jul 30, 2009 2:02 pm    Post subject:

Hi Karl,

Unfortunately, support for using accelerator regions within OpenMP parallel regions is not in 9.0 yet (see http://www.pgroup.com/userforum/viewtopic.php?t=1490). We are actively working on adding it and, if all goes well, expect preliminary support in September's 9.0-4 monthly release.

That said, I'm not sure where the "no devices found, exiting" error is coming from. I worked up a small test case from your sample, but I get the error "libcuda.so not found, exiting" instead. You're welcome to send me the code and I'll see what's going on.

Thanks,
Mat
KarlW



Joined: 12 Jan 2009
Posts: 23

Posted: Thu Jul 30, 2009 10:27 pm    Post subject:

Hi Mat,

I was initially getting the libcuda.so error, but it went away once I installed the latest CUDA version. I had thought that issue arose from the installation of the second GPU, though.

I've emailed the code to you.

Many thanks,

Karl

edit:
Is there any other way to run different !$acc regions simultaneously on different GPUs?
Also, are there examples anywhere on running a normal region on multiple GPUs?

Cheers!
mkcolg



Joined: 30 Jun 2004
Posts: 6208
Location: The Portland Group Inc.

Posted: Fri Jul 31, 2009 1:36 pm    Post subject:

Hi Karl,

Quote:
Is there any other way to run different !$acc regions simultaneously on different GPUs?


You can use MPI.
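
A rough sketch of what I mean (untested; it assumes one MPI rank per GPU, and acc_set_device_num / acc_get_num_devices come from the accel_lib module):

Code:
program mpi_acc_test
use accel_lib
implicit none
include 'mpif.h'
integer :: rank, ierr, ndev, i
real :: a(1000)
call MPI_Init(ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
! bind this rank to one device before entering any accelerator region
ndev = acc_get_num_devices(acc_device_nvidia)
call acc_set_device_num(mod(rank,ndev), acc_device_nvidia)
!$acc region
do i = 1, 1000
   a(i) = i * rank
end do
!$acc end region
call MPI_Finalize(ierr)
end program mpi_acc_test

Each rank then executes its own accelerator regions independently on its own device; the ranks only need MPI calls if they have data to exchange.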

Quote:
Also, are there examples anywhere on running a normal region on multiple GPUs?


Beyond using MPI (and, in the future, OpenMP or pthreads) to run separate regions on separate devices, we do not support dividing a single accelerator region across multiple devices. This is an evolving model, though, so that may become possible in the future.

- Mat
TheMatt



Joined: 06 Jul 2009
Posts: 322
Location: Greenbelt, MD

Posted: Fri Aug 07, 2009 11:47 am    Post subject:

mkcolg wrote:
Hi Karlw,

Quote:
Is there any other way to run different !$acc regions simultaneously on different GPUs?


You can use MPI.

- Mat

Okay, Mat, I have a question, now. How does one use MPI and !$acc together?

I currently have a big, big program that uses MPI, and I'm thinking of accelerating a small part of it, way down in the code tree, that accounts for 25-30% of the CPU time (and it should be fairly CUDA-friendly: no intercommunication, etc.).

The CUDA testbed I'm using has 4 CPUs and a Tesla S1070 (= 4 GPUs), so I have a nice one-to-one ratio. If I used the accelerator pragmas and ran this with mpirun -np 4, would it "automagically" have rank n use GPU n, or do I need to add additional logic to the code?

I'm assuming the latter, and so is there an example PGI has that shows how to do that? (Of course, I'm hoping for the former!)
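
The extra logic I'm assuming would be something like this right after MPI_Init (untested, just my guess at the accel_lib calls):

Code:
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
! round-robin rank-to-GPU mapping
ndev = acc_get_num_devices(acc_device_nvidia)
call acc_set_device_num(mod(rank,ndev), acc_device_nvidia)

With -np 4 that would give each rank its own GPU, if the device numbering cooperates.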

Matt
Page 1 of 2