PGI User Forum


EventSynchronize error

 
Pebbles



Joined: 24 Sep 2010
Posts: 13

Posted: Tue Dec 21, 2010 4:08 pm    Post subject: EventSynchronize error

Hi,

When I execute the following code, I am getting an EventSynchronize error. I believe it has something to do with the IEND_MEMBER array.

PROGRAM TESTPGI
USE CONSTANTSPGI

INTEGER :: I, J, K, KK, IX, IY, IWL, OUTERLOOP, OUTERLOOPMAX, NX, NY, NWL, NB
INTEGER (KIND=KI4), ALLOCATABLE :: IEND_MEMBER(:,:)
REAL, ALLOCATABLE :: P_RAD(:,:,:), M_RAD(:,:)
REAL BL

NX = 753
!NY = 1924
NY = 500
NWL = 224
NB = 64

OUTERLOOPMAX = NX * NY

ALLOCATE(M_RAD(NWL,NB),STAT=IOS)
ALLOCATE(P_RAD(NX,NY,NWL),STAT=IOS)
ALLOCATE(IEND_MEMBER(NX,NY),STAT=IOS)

DO I = 1,NWL
DO J = 1,NX
DO K = 1,NY
P_RAD(J,K,I) = .005 ** 2
END DO
END DO
END DO

DO I = 1,NWL
DO J = 1,NB
M_RAD(I,J) = .003 ** 2
END DO
END DO

!$acc region
!$acc do private( IEND_MEMBER )
DO OUTERLOOP = 1,OUTERLOOPMAX
BL = 0.0
KK = (OUTERLOOP - 1 )/NX
IY = KK+1
IX = OUTERLOOP - KK/NX

DO IWL = 1, NWL
IF ( P_RAD(IX,IY,IWL) > 0.0 ) BL=BL + P_RAD(IX,IY,IWL)**2
END DO
IF ( BL < EPSMIN4 ) THEN
IEND_MEMBER(IX,IY)=255
BL = 1.0
ELSE
IEND_MEMBER(IX,IY)=254
END IF
END DO
!$acc end region

END PROGRAM TESTPGI

testpgi:
39, Generating copyin(pixel_radiance(:,:,1:224))
Generating compute capability 1.0 binary
Generating compute capability 1.3 binary
41, Loop is parallelizable
Accelerator kernel generated
41, !$acc do parallel, vector(256)
CC 1.0 : 9 registers; 20 shared, 84 constant, 0 local memory bytes; 100 occupancy
CC 1.3 : 9 registers; 20 shared, 84 constant, 0 local memory bytes; 100 occupancy
47, Loop is parallelizable

launch kernel file=/home/users/elliott/HypGP/testPGI.f95 function=testpgi line=41 device=0 grid=1471 block=256
call to EventSynchronize returned error 700: Launch failed
CUDA driver version: 3010

Accelerator Kernel Timing data
/home/users/elliott/HypGP/testPGI.f95
testpgi
39: region entered 1 time
time(us): init=151428
data=81636
41: kernel launched 1 times
grid: [1471] block: [256]
time(us): total=0 max=0 min=0 avg=0
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Tue Dec 21, 2010 6:48 pm

Hi Pebbles,

The issue is that you're getting a segmentation violation because IX is being set to 18443, which is beyond the bounds of IEND_MEMBER's first dimension. This error occurs on the host and is evident when you compile with debugging enabled (-g) and run within the debugger (PGDBG). Once your program works on the host, enable the accelerator directives.
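
The index arithmetic can be checked on the host in isolation. With NX = 753 and NY = 500, KK never exceeds 499, so the integer division KK/NX in the posted code is always 0 and IX simply follows OUTERLOOP past NX. The sketch below assumes the intended expression was KK*NX; that is a reading of the code, not something the original poster confirmed, and the module and routine names are made up for illustration.

```fortran
MODULE INDEX_MAP_SKETCH
  IMPLICIT NONE
CONTAINS
  ! Map a flat 1-based counter N onto (IX, IY), with IX varying fastest.
  ! For 1 <= N <= NX*NY this yields 1 <= IX <= NX and 1 <= IY <= NY.
  PURE SUBROUTINE FLAT_TO_2D(N, NX, IX, IY)
    INTEGER, INTENT(IN)  :: N, NX
    INTEGER, INTENT(OUT) :: IX, IY
    INTEGER :: KK
    KK = (N - 1) / NX      ! zero-based row index (integer division)
    IY = KK + 1
    IX = N - KK * NX       ! assumed fix: the posted code has KK/NX here
  END SUBROUTINE FLAT_TO_2D

  ! .TRUE. if every counter in 1..NX*NY maps inside an NX-by-NY array.
  PURE LOGICAL FUNCTION ALL_IN_BOUNDS(NX, NY)
    INTEGER, INTENT(IN) :: NX, NY
    INTEGER :: N, IX, IY
    ALL_IN_BOUNDS = .TRUE.
    DO N = 1, NX * NY
      CALL FLAT_TO_2D(N, NX, IX, IY)
      IF (IX < 1 .OR. IX > NX .OR. IY < 1 .OR. IY > NY) THEN
        ALL_IN_BOUNDS = .FALSE.
        RETURN
      END IF
    END DO
  END FUNCTION ALL_IN_BOUNDS
END MODULE INDEX_MAP_SKETCH
```

Calling ALL_IN_BOUNDS(753, 500) from a small host program exercises the whole OUTERLOOP range without touching the arrays or the accelerator.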

Code:
!$acc do private( IEND_MEMBER )

Be careful here. Privatizing an array means that every thread gets its own copy. Not only will this use a lot of memory, but the copies are also discarded once the kernel has finished, so anything written to them is lost.

The problem with the code is that you are using computed indices, and the compiler is unable to determine that all values of IX and IY are unique. The compiler must be conservative and not parallelize the loop. If you know that all iterations are independent, you can use the independent clause to tell the compiler to skip its dependency analysis and parallelize the loop anyway.

Code:
!$acc do independent


The caveat is that if there is a loop dependency, then you will get non-deterministic results.
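
A minimal sketch of where the clause would sit, using the directive spelling from this thread. It assumes IX is computed as OUTERLOOP - KK*NX, so that each iteration writes a distinct element of IEND_MEMBER; that is what makes the independent promise safe here. EPS stands in for EPSMIN4 from the CONSTANTSPGI module, which is not shown, and the module and routine names are hypothetical. A compiler without accelerator support treats the directives as plain comments.

```fortran
MODULE FLAT_LOOP_SKETCH
  IMPLICIT NONE
CONTAINS
  SUBROUTINE CLASSIFY_FLAT(P_RAD, IEND_MEMBER, EPS)
    REAL,    INTENT(IN)  :: P_RAD(:,:,:)       ! (NX, NY, NWL)
    INTEGER, INTENT(OUT) :: IEND_MEMBER(:,:)   ! (NX, NY), shared, not private
    REAL,    INTENT(IN)  :: EPS                ! stand-in for EPSMIN4
    INTEGER :: NX, NY, NWL, OUTERLOOP, KK, IX, IY, IWL
    REAL :: BL

    NX  = SIZE(P_RAD, 1)
    NY  = SIZE(P_RAD, 2)
    NWL = SIZE(P_RAD, 3)

    !$acc region
    !$acc do independent
    DO OUTERLOOP = 1, NX * NY
      KK = (OUTERLOOP - 1) / NX
      IY = KK + 1
      IX = OUTERLOOP - KK * NX   ! assumed fix; unique (IX, IY) per iteration
      BL = 0.0
      DO IWL = 1, NWL
        IF (P_RAD(IX,IY,IWL) > 0.0) BL = BL + P_RAD(IX,IY,IWL)**2
      END DO
      ! The posted code also resets BL to 1.0 in the first branch;
      ! nothing reads it afterwards in the posted snippet, so it is left out.
      IF (BL < EPS) THEN
        IEND_MEMBER(IX,IY) = 255
      ELSE
        IEND_MEMBER(IX,IY) = 254
      END IF
    END DO
    !$acc end region
  END SUBROUTINE CLASSIFY_FLAT
END MODULE FLAT_LOOP_SKETCH
```

Note that IEND_MEMBER is left shared here: every thread writes its own element of the one array, which is then available after the region ends.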

Hope this helps,
Mat
sindimo



Joined: 30 Nov 2010
Posts: 29
Location: Saudi Aramco

Posted: Sat Jan 08, 2011 11:16 pm

Dear Mat,

I was getting the error below whenever I used the "private" clause for some of the arrays I am processing. The program runs out of memory immediately when I launch it:

Code:

call to cuMemAlloc returned error 2: Out of memory
CUDA driver version: 3010

Accelerator Kernel Timing data
code.f
      333: region entered 18 times
        time(us): total=29790 init=3 region=29787
                  kernels=1570 data=0
        w/o init: total=29787 max=2658 min=1655 avg=1654
        334: kernel launched 17 times
            grid: [1]  block: [2]
            time(us): total=1570 max=102 min=90 avg=92
code.f
    216: region entered 1 time
        time(us): init=1042035
                  data=1325615


I am guessing this could be, as you explained, due to each thread having its own copy of the privatized arrays?

What about when we use the "independent" clause, does that somehow maintain multiple copies of the arrays as well?

When I use "independent" the program runs for a while then eventually runs out of memory again. With "private", it runs out of memory immediately.

I just want to understand the difference between "private" and "independent" in terms of memory usage.

Thanks for your help.

Mohamad Sindi
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Mon Jan 10, 2011 1:58 pm

Hi Mohamad Sindi,

Quote:
I just want to understand the difference between "private" and "independent" in terms of memory usage.
By default, arrays are assumed to be shared by all threads. "Private" overrides the default, causing each thread to get its own temporary copy of the variable. So if your arrays are large, then yes, you will run out of memory.
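
As rough arithmetic, take the first program in this thread: IEND_MEMBER(753,500) and a reported launch of grid=1471, block=256. The sketch below assumes KI4 denotes a 4-byte integer (CONSTANTSPGI is not shown) and that every launched thread really would receive a full copy of the array. The module and function names are made up for illustration.

```fortran
MODULE PRIVATE_COST_SKETCH
  USE, INTRINSIC :: ISO_FORTRAN_ENV, ONLY: INT64
  IMPLICIT NONE
CONTAINS
  ! Bytes needed if each of NTHREADS threads holds its own NX-by-NY array.
  ! 64-bit arithmetic is used because the product overflows a default integer.
  PURE FUNCTION PRIVATE_BYTES(NX, NY, ELEM_BYTES, NTHREADS) RESULT(TOTAL)
    INTEGER, INTENT(IN) :: NX, NY, ELEM_BYTES, NTHREADS
    INTEGER(INT64) :: TOTAL
    TOTAL = INT(NX, INT64) * INT(NY, INT64) &
          * INT(ELEM_BYTES, INT64) * INT(NTHREADS, INT64)
  END FUNCTION PRIVATE_BYTES
END MODULE PRIVATE_COST_SKETCH
```

Under those assumptions, PRIVATE_BYTES(753, 500, 4, 1471*256) comes to 567,123,456,000 bytes, roughly 567 GB, against about 1.5 MB for a single shared copy.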

"independent" overrides the compiler's dependency analysis. This allows you to force the compiler to accelerate your code even if it finds issues that would otherwise prevent it. "independent" doesn't add any additional memory overhead.

The question for you is why do you need to use these clauses? Have you tried to remove the dependencies that are preventing acceleration?
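
One way to remove computed indices, sketched on the first program in this thread: if the flat OUTERLOOP counter exists only to visit every (IX, IY) pair once, looping over IY and IX directly makes the subscripts the loop variables themselves. EPS stands in for EPSMIN4 (not shown in the thread) and the names are hypothetical. Whether the compiler then parallelizes both loops without extra clauses is something to confirm in its -Minfo=accel messages rather than assume.

```fortran
MODULE NESTED_LOOP_SKETCH
  IMPLICIT NONE
CONTAINS
  SUBROUTINE CLASSIFY_NESTED(P_RAD, IEND_MEMBER, EPS)
    REAL,    INTENT(IN)  :: P_RAD(:,:,:)       ! (NX, NY, NWL)
    INTEGER, INTENT(OUT) :: IEND_MEMBER(:,:)   ! (NX, NY)
    REAL,    INTENT(IN)  :: EPS                ! stand-in for EPSMIN4
    INTEGER :: NX, NY, NWL, IX, IY, IWL
    REAL :: BL

    NX  = SIZE(P_RAD, 1)
    NY  = SIZE(P_RAD, 2)
    NWL = SIZE(P_RAD, 3)

    !$acc region
    DO IY = 1, NY
      DO IX = 1, NX            ! IX innermost: contiguous in Fortran storage
        BL = 0.0
        DO IWL = 1, NWL
          IF (P_RAD(IX,IY,IWL) > 0.0) BL = BL + P_RAD(IX,IY,IWL)**2
        END DO
        IF (BL < EPS) THEN
          IEND_MEMBER(IX,IY) = 255
        ELSE
          IEND_MEMBER(IX,IY) = 254
        END IF
      END DO
    END DO
    !$acc end region
  END SUBROUTINE CLASSIFY_NESTED
END MODULE NESTED_LOOP_SKETCH
```

Written this way, no private or independent clause appears at all, and there is no index arithmetic left to get wrong.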

Go Ducks!
Mat