PGI User Forum

Host memory usage increasing
sslgamess



Joined: 23 Nov 2009
Posts: 35

Posted: Sun Dec 11, 2011 10:27 am    Post subject: Host memory usage increasing

I'm experimenting with the PGI Accelerator directives. I noticed that when I run the accelerator version of the code below, host memory usage increases linearly until the program finishes.

Could someone explain why this happens and which data clauses I should use to fix it?

Code:

      PROGRAM TEST
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
C
      PARAMETER (NAT=27,NRADPT=96,NLEBPT=1202,NGRIDPT=96*1202)
      PARAMETER (ONE=1.0D+00)
C
      DIMENSION
     > GTEMPA(NAT,NGRIDPT),
     > GTEMPB(NAT,NAT),
     > GTEMPC(NAT)
C
      DO IRAD=1,NRADPT
        DO IANG=1,NLEBPT
        IPTME=(IRAD-1)*NLEBPT+IANG
!$acc data region copyout(gtempa(1:nat,iptme))
!$acc region local(gtempb(1:nat,1:nat),gtempc(1:nat))
          DO IATM=1,NAT
            GTEMPC(IATM)=ONE
            DO JATM=1,NAT
              GTEMPB(JATM,IATM)=ONE
              IF(IATM .EQ. JATM) CYCLE
              GTEMPB(JATM,IATM)=DBLE(JATM+IATM)
            ENDDO
            GTEMPC(IATM)=PRODUCT(GTEMPB(1:NAT,IATM),1)
          ENDDO
          GTEMPA(1:NAT,IPTME)=GTEMPC(1:NAT)
!$acc end region
!$acc end data region
        ENDDO
      ENDDO
      WRITE(999,*) GTEMPA
      END

Code:

     15, Generating copyout(gtempa(:,iptme))
     16, Generating local(gtempb(:,:))
         Generating local(gtempc(:))
     17, Loop is parallelizable
         Accelerator kernel generated
         17, !$acc do parallel, vector(27) ! blockidx%x threadidx%x
             Using register for 'gtempc'
     19, Loop is parallelizable
     24, product reduction inlined
         Loop is parallelizable
     26, Loop is parallelizable
         Accelerator kernel generated
         26, !$acc do parallel, vector(27) ! blockidx%x threadidx%x
mkcolg



Joined: 30 Jun 2004
Posts: 4999
Location: The Portland Group Inc.

Posted: Wed Dec 14, 2011 10:25 am

Hi sslgamess,

What flags are you using? I'm able to recreate the issue but only when I use the "time" profiling sub-option (-ta=nvidia,time). It seems fine without the "time" sub-option.

I've sent a report to our engineers (TPR#18371) to investigate whether this is a memory leak in the profiling code, expected behavior, or something else.

Thanks,
Mat
sslgamess



Joined: 23 Nov 2009
Posts: 35

Posted: Wed Dec 14, 2011 10:11 pm

Hi Mat,

Yes, I am using the time flag.

Thanks for submitting the TPR.
Michael Wolfe



Joined: 19 Jan 2010
Posts: 36

Posted: Tue Feb 07, 2012 3:10 pm

We did indeed find a memory leak for programs using the -ta=nvidia,time option. It turned out the runtime was creating new cudaEvents, without recycling the old ones. That will be fixed in the 12.2 release.
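
To illustrate why that pattern shows up as linear growth, here is a minimal sketch in plain C. It is not the PGI runtime code and calls no CUDA API: the `event_stats` type and the three functions are hypothetical stand-ins, and the count of two timing events (start and stop) per region entry is an assumption made for the example. It only compares creating fresh events on every region entry against reusing one pair.

```c
#include <assert.h>

/* Hypothetical bookkeeping for timing events; illustrative only,
   not a PGI or CUDA data structure. */
typedef struct {
    long live;     /* events currently allocated */
    long created;  /* total events ever created */
} event_stats;

/* Leaky pattern: every region entry creates a new start/stop pair
   (assumed two events per entry) and never releases it. */
static void time_region_leaky(event_stats *s) {
    s->created += 2;
    s->live += 2;
}

/* Recycling pattern: create the pair once, reuse it afterwards. */
static void time_region_recycled(event_stats *s) {
    if (s->live == 0) {
        s->created += 2;
        s->live += 2;
    }
}

/* Enter a timed region n times and return the number of events
   still allocated at the end. */
long live_events_after(long n, int recycle) {
    event_stats s = {0, 0};
    for (long i = 0; i < n; i++) {
        if (recycle)
            time_region_recycled(&s);
        else
            time_region_leaky(&s);
    }
    assert(s.live <= s.created);
    return s.live;
}
```

The test program above enters its accelerator region 96*1202 = 115392 times, so under these assumptions `live_events_after(115392, 0)` ends with 230784 live events, while `live_events_after(115392, 1)` stays at 2 no matter how many times the region is entered.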

Thanks.
sslgamess



Joined: 23 Nov 2009
Posts: 35

Posted: Tue Feb 14, 2012 12:15 am

Thanks for the update, Michael.
All times are GMT - 7 Hours