PGI User Forum

--ptxas-options=-v Equivalent for CUDA Fortran?
TheMatt



Joined: 06 Jul 2009
Posts: 317
Location: Greenbelt, MD

Posted: Wed Mar 31, 2010 10:17 am    Post subject: --ptxas-options=-v Equivalent for CUDA Fortran?

With my current CUDA Fortran program, I'm at the point now where counting registers, lmem, etc., has become important (though whether I can *do* anything with that info is up to my brain). To that end, I was wondering if there was an equivalent to "--ptxas-options=-v" for pgfortran?

At the moment, I'm compiling with -Mcuda=keepbin and reading the resulting .bin (cf. .cubin) file to see this, but I was wondering if there was a more...elegant way to get this data (a la the nvcc option).

Thanks,
Matt
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Wed Mar 31, 2010 11:42 am    Post subject:

Hi Matt,

The short answer is no. However, we have been discussing how to cleanly pass options to the back-end Nvidia tools such as the ptxas assembler. I've sent a note to Michael to see where his team is on this, but he's out of the office this week. I'll post a reply once I hear back from him.

Thanks,
Mat
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Wed May 19, 2010 9:20 am    Post subject:

Hi Matt,

FYI, as of 10.5, the output of "-Minfo=accel" will include the information from "--ptxas-options=-v".

For example:
Code:
mat_times_vec:
    194, Generating copyout(y(:,:,:,:))
         Generating copyin(x(:,:,:,:))
         Generating compute capability 1.3 binary
    195, Loop is parallelizable
    196, Loop is parallelizable
    197, Loop is parallelizable
    198, Loop is parallelizable
         Accelerator kernel generated
        195, !$acc do parallel
        196, !$acc do parallel, vector(4)
        197, !$acc do vector(4)
        198, !$acc do vector(16)
             CC 1.3 : 56 registers; 24 shared, 136 constant, 0 local memory bytes; 25% occupancy
    207, Loop is parallelizable
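
The 25% occupancy figure in the listing above can be checked by hand. The following is a rough sketch, not what the compiler actually computes. It assumes the three vector clauses combine into a 4 x 4 x 16 = 256-thread block, and it uses the published compute capability 1.3 limits per multiprocessor (16384 registers, 32 resident warps, 8 resident blocks). Register allocation granularity and the shared memory limit are ignored, and the function name `estimate_occupancy` is made up for illustration.

```python
# Rough occupancy estimate for a compute capability 1.3 device.
# Assumption: only registers and warp/block limits constrain residency.
CC13_REGS_PER_SM = 16384       # 32-bit registers per multiprocessor
CC13_MAX_WARPS_PER_SM = 32     # resident warps per multiprocessor
CC13_MAX_BLOCKS_PER_SM = 8     # resident blocks per multiprocessor
WARP_SIZE = 32


def estimate_occupancy(regs_per_thread, threads_per_block):
    """Return resident warps / maximum warps for one multiprocessor."""
    # Round the block up to whole warps.
    warps_per_block = -(-threads_per_block // WARP_SIZE)
    regs_per_block = regs_per_thread * warps_per_block * WARP_SIZE

    # How many blocks fit under each limit.
    if regs_per_block > 0:
        blocks_by_regs = CC13_REGS_PER_SM // regs_per_block
    else:
        blocks_by_regs = CC13_MAX_BLOCKS_PER_SM
    blocks_by_warps = CC13_MAX_WARPS_PER_SM // warps_per_block

    blocks = min(blocks_by_regs, blocks_by_warps, CC13_MAX_BLOCKS_PER_SM)
    return blocks * warps_per_block / CC13_MAX_WARPS_PER_SM


# 56 registers per thread, assumed 4 x 4 x 16 = 256 threads per block.
occupancy = estimate_occupancy(56, 4 * 4 * 16)
print(f"{occupancy:.0%}")
```

With 56 registers per thread, one 256-thread block needs 14336 registers, so only one block (8 warps) fits on a multiprocessor, and 8 of 32 warps is 25%.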


Thanks,
Mat
Tuan



Joined: 11 Jun 2009
Posts: 233

Posted: Wed Sep 01, 2010 5:58 pm    Post subject:

mkcolg wrote:
Hi Matt,

FYI, as of 10.5, the output of "-Minfo=accel" will include the information from "--ptxas-options=-v".


Hi Mat,
How can we get the same output (e.g. the number of registers per kernel, shared memory used per block...) in CUDA Fortran? Is this only available with the Accelerator model?

Thanks,
Tuan
TheMatt



Joined: 06 Jul 2009
Posts: 317
Location: Greenbelt, MD

Posted: Thu Sep 02, 2010 5:12 am    Post subject:

Tuan wrote:
Hi Mat,
How can we get the same output (e.g. the number of registers per kernel, shared memory used per block...) in CUDA Fortran? Is this only available with the Accelerator model?

You can add ptxinfo to your -Mcuda= option list (for example, -Mcuda=ptxinfo) to get a similar analysis:
Code:
ptxas info    : Compiling entry function 'irrad'
ptxas info    : Used 63 registers, 16012+0 bytes lmem, 36+16 bytes smem, 7536 bytes cmem[0], 468 bytes cmem[1]

(NB: I don't have access to a Fermi yet, so I don't know if it works with it.)
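
If the numbers need to be tracked across builds, the `ptxas info` lines can be pulled out of a saved compiler log. The following is a minimal sketch that assumes the log lines look exactly like the sample above. The function name `parse_ptxas_info` is hypothetical, and the two numbers on either side of `+` are kept as a pair without interpreting them.

```python
import re

# Sample line copied from the output above.
SAMPLE = ("ptxas info    : Used 63 registers, 16012+0 bytes lmem, "
          "36+16 bytes smem, 7536 bytes cmem[0], 468 bytes cmem[1]")

USED = re.compile(r"Used (?P<regs>\d+) registers")
CMEM = re.compile(r"(\d+) bytes cmem\[(\d+)\]")


def parse_ptxas_info(line):
    """Parse one 'ptxas info : Used ...' line into a dict, or return None."""
    m = USED.search(line)
    if m is None:
        return None  # not a resource-usage line

    usage = {"registers": int(m.group("regs"))}

    # lmem and smem are reported as 'A+B bytes'; keep both numbers as a tuple.
    for kind in ("lmem", "smem"):
        pair = re.search(rf"(\d+)\+(\d+) bytes {kind}", line)
        if pair:
            usage[kind] = (int(pair.group(1)), int(pair.group(2)))

    # Constant memory is reported per bank: 'N bytes cmem[bank]'.
    usage["cmem"] = {int(bank): int(size) for size, bank in CMEM.findall(line)}
    return usage


print(parse_ptxas_info(SAMPLE))
```

Lines that do not report resource usage, such as the `Compiling entry function` line, return `None` and can be skipped by the caller.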
All times are GMT - 7 Hours
Page 1 of 3

 