PGI User Forum


pgprof: FileError.File 'pgprof.out'

 
ManuelICVT



Joined: 14 May 2013
Posts: 2

Posted: Mon Jun 10, 2013 10:07 am    Post subject: pgprof: FileError.File 'pgprof.out'

Hello,

I'm trying to profile an OpenACC-accelerated code and get the following error when opening the profile with pgprof:

Code:
>> pgprof: FileError.File 'pgprof.out', Line 147, 'p' token expected <<


If I compile and run the code with acc disabled, it works fine.

Here are my compile and run commands:

Code:
>> pgcc -o program -acc -ta=nvidia:cc35,time -Minfo=ccff *.c -lm -L/usr/lib64/nvidia
>> pgcollect -time program
>> pgprof -exe program


I get no compiler errors, and pgcollect exits without error.

Could you please help me to find the error? Thanks.


Here is the written pgprof.out:

Code:
PROF NODALL 0 program 1370883638 1370883630
h XXX.icvt.uni-stuttgart.de 30063 0 1 0.000000
I 6 GenuineIntel nehalem-64 linux86-64 6 2 3192
t 2 479
E 1 Seconds
<accelinfo>
  <head>
      <item>
        <label>CUDA Driver Version</label>
        <value>5050</value>
      </item>
      <item>
        <label>NVRM version</label>
        <value>NVIDIA UNIX x86_64 Kernel Module  319.23  Thu May 16 19:36:02 PDT 2013</value>
      </item>
  </head>
  <body>
    <device>
      <item>
        <label>CUDA Device Number</label>
        <value>0</value>
      </item>
      <item>
        <label>Device Name</label>
        <value>GeForce GTX TITAN</value>
      </item>
      <item>
        <label>Device Revision Number</label>
        <value>3.5</value>
      </item>
      <item>
        <label>Global Memory Size</label>
        <value>6441730048</value>
      </item>
      <item>
        <label>Number of Multiprocessors</label>
        <value>14</value>
      </item>
      <item>
        <label>Number of SP Cores</label>
        <value>2688</value>
      </item>
      <item>
        <label>Number of DP Cores</label>
        <value>896</value>
      </item>
      <item>
        <label>Concurrent Copy and Execution</label>
        <value>Yes</value>
      </item>
      <item>
        <label>Total Constant Memory</label>
        <value>65536</value>
      </item>
      <item>
        <label>Total Shared Memory per Block</label>
        <value>49152</value>
      </item>
      <item>
        <label>Registers per Block</label>
        <value>65536</value>
      </item>
      <item>
        <label>Warp Size</label>
        <value>32</value>
      </item>
      <item>
        <label>Maximum Threads per Block</label>
        <value>1024</value>
      </item>
      <item>
        <label>Maximum Block Dimensions</label>
        <value>1024, 1024, 64</value>
      </item>
      <item>
        <label>Maximum Grid Dimensions</label>
        <value>2147483647 x 65535 x 65535</value>
      </item>
      <item>
        <label>Maximum Memory Pitch</label>
        <value>2147483647B</value>
      </item>
      <item>
        <label>Texture Alignment</label>
        <value>512B</value>
      </item>
      <item>
        <label>Clock Rate</label>
        <value>875 MHz</value>
      </item>
      <item>
        <label>Execution Timeout</label>
        <value>Yes</value>
      </item>
      <item>
        <label>Integrated Device</label>
        <value>No</value>
      </item>
      <item>
        <label>Can Map Host Memory</label>
        <value>Yes</value>
      </item>
      <item>
        <label>Compute Mode</label>
        <value>default</value>
      </item>
      <item>
        <label>Concurrent Kernels</label>
        <value>Yes</value>
      </item>
      <item>
        <label>ECC Enabled</label>
        <value>No</value>
      </item>
      <item>
        <label>Memory Clock Rate</label>
        <value>3004 MHz</value>
      </item>
      <item>
        <label>Memory Bus Width</label>
        <value>384 bits</value>
      </item>
      <item>
        <label>L2 Cache Size</label>
        <value>1572864 bytes</value>
      </item>
      <item>
        <label>Max Threads Per SMP</label>
        <value>2048</value>
      </item>
      <item>
        <label>Async Engines</label>
        <value>1</value>
      </item>
      <item>
        <label>Unified Addressing</label>
        <value>Yes</value>
      </item>
      <item>
        <label>PGI Compiler Option</label>
        <value>-ta=nvidia,cc35</value>
      </item>
    </device>
  </body>
</accelinfo>
->LINE 147
Code:

p 0
<accelperf>
<hostname>XXX.icvt.uni-stuttgart.de</hostname>
<pid>30063</pid>
<descriptors>
<desc tag="1">
<type>int</type>
<primary_metric>true</primary_metric>
<event_name>Region Time</event_name>
<units>microseconds</units>
</desc>
<desc tag="2">
<type>int</type>
<primary_metric>false</primary_metric>
<event_name>Region Elapsed Time</event_name>
<units>microseconds</units>
</desc>
<desc tag="3">
<type>int</type>
<primary_metric>true</primary_metric>
<event_name>Kernel Device Time</event_name>
<units>microseconds</units>
</desc>
<desc tag="4">
<type>int</type>
<primary_metric>false</primary_metric>
<event_name>Kernel Elapsed Time</event_name>
<units>microseconds</units>
</desc>
<desc tag="5">
<type>int</type>
<primary_metric>false</primary_metric>
<event_name>Data Transfer Time</event_name>
<units>microseconds</units>
</desc>
<desc tag="6">
<type>int</type>
<primary_metric>false</primary_metric>
<event_name>Copyin Time</event_name>
<units>microseconds</units>
</desc>
<desc tag="7">
<type>int</type>
<primary_metric>false</primary_metric>
<event_name>Copyout Time</event_name>
<units>microseconds</units>
</desc>
<desc tag="8">
<type>int</type>
<primary_metric>false</primary_metric>
<event_name>Wait Time</event_name>
<units>microseconds</units>
</desc>
<desc tag="9">
<type>string</type>
<primary_metric>false</primary_metric>
<event_name>Block Size</event_name>
</desc>
<desc tag="10">
<type>string</type>
<primary_metric>false</primary_metric>
<event_name>Grid Size</event_name>
</desc>
</descriptors>
...
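
When a parser reports an error at a specific line, as pgprof does here with "Line 147", it can help to print that line with a few neighbors and their line numbers. The following is only a minimal sketch: `show_context` is a hypothetical helper name, not part of any PGI tool, and it assumes the profile is a plain-text file named pgprof.out in the current directory.

```shell
#!/bin/sh
# show_context FILE LINE [RADIUS]
# Hypothetical helper: prints LINE and RADIUS lines on either side of it,
# each prefixed with its line number. RADIUS defaults to 2.
show_context() {
    file=$1
    line=$2
    radius=${3:-2}
    start=$((line - radius))
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    end=$((line + radius))
    # Print only the requested window and stop reading once past it.
    awk -v s="$start" -v e="$end" \
        'NR >= s && NR <= e { printf "%6d  %s\n", NR, $0 } NR > e { exit }' \
        "$file"
}

# Inspect the region named in the error message, if the file is present.
if [ -f pgprof.out ]; then
    show_context pgprof.out 147
fi
```

This only shows what the file contains at the reported position. It does not say why pgprof rejects it.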
mkcolg



Joined: 30 Jun 2004
Posts: 6129
Location: The Portland Group Inc.

Posted: Tue Jun 11, 2013 2:16 pm

Hi ManuelICVT,

You're just a bit too early. pgcollect was just updated to generate an XML profile for use with pgprof. However, pgprof won't officially support viewing this profile information until next month's 13.7 release.

In the meantime, please set the environment variable "PGI_ACC_TIME=1" instead of using pgcollect to view OpenACC performance profiling information.

Thanks,
Mat
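
A minimal sketch of the workaround described above, assuming the binary is named `program` as in the original post and sits in the current directory. `run_with_acc_time` is a hypothetical wrapper name used here for illustration. It sets PGI_ACC_TIME only for the one command it launches, so the rest of the shell session is unaffected.

```shell
#!/bin/sh
# run_with_acc_time COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND with PGI_ACC_TIME=1 in its environment.
# With this variable set, the PGI OpenACC runtime is expected to print an
# accelerator timing summary when the program finishes.
run_with_acc_time() {
    env PGI_ACC_TIME=1 "$@"
}

# Run the OpenACC build directly, without pgcollect, if it is present.
if [ -x ./program ]; then
    run_with_acc_time ./program
fi
```

Setting the variable with `export PGI_ACC_TIME=1` before running the program has the same effect, but it stays set for every later command in that shell.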
ManuelICVT



Joined: 14 May 2013
Posts: 2

Posted: Tue Jun 11, 2013 11:52 pm

Thanks Mat for the information.
All times are GMT - 7 Hours