PGI User Forum

Problem with -ta=nvidia,time

 
PaulWoodhams



Joined: 27 Nov 2009
Posts: 2

Posted: Wed Feb 17, 2010 4:35 am    Post subject: Problem with -ta=nvidia,time

I'm trying to follow the examples in one of the PGI Insider articles (http://www.pgroup.com/lit/articles/insider/v1n1a1.htm) and have run into some problems when trying to get the time profiling information. The code runs as expected but no timing information is printed to the screen. I've tried both pgfortran and pgcc and have the same problem in both.

My compile lines are:

pgcc -o c2.exe c2.c -ta=nvidia,time -Minfo
pgfortran -o f2.exe f2.f90 -ta=nvidia,time -Minfo

I've checked that kvsetta.o exists in /opt/pgi/linux86-64/10.0/lib as well.

Any advice would be much appreciated.

Thanks,

Paul
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Wed Feb 17, 2010 7:48 pm

Hi Paul,

I'm not sure. Can you please post the full verbose output (i.e. add the "-v" flag) of your compilation of c2.c as well as the output of your c2.exe run? Also, please set "ACC_NOTIFY" to "1" in your environment before running c2.exe.
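
The request above could be carried out roughly as follows. This is only a sketch for Bourne-style shells (csh/tcsh users would write `setenv ACC_NOTIFY 1` instead); the log file name `build.log` is a hypothetical choice, and the build step is skipped when `pgcc` or `c2.c` is not present.

```shell
#!/bin/sh
# Ask the accelerator runtime to report each kernel launch.
# csh/tcsh equivalent: setenv ACC_NOTIFY 1
ACC_NOTIFY=1
export ACC_NOTIFY

# Rebuild with -v and keep the verbose output in build.log
# (build.log is a hypothetical file name, not something the thread specifies).
if command -v pgcc >/dev/null 2>&1 && [ -f c2.c ]; then
    pgcc -o c2.exe c2.c -v -ta=nvidia,time -Minfo 2>&1 | tee build.log
else
    echo "skipping build: pgcc or c2.c not found" >&2
fi

# Run the program if it has been built.
if [ -x ./c2.exe ]; then
    ./c2.exe
fi
```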

Thanks,
Mat
PaulWoodhams



Joined: 27 Nov 2009
Posts: 2

Posted: Thu Mar 11, 2010 3:22 am

Hi Mat,

Sorry for the delay. Here is the extra information you asked for.

1027> pgcc -o c2.exe c2.c -v -ta=nvidia,time -Minfo

/opt/pgi/linux86-64/10.0/bin/pgc c2.c -opt 2 -x 119 0xa10000 -x 122 0x40 -x 123 0x1000 -x 127 4 -x 127 17 -x 19 0x400000 -x 28 0x40000 -x 120 0x10000000 -x 70 0x8000 -x 122 1 -quad -x 59 4 -x 59 4 -tp nehalem-64 -x 120 0x1000 -astype 0 -stdinc /opt/pgi/linux86-64/10.0/include:/usr/local/include:/usr/lib/gcc/x86_64-redhat-linux/4.3.2/include:/usr/lib/gcc/x86_64-redhat-linux/4.3.2/include:/usr/include -def unix -def __unix -def __unix__ -def linux -def __linux -def __linux__ -def __NO_MATH_INLINES -def __x86_64__ -def __LONG_MAX__=9223372036854775807L -def '__SIZE_TYPE__=unsigned long int' -def '__PTRDIFF_TYPE__=long int' -def __THROW= -def __extension__= -def __amd64__ -def __SSE__ -def __MMX__ -def __SSE2__ -def __SSE3__ -def __SSSE3__ -predicate '#machine(x86_64) #lint(off) #system(posix) #cpu(x86_64)' -def _ACCEL=200905 -cmdline '+pgcc c2.c -o c2.exe -v -ta=nvidia,time -Minfo' -x 123 0x80000000 -x 123 4 -x 119 0x20 -alwaysinline /opt/pgi/linux86-64/10.0/lib/libintrinsics.il 4 -x 120 0x200000 -x 163 0x10001 -accel nvidia -x 163 128 -x 163 0x4000 -x 0 0x1000000 -x 2 0x100000 -x 0 0x2000000 -x 161 0xcff7 -x 162 0xcff7 -asm /tmp/pgcc7TuhXcZc_2W1.s
PGC-I-0222-Redundant definition for symbol __THROW (/usr/include/sys/cdefs.h: 63)
PGC-I-0222-Redundant definition for symbol __extension__ (/usr/include/sys/cdefs.h: 334)
executing /opt/pgi/linux86-64/10.0/bin/pgnvd /tmp/pgacc3UuhLh8KuweO.gpu -ptx /tmp/pgaccVUuhne9JCmhV.ptx -o /tmp/pgaccNUuh1LfxNXGI.bin -dp
main:
32, Generating copyin(a[0:n-1])
Generating copyout(r[0:n-1])
34, Loop is parallelizable
Accelerator kernel generated
34, #pragma acc for parallel, vector(256)
Using register for 'a'
PGC/x86-64 Linux 10.0-0: compilation completed with informational messages

/usr/bin/as /tmp/pgcc7TuhXcZc_2W1.s -o /tmp/pgcctTuh5yRIfnk3.o

/usr/bin/ld /usr/lib64/crt1.o /usr/lib64/crti.o /opt/pgi/linux86-64/10.0/lib/trace_init.o /usr/lib/gcc/x86_64-redhat-linux/4.3.2/crtbegin.o -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /opt/pgi/linux86-64/10.0/lib/pgi.ld -L/opt/pgi/linux86-64/10.0/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.3.2 /tmp/pgcctTuh5yRIfnk3.o -rpath /opt/pgi/linux86-64/10.0/lib -rpath /opt/pgi/linux86-64/10.0/cuda/lib -o c2.exe -lacc1 -ldl -lnspgc -lpgc -lm -lgcc -lc -lgcc /usr/lib/gcc/x86_64-redhat-linux/4.3.2/crtend.o /usr/lib64/crtn.o
Unlinking /tmp/pgcc7TuhXcZc_2W1.s
Unlinking /tmp/pgcctTuh5yRIfnk3.o
paulw@colenso@Thu 11 Mar 10:14:43 /fserver/paulw/GPU/pgi/part1

And when running the executable:

1032> setenv ACC_NOTIFY 1
paulw@colenso@Thu 11 Mar 10:14:43 /fserver/paulw/GPU/pgi/part1 1033> ./c2.exe
launch kernel file=/fserver/paulw/GPU/pgi/part1/./c2.c function=main line=34 device=0 grid=391 block=256
100000 iterations completed
1550 microseconds on GPU
2887 microseconds on host
paulw@colenso@Thu 11 Mar 10:14:43 /fserver/paulw/GPU/pgi/part1

Thanks,

Paul
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Thu Mar 11, 2010 11:04 am

Hi Paul,

I believe this was a short-lived problem in version 10.0-0 that was corrected in the 10.0-1 patch release. In your "-v" output, the "kvsetta.o" object is missing from the /usr/bin/ld link line. To work around the issue, please manually add /opt/pgi/linux86-64/10.0/lib/kvsetta.o to your link line.
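
As a rough illustration of the workaround, the original compile line can be given the object file as an extra argument so that the driver passes it through to the linker. This is a sketch that assumes the 10.0 install path quoted in this thread; the variable names `KVSETTA` and `BUILD` are hypothetical, and the build only runs where the compiler, source, and object are actually present.

```shell
#!/bin/sh
# Object path as quoted in the thread (PGI 10.0, linux86-64); adjust for your install.
KVSETTA=/opt/pgi/linux86-64/10.0/lib/kvsetta.o

# Same compile line as in the first post, with the object appended
# so that it ends up on the link line.
BUILD="pgcc -o c2.exe c2.c -ta=nvidia,time -Minfo $KVSETTA"
echo "$BUILD"

# Only attempt the build when everything it needs is available.
if command -v pgcc >/dev/null 2>&1 && [ -f c2.c ] && [ -f "$KVSETTA" ]; then
    $BUILD
else
    echo "skipping build: pgcc, c2.c, or kvsetta.o not found" >&2
fi
```

The same extra argument should apply to the pgfortran line for f2.f90.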

Note that there were a few major issues with the 10.0 release, so I would suggest upgrading to the latest version.

Hope this helps,
Mat
All times are GMT - 7 Hours