PGI User Forum

matrix reduction using cuda fortran and GPU
Dolf



Joined: 22 Mar 2012
Posts: 105

Posted: Thu Dec 20, 2012 6:49 pm    Post subject: RE:

Hi Mat,

I'm not sure if I asked this before, but I'm seeing something strange that doesn't make sense to me.
I compiled a CUDA Fortran code on my machine, which has the PGI 12.3 compiler, CUDA Toolkit 4, and a GeForce GTX 460 installed. The code runs fine on my machine.
I also have another machine with a Tesla C1060 and CUDA Toolkit 5 installed. The first time I ran the same code there, it complained about cudart64_40_17.dll, so I copied that file over from my machine. Now it no longer complains, but it hangs and shows nothing.

What do you think the problem could be? What exactly do I need to have installed on the machine with the Tesla card in order to run the CUDA Fortran code?

Thanks,
Dolf
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Fri Dec 21, 2012 10:59 am

Hi Dolf,

What CPU types and Windows versions are in use on the two systems? I've seen this type of behaviour on Win7 systems using a Sandy Bridge (AVX-enabled) CPU. Win7 didn't begin supporting AVX until SP1.

Beyond that, I'm not sure.

- Mat
Dolf



Joined: 22 Mar 2012
Posts: 105

Posted: Fri Dec 21, 2012 11:46 am    Post subject: RE:

I think you have a good point.
My machine (the one with the PGI compiler) has the following specs:
1. Core i7-2600 processor
2. Windows 7 Ultimate SP1
3. CUDA Toolkit 4.0
4. GeForce GTX 460 v2 card

The other machine, where I want to run the code:
1. Core 2 Quad processor (Q9650)
2. Windows 7 Professional SP1
3. CUDA Toolkit 5.0
4. Tesla C1060 GPU card

So, could it be that the code I am compiling is for a different number of processors than the testing machine has? How can I fix that?
Is there a command in Fortran to check the number of processors and divide the task between them?

What command can I use in CMD to get the specs of the Tesla card? I tried pgaccelinfo and it did not work, since I don't have the PGI compiler installed there.

Thanks.
Dolf
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Fri Dec 21, 2012 1:59 pm

Quote:
So, could it be that the code I am compiling is for a different number of processors than the testing machine has? How can I fix that?
You can set the target processor flag (-tp) to the lowest common CPU (-tp penryn-64) or to a generic CPU (-tp px-64), or you can build a unified binary (-tp=sandybridge-64,penryn-64).

Quote:
Is there a command in Fortran to check the number of processors and divide the task between them?
For host code, there is the auto-parallelization flag (-Mconcur), which parallelizes loops that have no dependencies.
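
As a rough illustration of the kind of loop -Mconcur can handle, here is a minimal sketch. The file name, program name, and array size are made up for the example, and the build lines in the comments simply combine -Mconcur with the -tp values mentioned above. -Minfo=par is added on the assumption that you want the compiler to report which loops it parallelized.

```fortran
! saxpy_concur.f90 (hypothetical file name)
! Possible build lines, reusing the -tp values from this thread:
!   pgfortran -Mconcur -Minfo=par -tp px-64 saxpy_concur.f90
!   pgfortran -Mconcur -Minfo=par -tp=sandybridge-64,penryn-64 saxpy_concur.f90
program saxpy_concur
  implicit none
  integer, parameter :: n = 1000000   ! illustrative problem size
  real, allocatable :: x(:), y(:)
  real :: a
  integer :: i

  allocate(x(n), y(n))
  a = 2.0

  ! Initialize the inputs.
  do i = 1, n
     x(i) = real(i)
     y(i) = 1.0
  end do

  ! Each iteration reads x(i) and y(i) and writes only y(i), so there is
  ! no loop-carried dependence and the loop is a candidate for
  ! auto-parallelization.
  do i = 1, n
     y(i) = a * x(i) + y(i)
  end do

  print *, 'y(1) =', y(1), ' y(n) =', y(n)
  deallocate(x, y)
end program saxpy_concur
```

A loop such as y(i) = y(i-1) + x(i), by contrast, carries a dependence from one iteration to the next and would not be parallelized this way.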

Quote:
What command can I use in CMD to get the specs of the Tesla card? I tried pgaccelinfo and it did not work, since I don't have the PGI compiler installed there.
While I haven't used it on Windows, you may try NVIDIA's System Management Interface utility (nvidia-smi): https://developer.nvidia.com/nvidia-system-management-interface
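
Another option is a small CUDA Fortran program that queries the card through the runtime API. You would build it on the machine that has the compiler and copy the executable to the Tesla machine. This is only a sketch: the file and program names are made up, and it assumes the executable can find a matching CUDA runtime DLL and a working driver on the target machine, which is the same requirement your original program has.

```fortran
! devinfo.cuf (hypothetical file name)
! Possible build line on the machine with the PGI compiler:
!   pgfortran -Mcuda -tp px-64 devinfo.cuf -o devinfo.exe
program devinfo
  use cudafor
  implicit none
  type(cudaDeviceProp) :: prop
  integer :: istat, ndev, i

  ! Ask the runtime how many CUDA devices it can see.
  istat = cudaGetDeviceCount(ndev)
  if (istat /= cudaSuccess) then
     print *, 'cudaGetDeviceCount failed: ', trim(cudaGetErrorString(istat))
     stop
  end if
  print *, 'Number of CUDA devices: ', ndev

  ! Device numbering starts at 0.
  do i = 0, ndev - 1
     istat = cudaGetDeviceProperties(prop, i)
     if (istat /= cudaSuccess) cycle
     print *, 'Device ', i, ': ', trim(prop%name)
     print *, '  Compute capability: ', prop%major, '.', prop%minor
     print *, '  Multiprocessors:    ', prop%multiProcessorCount
     print *, '  Global memory (MB): ', prop%totalGlobalMem / (1024*1024)
  end do
end program devinfo
```

If this small program also hangs or fails on the Tesla machine, that would suggest the problem lies with the runtime or driver setup there rather than with your own kernels.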

- Mat
All times are GMT - 7 Hours