Jony
Joined: 05 Feb 2010 Posts: 3
|
Posted: Sun Feb 07, 2010 3:13 am Post subject: Problem with CUDA fortran simple program |
|
|
Hi, I am a beginner with CUDA Fortran and I am testing the following program. I compile the code with pgf95 -ta=nvidia sumAB.cuf; it runs but gives the wrong results. Any suggestions? Thanks,
!----------------module for sumAB--------------------------
module m_sumAB
use cudafor
contains
!-------------kernel subroutine-----------------
attributes(global) subroutine k_sumAB(n,A,B,C)
integer :: i
integer, value :: n
real, dimension (n) :: A,B,C
i=(blockidx%x-1)*blockdim%x+threadidx%x
if (i<=n) C(i)=A(i)+B(i)
end subroutine k_sumAB
!-------------host subroutine--------------------
subroutine h_sumAB(n,bdim,A,B,C)
implicit none
integer :: n,bdim
real, dimension (n) :: A,B,C
real, device, dimension (n) :: Adev,Bdev,Cdev
Adev=A
Bdev=B
call k_sumAB<<<n/bdim, bdim>>>(n,Adev,Bdev,Cdev)
C=Cdev
end subroutine h_sumAB
end module m_sumAB
!---------------------------end module----------------------
program sumAB
!----------------------------------------------------
!
!purpose: sum two vectors A and B of n elements
!
!----------------------------------------------------
use m_sumAB
integer i
integer, parameter :: n=1000
integer :: bdim=100
real :: times,timef,sum
real, dimension (n) :: A,B,C,D
!-----------------end declaration variable-----------
!Initialization of arrays
A=1.2
B=2.2
C=0.
D=0.
!CPU calculation
call cpu_time(times)
do i=1,n
D(i)=A(i)+B(i)
end do
call cpu_time(timef)
print *,'CPU time required is: ',timef-times,' seconds'
!GPU calculation
call cpu_time(times)
call h_sumAB(n,bdim,A,B,C)
call cpu_time(timef)
print *,'GPU time required is: ',timef-times,' seconds'
!diff between results
sum=0.
do i=1,n
sum=sum+C(i)-D(i)
end do
print *,'Difference between results is: ',sum,C(1),D(1)
pause
end program sumAB |
|
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
|
Posted: Mon Feb 08, 2010 12:07 pm Post subject: |
|
|
Hi Jony,
I'm not sure. The program seems to get correct answers when I run it.
| Code: | % pgf95 sumAB.cuf -o sumAB.out
% sumAB.out
CPU time required is: 7.1525574E-06 seconds
GPU time required is: 8.8712931E-02 seconds
Difference between results is: 0.000000 3.400000 3.400000
FORTRAN PAUSE: enter <return> or <ctrl>d to continue>
|
(Note that "-ta=nvidia" is for the directive-based PGI Accelerator model, so it has no effect on your CUDA Fortran code.)
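For contrast, here is a minimal sketch of what the directive-based model looks like, assuming the `!$acc region` syntax of the PGI Accelerator model. The program name `sumAB_acc` and the build line in the comment are made up for illustration; this is not code from the original post.

```fortran
! Sketch only: the same vector sum written for the directive-based
! PGI Accelerator model, which is what -ta=nvidia enables.
! Assumed build line: pgf95 -ta=nvidia sumAB_acc.f90 -o sumAB_acc.out
! (sumAB_acc is a hypothetical file/program name).
program sumAB_acc
  implicit none
  integer, parameter :: n = 1000
  real, dimension(n) :: A, B, C
  integer :: i

  A = 1.2
  B = 2.2

  ! With -ta=nvidia the compiler may offload this region to the GPU;
  ! without the flag the directives are plain comments and the loop
  ! simply runs on the host.
!$acc region
  do i = 1, n
     C(i) = A(i) + B(i)
  end do
!$acc end region

  print *, 'C(1) = ', C(1)
end program sumAB_acc
```

There are no `attributes(global)` kernels or `<<<...>>>` launches here, which is why the flag does nothing for a .cuf program that already contains explicit kernels.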
Can you please post more information, including a sample of the output, which compiler version you're using, and which GPU you have?
Thanks,
Mat |
|
Jony
Joined: 05 Feb 2010 Posts: 3
|
Posted: Tue Feb 09, 2010 3:03 am Post subject: |
|
|
Hi Mat, thanks a lot for replying. I get the following output:
| Code: | % pgf95 sumAB.cuf -o sumAB.out
% sumAB.out
CPU time required is: 0.000000 seconds
GPU time required is: 0.2650000 seconds
Difference between results is: NaN -4.2451527E+37 3.400000
FORTRAN PAUSE: continuing... |
I have downloaded and installed the complete PGI Workstation package, release 10.2, 32-bit for Windows. I have Windows XP and my processor is a Centrino dual core. For the GPU information, I ran the "cufinfo" program provided by PGI and got the following output:
| Code: | Device Number: 0
Device Name: GeForce 9200M GE
Total Global Memory: 0.268 Gbytes
sharedMemPerBlock: 16384 bytes
regsPerBlock: 8192
warpSize: 32
maxThreadsPerBlock: 512
maxThreadsDim: 512 x 512 x 64
maxGridSize: 65535 x 65535 x 1
ClockRate: 1.300 GHz
Total Const Memory: 65536 bytes
Compute Capability Revision: 1.1
TextureAlignment: 256 bytes
deviceOverlap: F
multiProcessorCount: 1
integrated: F
canMapHostMemory: F |
|
|
mkcolg
Joined: 30 Jun 2004 Posts: 4996 Location: The Portland Group Inc.
|
Posted: Tue Feb 09, 2010 9:42 am Post subject: |
|
|
Hi Jony,
Try using the flag "-Mcuda=cc11" to tell the compiler that your device has compute capability 1.1. By default the compiler targets cc 1.3. If that works, create a "$PGI/win32/10.x/bin/sitenvrc" file (replace 'x' with the actual release number) containing the following line to make cc 1.1 the default.
| Quote: | set COMPUTECAP=1.1; |
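A mismatch like this can be spotted from inside the program. The sketch below is only an illustration: it assumes your release's `cudafor` module provides `cudaGetDeviceProperties`, `cudaGetLastError`, and `cudaGetErrorString`, and the names `checkcc`, `m_check`, and `k_fill` are hypothetical. It prints the device's compute capability and checks the launch status instead of copying back results that were never written.

```fortran
! Sketch only: report the device's compute capability and check
! whether a kernel launch succeeded.
! Assumed build line: pgf95 -Mcuda=cc11 checkcc.cuf -o checkcc.out
! (checkcc, m_check and k_fill are hypothetical names).
module m_check
  use cudafor
contains
  attributes(global) subroutine k_fill(n, x)
    integer, value :: n
    real, dimension(n) :: x
    integer :: i
    i = (blockidx%x-1)*blockdim%x + threadidx%x
    if (i <= n) x(i) = 1.0
  end subroutine k_fill
end module m_check

program checkcc
  use m_check
  implicit none
  integer, parameter :: n = 1000, bdim = 100
  type(cudaDeviceProp) :: prop
  real, device, dimension(n) :: xdev
  real, dimension(n) :: x
  integer :: istat

  ! Query device 0 and print its compute capability (major.minor).
  istat = cudaGetDeviceProperties(prop, 0)
  print *, 'Compute capability: ', prop%major, '.', prop%minor

  ! The grid size is rounded up so n need not be a multiple of bdim.
  call k_fill<<<(n + bdim - 1)/bdim, bdim>>>(n, xdev)

  ! A kernel built for a newer compute capability than the device
  ! supports is expected to fail at launch; the error is reported here.
  istat = cudaGetLastError()
  if (istat /= cudaSuccess) then
     print *, 'Kernel launch failed: ', trim(cudaGetErrorString(istat))
  else
     x = xdev
     print *, 'x(1) = ', x(1)
  end if
end program checkcc
```

If the launch status is not checked, the device array is copied back unmodified, which would explain garbage values such as the NaN reported above.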
- Mat |
|
Jony
Joined: 05 Feb 2010 Posts: 3
|
Posted: Thu Feb 11, 2010 5:29 am Post subject: |
|
|
Thanks a lot Mat, that was the problem! Now it works fine :-)
Jony |
|