PGI User Forum

reduction operation
ks-fujii



Joined: 14 Jul 2009
Posts: 2

Posted: Wed Jul 15, 2009 2:15 am    Post subject: reduction operation

Hi,

Is there a way to generate accelerator kernels when a program contains a reduction operation?
I couldn't get it to work in a case like the one below.

!$acc region
do i = 1, n
   s = s + a(i)
end do
!$acc end region

%>pgfortran -ta=nvidia -Minfo test.f
27, No parallel kernels found, accelerator region ignored
28, Loop carried scalar dependence for s

Regards
--
ks-fujii
TheMatt



Joined: 06 Jul 2009
Posts: 317
Location: Greenbelt, MD

Posted: Wed Jul 15, 2009 6:03 am

I don't think the current accelerator directives support reductions. Your best bet might be CUDPP, which does provide a reduction, though I'm not sure CUDPP has a Fortran interface.
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Wed Jul 15, 2009 8:01 am

Hi ks-fujii,

We should have support for reductions by the November release. In the meantime, I would suggest creating a summary array to hold the intermediate results and then performing the reduction on the host.

For example:
Code:

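! Compute each element's contribution on the accelerator.
!$acc region
do i = 1, n
   sarr(i) = a(i) * b(i) + c(i)
end do
!$acc end region

! Sum the contributions on the host.
do i = 1, n
   s = s + sarr(i)
end do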


Hope this helps,
Mat
Tuan



Joined: 11 Jun 2009
Posts: 233

Posted: Thu Jun 10, 2010 1:11 pm

mkcolg wrote:
Hi ks-fujii,

We should have support for reductions by the November release. In the meantime, I would suggest creating a summary array to hold the intermediate results and then performing the reduction on the host.

Mat


Hi Mat,
Could you please tell me how to do a reduction with the current APM model? I searched through the manual but haven't found any description.
Also, does CUDA Fortran provide any similar function for performing a reduction?

Thanks,
Tuan
mkcolg



Joined: 30 Jun 2004
Posts: 6134
Location: The Portland Group Inc.

Posted: Thu Jun 10, 2010 3:55 pm

Hi Tuan,

The PGI Accelerator Model will automatically recognize reduction operations such as the one shown above and generate device code for them. Unless your reduction is very complex, just write it in natural Fortran.
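
As a minimal sketch of what "natural Fortran" means here, the routine below wraps an ordinary sum loop in an accelerator region. The module and function names are made up for illustration, and free-form source is assumed. A compiler that does not recognize the `!$acc` lines treats them as comments and simply runs the loop on the host.

```fortran
module acc_sum_sketch
  implicit none
contains
  ! Sum the first n elements of a. The loop is ordinary Fortran; the
  ! !$acc lines are directives to a PGI Accelerator compiler and plain
  ! comments to any other compiler.
  function acc_sum(a, n) result(s)
    integer, intent(in) :: n
    real, intent(in) :: a(n)
    real :: s
    integer :: i

    s = 0.0          ! initialize the accumulator before the region
!$acc region
    do i = 1, n
      s = s + a(i)   ! scalar sum reduction
    end do
!$acc end region
  end function acc_sum
end module acc_sum_sketch
```

Calling `acc_sum(a, n)` from a program that has `use acc_sum_sketch` returns the sum. Compiling with `-Minfo`, as in the first post, should show whether a kernel was generated for the loop.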

Writing reductions in CUDA Fortran, however, is a much more complex task. Actually, writing them isn't that hard; writing them so they perform well is. Take a look at my article on writing a Monte Carlo simulation (http://www.pgroup.com/lit/articles/insider/v2n1a4.htm). While I don't go into much depth on reductions, I do give a brief summary of how they work.
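
For a rough picture of the usual pattern, here is a host-side sketch in plain Fortran (not CUDA Fortran) of the pairwise, tree-style reduction that GPU reductions are commonly built on. It is not taken from the linked article; the routine name and the zero-padding to a power of two are assumptions made for illustration. On a device, the additions in each pass of the inner loop would be done by parallel threads, with a synchronization between passes.

```fortran
module tree_reduce_sketch
  implicit none
contains
  ! Pairwise (tree) sum of the first n elements of a.
  function tree_sum(a, n) result(s)
    integer, intent(in) :: n
    real, intent(in) :: a(n)
    real :: s
    real, allocatable :: work(:)
    integer :: m, stride, i

    ! Round the length up to a power of two (assumption: pad with zeros).
    m = 1
    do while (m < n)
      m = 2 * m
    end do
    allocate(work(m))
    work = 0.0
    work(1:n) = a(1:n)

    ! Each pass halves the number of active elements; the additions
    ! within one pass are independent of one another.
    stride = m / 2
    do while (stride >= 1)
      do i = 1, stride
        work(i) = work(i) + work(i + stride)
      end do
      stride = stride / 2
    end do

    s = work(1)
  end function tree_sum
end module tree_reduce_sketch
```

The hard part on real hardware is everything this sketch leaves out: mapping passes to thread blocks, using shared memory, and combining the per-block partial sums.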

- Mat
All times are GMT - 7 Hours