PGI User Forum

PGF90-F-0155-Compiler failed to translate accelerator region
 
wiersma



Joined: 16 May 2013
Posts: 29

Posted: Wed Dec 04, 2013 9:27 am    Post subject: PGF90-F-0155-Compiler failed to translate accelerator region

Hi all,

I'm in the process of inlining a lot of code in the hope that I can accelerate it effectively. On my first attempt, though, I get this error:
Code:

Stack dump:
0.      Running pass 'NVPTX DAG->DAG Pattern Instruction Selection' on function '@inner_iteration_acc_861_gpu'
pgnvd-Fatal-/opt/pgi/linux86-64/2013/cuda/5.0/nvvm/cicc TERMINATED by signal 11
Arguments to /opt/pgi/linux86-64/2013/cuda/5.0/nvvm/cicc
/opt/pgi/linux86-64/2013/cuda/5.0/nvvm/cicc -arch compute_20 -m64 -ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 /tmp/pgnvdCqWdu1rbSc9z.i -o /tmp/pgcudaforCQUduIPB8M83.ptx
PGF90-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (Pij_GPU_Acc.f90: 1)
PGF90/x86-64 Linux 13.10-0: compilation aborted


The -Minfo messages are pretty straightforward: some code is accelerated, some can't be, and the usual live-out, loop-carried dependence, etc. messages are generated. I have never seen the above message, though. Any ideas as to what is likely causing this?

Thanks,
Rob
AROM



Joined: 03 Apr 2013
Posts: 39

Posted: Wed Dec 04, 2013 9:58 am

Hi Rob,

It seems I have seen this issue before. Try switching to CUDA 5.5.

Alexey
mkcolg



Joined: 30 Jun 2004
Posts: 6218
Location: The Portland Group Inc.

Posted: Wed Dec 04, 2013 1:35 pm

Hi Rob,

This is an error in the backend CUDA 5.0 compiler. As Alexey suggests, try using CUDA 5.5 (-ta=nvidia,cuda5.5) to see if it has been fixed.

If not, please send PGI Customer Service (trs@pgroup.com) a reproducing example so we can report it to the CUDA team and possibly find you a workaround.

On a side note, 14.1 will have initial support for the OpenACC 2.0 "routine" directive which will allow you to make routine calls instead of having to inline everything.
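
For readers unfamiliar with the "routine" directive mentioned above, here is a minimal sketch of how it is typically used in Fortran. It is an illustration only, not code from this thread: the module, function, and variable names (scale_mod, scale_add, v, w) are hypothetical, and the program checks its own result on the host.

```fortran
! Hypothetical example: names below are illustrative, not from the thread.
module scale_mod
  implicit none
contains
  function scale_add(x, a) result(y)
    ! Ask the compiler to also build a sequential device version of this
    ! function so it can be called from inside an accelerator region.
    !$acc routine seq
    real, intent(in) :: x, a
    real :: y
    y = a * x + 1.0
  end function scale_add
end module scale_mod

program routine_demo
  use scale_mod
  implicit none
  integer, parameter :: n = 1000
  real :: v(n), w(n)
  integer :: i

  do i = 1, n
    v(i) = real(i)
  end do

  ! The call to scale_add stays a real call; no manual inlining is needed.
  !$acc parallel loop copyin(v) copyout(w)
  do i = 1, n
    w(i) = scale_add(v(i), 2.0)
  end do
  !$acc end parallel loop

  ! Host-side check of the result.
  if (any(abs(w - (2.0 * v + 1.0)) > 1.0e-4)) error stop "mismatch"
  print *, "ok"
end program routine_demo
```

Without OpenACC enabled the directives are treated as comments and the program runs serially, which makes it easy to compare host and device results.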

Best Regards,
Mat
wiersma



Joined: 16 May 2013
Posts: 29

Posted: Wed Dec 04, 2013 2:13 pm

Hi Mat,

The error persists when I use CUDA 5.5. I'll see if I can make up an example, but it may take some time to strip it down.

While I have your ear, can I ask a completely unrelated question? Sometimes I have loosely nested loops:

Code:

do i = 1, N


   !some code


   foo = 0
   do j = x1, x2
       foo = foo + bar(j)
   enddo
enddo


Now what I've tried to do to accelerate it is:
Code:

!$acc kernels
do i = 1, N   ! line 10


   !some code


   foo = 0
!$acc loop reduction(+:foo)
   do j = x1, x2    !line 30
       foo = foo + bar(j)
   enddo
enddo
!$acc end kernels


When I compile with -Minfo=accel, I get something like this:

Code:

    10, Loop is parallelizable
         Accelerator kernel generated
        10, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
    30, Loop is parallelizable


But no kernel is generated at line 30. Does it just decide that while it may be parallelizable, it's better to only work with the outer loop? Or is there some scheduling that I have to do?

Thanks,
Rob
mkcolg



Joined: 30 Jun 2004
Posts: 6218
Location: The Portland Group Inc.

Posted: Wed Dec 04, 2013 4:06 pm

Hi Rob,

The default for "kernels" is to work on tightly nested loops, so you need to add a few more loop schedule clauses. This might be a case where "parallel" is more fitting, though, since its default is for non-tightly nested loops like this one. Give this schedule a try:

Code:
!$acc parallel loop gang
do i = 1, N   ! line 10

   !some code

   foo = 0
!$acc loop vector reduction(+:foo)
   do j = x1, x2    !line 30
       foo = foo + bar(j)
   enddo
enddo
 

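The schedule above can be tried in a self-contained form. This is a sketch under stated assumptions, not the original code: the array sizes, the contents of bar, the way x1 and x2 are computed, and the total array are all made up for illustration. The explicit private clause is a precaution added here and does not appear in the snippet above.

```fortran
! Hypothetical test harness for the gang/vector schedule.
! A possible PGI build line (flags as used in this thread):
!   pgfortran -acc -Minfo=accel gang_vector_reduction.f90
program gang_vector_reduction
  implicit none
  integer, parameter :: n = 64, m = 256
  real :: bar(m), total(n), foo, expected
  integer :: i, j, x1, x2

  do j = 1, m
    bar(j) = 1.0
  end do

  ! Outer loop: one iteration per gang. foo, x1 and x2 are made private
  ! explicitly (an assumption added for safety, not in the original post).
  !$acc parallel loop gang private(foo, x1, x2) copyin(bar) copyout(total)
  do i = 1, n
    ! Stand-in for "some code": inner bounds that depend on i.
    x1 = 1
    x2 = m - i + 1

    foo = 0.0
    ! Inner loop: vector lanes cooperate on the sum for this i.
    !$acc loop vector reduction(+:foo)
    do j = x1, x2
      foo = foo + bar(j)
    end do
    total(i) = foo
  end do
  !$acc end parallel loop

  ! Host-side check: bar is all ones, so each sum equals x2 - x1 + 1.
  do i = 1, n
    expected = real(m - i + 1)
    if (abs(total(i) - expected) > 1.0e-3) error stop "mismatch"
  end do
  print *, "ok"
end program gang_vector_reduction
```

With this schedule, the -Minfo=accel output would be expected to report a gang schedule at the outer loop and a vector schedule with a reduction at the inner loop, though the exact wording depends on the compiler version.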

- Mat
All times are GMT - 7 Hours
Page 1 of 2

 