PGI User Forum


CPU parallel and accelerator regions in the same program
njustn



Joined: 09 Nov 2011
Posts: 22

Posted: Mon Nov 14, 2011 2:20 pm

I'm going to assume that the protracted silence means nobody is interested in taking a look while we're all in Seattle. Is there anything I can provide that would help with looking into this? A statically linked binary, for example, or anything like that?
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Fri Nov 18, 2011 11:57 am

Hi njustn,

As we talked about at SC11, the next step will be for us to get Chaos Linux installed here to see if we can recreate the problem.

- Mat
njustn




Posted: Tue Jul 17, 2012 9:04 pm

Hi,

On a whim, I decided to try this test again on a system I've been using for some months now. I've been using it with applications that contain both accelerator regions and regular parallel regions, and I was actually trying to track down a different issue: a segfault whenever an "acc region" is nested inside a parallel region without a data region as a buffer between them (I'll post another thread about that if I can reduce the problem). To my utter bewilderment, this test fails *exactly* the same way on this system, running Ubuntu Lucid and the PGI 12.5 compiler, as it did on the original CHAOS-based system. I thought it would be a good starting point for reducing the other problem, but the tiny example program copied below still gives me exactly the same error, even though I have other programs that work just fine using both.

compiled with:

Code:

pgcc -mp=allcores  -O3 -fast -Minfo=accel,mp  -DPGI -I/opt/pgi/linux86-64/2012/cuda/4.1/include -I/opt/pgi/linux86-64/2012/include_acc -ta=nvidia,keepgpu,keepptx,nofma -c99 -L/opt/pgi/linux86-64/2012/cuda/4.1/lib64 -lcuda -lcudart -lm -ldl -lcolamd /usr/lib/liblpsolve55.a   -o test test.c


test.c:
Code:

#include <stdio.h>  /* printf */
#include <omp.h>    /* omp_get_thread_limit, omp_get_thread_num */

#define SIZE 100
int main(int argc, char * argv[])
{
    int stuff[SIZE];
    int limit = omp_get_thread_limit();
    printf("limit:%d\n", limit);
#pragma omp parallel shared(stuff)
    {
        int tid = omp_get_thread_num();
        printf("thread_id:%d\n", tid);
    }
    int i;
#pragma acc region for copy(stuff)
    for(i = 0; i<SIZE; i++)
    {
        stuff[i] = 1;
    }
    return 0;
}


After running into that again, I tried a few things to see what makes it work in my other applications. It appears that the function containing the acc region has to be in a different C file. I have no clue whatsoever why, but while the above fails, this version works.

compile with:
Code:
pgcc -mp=allcores  -O3 -fast -Minfo=accel,mp  -DPGI -I/opt/pgi/linux86-64/2012/cuda/4.1/include -I/opt/pgi/linux86-64/2012/include_acc -ta=nvidia,keepgpu,keepptx,nofma -c99 -L/opt/pgi/linux86-64/2012/cuda/4.1/lib64 -lcuda -lcudart -lm -ldl -lcolamd /usr/lib/liblpsolve55.a   -o test test.c test2.c


test.c:
Code:

#include <stdio.h>  /* printf */
#include <omp.h>    /* omp_get_thread_limit, omp_get_thread_num */

#define SIZE 100
void arbitraryFunc(int stuff[SIZE]);
int main(int argc, char * argv[])
{
    int stuff[SIZE] = {0};
    int limit = omp_get_thread_limit();
    printf("limit:%d\n", limit);
#pragma omp parallel
    {
        int tid = omp_get_thread_num();
        printf("thread_id:%d\n", tid);
    }
    arbitraryFunc(stuff);
    return stuff[0];
}


test2.c:
Code:

#define SIZE 100
void arbitraryFunc(int stuff[SIZE]){
    int i;
#pragma acc data region copy(stuff)
    {
#pragma acc region for
        for(i = 0; i<SIZE; i++)
        {
            stuff[i] = 1;
        }
    }
}


Having said that, I have another program, which I can send along through another channel if you like, that works with everything in the same file and with both kinds of region nested together. Has anyone else run into this?
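
For readers unfamiliar with the nesting mentioned here, the following is a minimal hypothetical sketch of an accelerator region nested inside an OpenMP parallel region, with a data region between the two. It is not the program referred to above. The function name `nested_fill_sum`, the `CHUNK` size, and the per-thread array are assumptions made purely for illustration, and the directive spellings simply follow the PGI Accelerator style used in this thread.

```c
#define CHUNK 64  /* hypothetical per-thread array size */

/* Each OpenMP thread fills its own private array inside an accelerator
 * region, then adds the elements into a shared total via a reduction. */
int nested_fill_sum(void)
{
    int total = 0;
#pragma omp parallel reduction(+:total)
    {
        int local[CHUNK];  /* private to each thread */
        int i;
        /* The data region acts as the buffer between the parallel region
         * and the accelerator region. */
#pragma acc data region copy(local)
        {
#pragma acc region for
            for (i = 0; i < CHUNK; i++)
            {
                local[i] = 1;
            }
        }
        for (i = 0; i < CHUNK; i++)
        {
            total += local[i];
        }
    }
    return total;
}
```

A compiler that does not recognize the pragmas ignores them and the function returns CHUNK. With OpenMP enabled, the result should be CHUNK multiplied by the number of threads in the team.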
mkcolg




Posted: Wed Jul 18, 2012 12:08 pm

Quote:
Has anyone else run into this?
Yes. We got a similar report on July 6th where the user's code was getting a segfault when using an OpenMP region preceded by an OpenACC region. I filed this as TPR#18802, and it no longer occurs in our 12.6 pre-release compilers.

For some reason, though, I'm not able to recreate the issue using the code you posted. It seems to work for me no matter what compiler version or system I use. For now, let's assume your issue is the same as TPR#18802 and that it will be fixed in 12.6. If it still fails for you with 12.6, let me know and I'll pursue it further.
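
When retesting, it may help to confirm which compiler release a test binary was actually built with. The following is only a sketch: it assumes the PGI-predefined macros `__PGIC__` and `__PGIC_MINOR__` are available, and the helper names `version_at_least`, `built_with_pgi_at_least`, and `report_compiler` are made up for illustration.

```c
#include <stdio.h>

/* Returns 1 when major.minor is at least want_major.want_minor, else 0. */
int version_at_least(int major, int minor, int want_major, int want_minor)
{
    if (major != want_major)
        return major > want_major;
    return minor >= want_minor;
}

/* Returns 1 only when built with a PGI compiler at or above the requested
 * release. Assumes __PGIC__ and __PGIC_MINOR__ are predefined by PGI. */
int built_with_pgi_at_least(int want_major, int want_minor)
{
#if defined(__PGIC__) && defined(__PGIC_MINOR__)
    return version_at_least(__PGIC__, __PGIC_MINOR__, want_major, want_minor);
#else
    (void)want_major;
    (void)want_minor;
    return 0;  /* not a PGI build */
#endif
}

/* Prints the compiler release the binary was built with, if known. */
void report_compiler(void)
{
#if defined(__PGIC__) && defined(__PGIC_MINOR__)
    printf("PGI %d.%d\n", __PGIC__, __PGIC_MINOR__);
#else
    printf("not a PGI compiler\n");
#endif
}
```

Calling `report_compiler()` at the top of `main` and checking `built_with_pgi_at_least(12, 6)` would show whether a failing run really came from a 12.6 build.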

- Mat
All times are GMT - 7 Hours