PGI User Forum

How to compile existing C/C++ project w/ NVIDIA GPU?
mkcolg



Joined: 30 Jun 2004
Posts: 6218
Location: The Portland Group Inc.

Posted: Fri Jul 27, 2012 11:06 am

Hi Luis,

In looking at your code, if you were able to rewrite the "calc_d_Ap_j" routine in C, the loops shouldn't be too difficult to accelerate. The hardest part looks to be passing in the class data members, which this routine won't have direct access to.

Once ported, your first region would look something like this:
Code:
#pragma acc data copy(...) // add data clauses
{
#pragma acc parallel loop gang
                for (ap=0; ap<aNgO; ap++){

                        cont = 0;

// sum into cont across the vector lanes
#pragma acc loop vector reduction(+:cont)
                        for (am=0; am<ap; am++)
                                cont += negObs[am];

#pragma acc loop vector
                        for (an=0; an<aNgS; an++){

                                // ... init reduction vars

                                for (am=0; am<negObs[ap]; am++){
                                    // ... reduction code
                                    // ... note the dependency on cont
                                }
                                // ... store reduction vars back to global array
                        }
                 }
}


Though, you might get better performance by precomputing cont for each ap index. That way you break the dependency between the loops and allow for more parallelization.

Code:

#pragma acc data copy(...) // add data clauses
{
#pragma acc parallel
{
#pragma acc loop gang
                for (ap=0; ap<aNgO; ap++){

                        cont = 0;
// reduce into a scalar first to avoid racing on contarr[ap]
#pragma acc loop vector reduction(+:cont)
                        for (am=0; am<ap; am++)
                                cont += negObs[am];
                        contarr[ap] = cont;
                }
}
#pragma acc parallel
{
#pragma acc loop gang collapse(2)
                for (ap=0; ap<aNgO; ap++){
                        for (an=0; an<aNgS; an++){
                                cont = contarr[ap];
                                // ... init reduction vars
#pragma acc loop vector
                                for (am=0; am<negObs[ap]; am++){
                                    // ... reduction code
                                }
                                // ... store reduction vars back to global array
                        }
                 }
}
} // end of data region


This way you can parallelize both the ap and an loops in a 2-D gang (i.e. a CUDA block) and then vectorize the inner am loop. Before, you could only vectorize the first am loop and the an loop, and put the ap loop in a 1-D gang.
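As written, contarr[ap] is the sum of all negObs entries before index ap, i.e. an exclusive prefix sum. The following is a minimal sketch of one alternative, not the code from this thread: it fills contarr in a single serial pass on the host instead of with a nested GPU loop. The helper name precompute_cont and the int element type are assumptions; the thread does not state the types of negObs or cont.

```c
#include <assert.h>

/* Hypothetical helper (not from the original post):
   contarr[ap] = negObs[0] + ... + negObs[ap-1], with contarr[0] = 0.
   Assumes int elements; adjust to the real type of negObs. */
void precompute_cont(const int *negObs, int *contarr, int aNgO)
{
    assert(aNgO >= 0);
    int running = 0;                /* sum of all entries seen so far */
    for (int ap = 0; ap < aNgO; ap++) {
        contarr[ap] = running;      /* excludes negObs[ap] itself */
        running += negObs[ap];
    }
}
```

If aNgO is small, computing contarr this way before entering the data region and adding it to a copyin clause may be simpler than the first parallel region above; whether it is faster would need to be measured.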

Granted, this probably doesn't make much sense to you yet. But hopefully it will soon.

- Mat
vacaloca



Joined: 26 Jul 2012
Posts: 5

Posted: Fri Jul 27, 2012 11:19 am

Thanks for the insight. So if I were to use just C syntax for the code and put everything in one main file, I would be able to use your suggestions, correct?
mkcolg



Joined: 30 Jun 2004
Posts: 6218
Location: The Portland Group Inc.

Posted: Fri Jul 27, 2012 2:06 pm

Quote:
so if I were to use just C syntax for the code and put everything in one main file, I would be able to use your suggestions, correct?
If you rewrote the code in C, then yes, you would be able to, though it wouldn't all need to go into a single main file.

You would need to convert the class methods into a set of C routines. The class data members may be a bit of a problem. Normally you'd wrap these up in a struct, but only fixed-size structs can currently be used with OpenACC since the data must be contiguous. Granted, you only have one instance of your class, so you don't really need to wrap the data into a struct.
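With a single instance, one way to avoid the struct is to pass each former data member to the C routine as a plain pointer plus a length. The sketch below only illustrates that idea; the routine name scale_values and its parameters are hypothetical and are not taken from Luis's code. A compiler without OpenACC support simply ignores the pragma, so the routine also runs serially.

```c
#include <assert.h>

/* Hypothetical routine (illustration only): a former class member array
   is passed in as a contiguous pointer plus its length. */
void scale_values(float *restrict vals, int n, float factor)
{
    assert(n >= 0);
    /* vals[0:n] means: start at element 0, n elements long. */
#pragma acc parallel loop copy(vals[0:n])
    for (int i = 0; i < n; i++)
        vals[i] *= factor;
}
```

Because each array is described by its own data clause, every array must itself be contiguous, but the arrays do not need to sit next to each other in memory the way struct members would.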

I'm sure there will be other issues as well so ask if you get stuck.

- Mat
luiset83



Joined: 24 Jul 2012
Posts: 2

Posted: Tue Jul 31, 2012 2:13 pm


I have no problem translating all the source to C, except for the part where I read the files -- I actually need variable arrays since I am reading variable length files. I'm a bit stuck as to how to leave those methods intact and process them using pgcpp and do the rest using pgcc.

I tried a promising example posted by NMTop40 here: http://forums.codeguru.com/showthread.php?368937-calling-c-function-from-c&s=5b02180a263e4fd256b01177ce06f262 -- however, that did not seem to work with pgcc either.

Could you provide some insight or perhaps a template into mixing C and C++ code under pgcc/pgcpp? I was hoping to use the VS C plugin, but the links I was sent do not work, so I'm stuck at the moment, unfortunately.

Edit: In the meantime I will just hard code the values to the sample data files until I see how I can accomplish the C/C++ mix... at least I can get some progress this way.

Edit 2: Seems like pgcc compiles matrices whose sizes are given by variables after all, so it might not be an issue. It's just that VS2010 doesn't want any of that when compiling a *.c file, which is probably a good thing... anyway, I finally got the VS C plugin.. will port our code and see where I get stuck next, haha.
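For arrays whose size is only known after reading a file, heap allocation is an option that does not depend on variable-length array support in the compiler. This is a minimal sketch under that assumption; the helper names alloc_matrix and matrix_at and the double element type are hypothetical, not from the code discussed here. A single contiguous block also fits the contiguity requirement mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates a zero-filled rows x cols matrix as one
   contiguous block. Returns NULL on failure; the caller must free() it. */
double *alloc_matrix(int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    return calloc((size_t)rows * (size_t)cols, sizeof(double));
}

/* Hypothetical helper: address of element (r, c) in row-major order. */
double *matrix_at(double *m, int cols, int r, int c)
{
    return &m[(size_t)r * (size_t)cols + (size_t)c];
}
```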
mkcolg



Joined: 30 Jun 2004
Posts: 6218
Location: The Portland Group Inc.

Posted: Wed Aug 01, 2012 1:45 pm

Hi Luis,

FYI, we have a whole chapter in our user's guide devoted to inter-language calling, including C to C++ and C++ to C, that you might find useful. (See Chapter 13 of http://www.pgroup.com/doc/pgiug.pdf)
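As a rough illustration of the usual C/C++ convention (this is the standard extern "C" idiom, not text quoted from the PGI guide), a shared header can guard its declarations so that a C++ caller sees them with C linkage. The function name sum_array is hypothetical. The declaration part would normally live in a header included by both the C and the C++ sources; it is shown inline here so the block compiles on its own as C.

```c
#include <assert.h>

/* Would normally go in a shared header (file name is up to you). */
#ifdef __cplusplus
extern "C" {            /* only seen by a C++ compiler */
#endif

double sum_array(const double *vals, int n);

#ifdef __cplusplus
}
#endif

/* C implementation, compiled by the C compiler. */
double sum_array(const double *vals, int n)
{
    assert(n >= 0);
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += vals[i];
    return total;
}
```

With this layout, the file-reading code can stay in C++ and call sum_array directly, while the accelerated routines are compiled as C. The final link is typically done with the C++ driver so the C++ runtime is pulled in.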

Quote:
anyway, I finally got the VS C plugin.. will port our code and see where I get stuck next, haha.
Good. Let us know how it goes.

- Mat
All times are GMT - 7 Hours. Page 2 of 3.