PGI User Forum

tvmet, blitz large functions and inline
paultschi



Joined: 03 Apr 2005
Posts: 4

Posted: Mon Apr 04, 2005 1:01 pm    Post subject: tvmet, blitz large functions and inline

Hello,

I am currently evaluating pgCC, but I am running into some problems.

1) Blitz does not compile:

"../blitz/mathfunc.h", line 2410: error: no instance of overloaded function
"std::sin" matches the argument list
argument types are: (std::complex<long double>)
{ return BZ_CMATHFN_SCOPE(sin)((complex<long double> )x); }

(replace std::sin with std::sqrt, std::sinh etc. to get the full error list) Is it a Blitz or a Portland problem?

2) I have a project with large math functions. Those functions are exported with CForm from Mathematica and have from 300 to over 8000 lines of C code. The functions look like this:

double fun(const tvmet::Vector<double, 4>& x, const tvmet::Vector<double, 4>& xo)
{
   // 300-8000 lines of math
   return result;
}

For some reason I get really bad performance from pgCC. First I get lots of warnings

PGCC-W-0278-Can't inline ......tvmet25vector

Then I get 0.34 seconds for 40000 function evaluations on my Athlon XP. g++ gives me 0.04 seconds with no warnings. Here are my Portland optimization flags:

-fast -tp k7 -fastsse -Mipa=fast,inline -Minfo -Mnoframe -Minline -O3 -Minline=levels:100 --no_exceptions

and here are my g++ flags
-O3 -ffast-math -fomit-frame-pointer -march=athlon-xp -pipe

Please advise on how to use the portland compiler efficiently.
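For anyone reproducing the comparison, a timing loop along these lines is one way to measure a batch of evaluations. This is only a sketch, assuming a C++11 compiler for `std::chrono`; `sample_fun`, `TimingResult` and `time_evaluations` are hypothetical names and not part of the project described above.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical stand-in for one of the large generated math functions.
inline double sample_fun(double x)
{
    return x * x + 1.0;
}

// Elapsed wall-clock time plus a checksum of all results. Returning the
// checksum keeps the optimizer from discarding the evaluations.
struct TimingResult {
    double seconds;
    double checksum;
};

// Calls f(0), f(1), ..., f(count - 1) and times the whole loop.
template<typename F>
TimingResult time_evaluations(F f, std::size_t count)
{
    typedef std::chrono::steady_clock clock;
    double checksum = 0.0;
    const clock::time_point start = clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        checksum += f(static_cast<double>(i));
    }
    const clock::time_point stop = clock::now();

    TimingResult result;
    result.seconds = std::chrono::duration<double>(stop - start).count();
    result.checksum = checksum;
    return result;
}
```

Calling `time_evaluations(sample_fun, 40000)` and comparing `seconds` between the two compilers would mirror the measurement quoted above, provided the same function and inputs are used in both builds.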

Thanks,

Paul
mkcolg



Joined: 30 Jun 2004
Posts: 6119
Location: The Portland Group Inc.

Posted: Mon Apr 04, 2005 4:58 pm    Post subject: Re: tvmet, blitz large functions and inline

Hi Paul,

Question 1: Compiling Blitz

The error you're seeing stems from an incompatibility we had on some older Linux systems. To work around this, modify the "/usr/pgi/linux86/<rel>/include/CC/stl/_site_config.h" file by commenting out the following line, that is, changing
Code:
#define _STLP_NO_LONG_DOUBLE 1

to
Code:
//#define _STLP_NO_LONG_DOUBLE 1


Note that this is only needed for 32-bit systems. Also, this may cause other issues on some Linux distributions. One of our engineers is looking into this and hopes to have a fix in a future patch release.

With this change you'll now get a different error: a multiply defined reference to "pow". To fix this, change your "blitz/config.h" include file to undefine "BZ_MATH_FN_IN_NAMESPACE_STD". I was able to successfully compile and run the example tests once I made these changes.
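As a rough illustration of what "undefine" means here: commenting out a `#define` only removes that one definition, while `#undef` also cancels a definition that arrived earlier, for example from another header or a command-line flag. The sketch below assumes the macro takes no value, since the exact line in blitz/config.h is not shown in this thread, and `math_fn_in_namespace_std` is a hypothetical helper that only makes the effect visible.

```cpp
// Stand-in for an earlier definition (assumption: in a real build this
// could come from another header or a -D flag; the actual text in
// blitz/config.h is not quoted in this thread).
#define BZ_MATH_FN_IN_NAMESPACE_STD

// The change described above: actively undefine the macro instead of
// only commenting out one #define line.
#undef BZ_MATH_FN_IN_NAMESPACE_STD

// Hypothetical helper: reports whether the macro is still defined at
// this point in the translation unit.
inline bool math_fn_in_namespace_std()
{
#ifdef BZ_MATH_FN_IN_NAMESPACE_STD
    return true;
#else
    return false;
#endif
}
```

With the `#undef` in place the helper returns false, whatever defined the macro beforehand.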

Question 2:

Let's try simplifying your compilation flags. The Athlon XP does not have SSE2, so you should use just "-fast". (Note that "-fast" is part of "-fastsse", so you don't need both.) "-Mnoframe" is part of "-fast", so it is not needed. You should use either IPA inlining or "-Minline", but not both. Also, 10 levels of inlining is all you'll need; 100 levels will take an extremely long time to compile. So this leaves:
Code:
-fast -O3 -tp k7 --no_exceptions -Minline=levels:10


Let me know if this helps your compilation time and allows more functions to be inlined.

Thanks,
Mat
paultschi



Joined: 03 Apr 2005
Posts: 4

Posted: Mon Apr 04, 2005 9:48 pm    Post subject: Re: tvmet, blitz large functions and inline

Hi,

thanks for your fast answer.

Question 1)

I changed the _site_config.h file. I'm now getting the multiply defined reference to "pow". However, undefining "BZ_MATH_FN_IN_NAMESPACE_STD" in blitz/config.h doesn't get rid of the errors.

Question 2)
With the new flags there is still no inlining happening, which is strange since I am only using operator() from tvmet. The second thing that doesn't get inlined comes from a macro:

Code:
#define Power(x, y)   (mypow<y>(static_cast<double>(x)))


where mypow is of the following form

Code:
template<int order>
inline   double mypow(double arg);

template<>
inline   double mypow<2>(double arg)
{
   return (arg * arg);
}

template<>
inline double mypow<3>(double arg)
{
   return (arg * arg * arg);
}


etc.

(the powers are guaranteed to be integers).
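For illustration only, the per-order specializations could also be generated by a recursive template instead of being written out one by one. This is a sketch, not the code from the project: `PowImpl` and `mypow_rec` are hypothetical names, and the order is assumed to be a non-negative compile-time integer.

```cpp
// Primary template: peel off one factor and recurse on order - 1.
// A negative order never reaches the base case and fails to compile.
template<int order>
struct PowImpl {
    static inline double apply(double arg)
    {
        return arg * PowImpl<order - 1>::apply(arg);
    }
};

// Base case: anything to the power 0 is 1.
template<>
struct PowImpl<0> {
    static inline double apply(double)
    {
        return 1.0;
    }
};

// Same call shape as mypow<order>(arg), so a macro like Power(x, y)
// could forward to it unchanged.
template<int order>
inline double mypow_rec(double arg)
{
    return PowImpl<order>::apply(arg);
}
```

The multiplications are grouped as arg * (arg * ...), not (arg * arg) * arg, so the last bit of a result may differ from the hand-written versions. Whether a given compiler inlines the whole chain is a separate question and would still need to be checked in the generated assembly.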


I very much appreciate your help,

Paul
mkcolg



Joined: 30 Jun 2004
Posts: 6119
Location: The Portland Group Inc.

Posted: Tue Apr 05, 2005 1:45 pm    Post subject: Re: tvmet, blitz large functions and inline

Hi Paul,

Q1) What version of the compiler and which OS are you running? I'm on SuSE 9.0 using the PGI 6.0 compilers. It could be that a different OS will need a different "fix".

Just to double check, you changed the "#define" to "#undef", rather than just commenting out the statement?

Q2) I took your example code and made this small test program:
x.cpp:
Code:
#include <iostream>

template<int order>
inline   double mypow(double arg);

template<>
inline double mypow<2>(double arg)
{
   return (arg * arg);
}

template<>
inline double mypow<3>(double arg)
{
   return (arg * arg * arg);
}

#define Power(x, y)   (mypow<y>(static_cast<double>(x)))

int main () {
  double res[10];
  for (int i=1; i <= 10; ++i) {
    res[i-1] = Power(i,2);
  }
  for (int i=0; i < 10; ++i) {
    std::cout << i+1 << "^2=" << res[i] << "\n";
  }
}


Compiled with your flags (I also added "-Mkeepasm -Manno" so we could view the assembly file):
Code:
sagebrush:/tmp% pgCC -fast -O3 -tp k7 --no_exceptions -Minline=levels:10 -Mkeepasm -Manno -Minfo x.cpp
main:
    22, Loop unrolled 4 times

Looking at the assembly we can see that "mypow" is getting inlined:

Code:
#   for (int i=1; i <= 10; ++i) {
#     res[i-1] = Power(i,2);
#   }
.LB808:
# lineno: 22

        movl    %ebx,-96(%ebp)
        fildl   -96(%ebp)
        fmul    %st(0),%st
        movl    -92(%ebp),%eax
        fstpl   -24(%eax)
        movl    %ebx,%edx
        incl    %edx
        movl    %edx,-100(%ebp)


This means that something else is inhibiting the inlining or I'm not correctly using your example. Is it possible to get the full source?

Thanks,
Mat
paultschi



Joined: 03 Apr 2005
Posts: 4

Posted: Tue Apr 05, 2005 11:51 pm    Post subject: Re: tvmet, blitz large functions and inline

Hi Mat,

Q1) I'm running Gentoo Linux and the newest version of PGI (6.01 ?). I downloaded it last Sunday. I did #undef the statement, not just comment it out, yet I'm still getting the same error about the multiply defined pow's.

Q2) I can send you the full source. I hope a kdevelop project is convenient. Where can I send it to?

Thanks again for your time,

Paul