PGI User Forum


High-performance linpack benchmark
peterp



Joined: 16 Apr 2005
Posts: 11

Posted: Tue Jun 07, 2005 1:48 pm    Post subject: High-performance linpack benchmark

Hello to all,

I've been trying to compile the HPL benchmark on a Myrinet/MPICH Opteron cluster.
Our MPICH has been compiled with pgf90 and pgcc (by Microway). For the life of me
I can't get past this point in the compilation:

...blah-blah...
...
pgf90 -fast -Mconcur -Minline=saxpy,sscal -Minfo -I/home/users/faculty/peterp/Test.d/new_hpl/hpl/include -I/home/users/faculty/peterp/Test.d/new_hpl/hpl/include/Linux_OPT_FBLAS -I/usr/rels/mpich/include -o /home/users/faculty/peterp/Test.d/new_hpl/hpl/bin/Linux_OPT_FBLAS/xhpl HPL_pddriver.o HPL_pdinfo.o HPL_pdtest.o /home/users/faculty/peterp/Test.d/new_hpl/hpl/lib/Linux_OPT_FBLAS/libhpl.a /usr/local/lib/libf77blas.a /usr/local/lib/libatlas.a /usr/rels/mpich/lib/libmpich.a
HPL_pddriver.o(.text+0x0): In function `main':
: multiple definition of `main'
/usr/pgi/linux86-64/5.2/lib/f90main.o(.text+0x0): first defined here
/usr/bin/ld: Warning: size of symbol `main' changed from 94 in /usr/pgi/linux86-64/5.2/lib/f90main.o to 2226 in HPL_pddriver.o
/usr/pgi/linux86-64/5.2/lib/f90main.o(.text+0x3c): In function `main':
: undefined reference to `MAIN_'
/usr/rels/mpich/lib/libmpich.a(gmpi_regcache.o)(.text+0x76f): In function `gmpi_regcache_init':
: undefined reference to `gm_hash_compare_ptrs'
/usr/rels/mpich/lib/libmpich.a(gmpi_regcache.o)(.text+0x774): In function `gmpi_regcache_init':
: undefined reference to `gm_hash_hash_ptr'
/usr/rels/mpich/lib/libmpich.a(gmpi_regcache.o)(.text+0x786): In function `gmpi_regcache_init':
: undefined reference to `gm_create_hash'
/usr/rels/mpich/lib/libmpich.a(gmpi_regcache.o)(.text+0x79c): In function `gmpi_regcache_init':
: undefined reference to `gm_create_lookaside'
/usr/rels/mpich/lib/libmpich.a(gmpi_regcache.o)(.text+0x7ec): In function `gmpi_regcache_deregister':
: undefined reference to `GM_PAGE_LEN'
/usr/rels/mpich/lib/libmpich.a(gmpi_regcache.o)(.text+0x7ff): In function `gmpi_regcache_deregister':
: undefined reference to `gm_deregister_memory'
...lots of errors of similar kind...
...etc. etc...
: undefined reference to `gm_destroy_lookaside'
make[2]: *** [dexe.grd] Error 2
make[2]: Leaving directory `/home/users/faculty/peterp/Test.d/new_hpl/hpl/testing/ptest/Linux_OPT_FBLAS'
make[1]: *** [build_tst] Error 2
make[1]: Leaving directory `/home/users/faculty/peterp/Test.d/new_hpl/hpl'
make: *** [build] Error 2

Any suggestions ?

Thanks,
Peter

P.S. Does PGroup offer an hpl.tar with the makefiles modified for their compiler suite?
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Tue Jun 07, 2005 4:17 pm

Hi Peter,

The 'multiple definition of main' error occurs when you link an object containing a C 'main' function using a Fortran compiler driver. For Fortran programs the driver adds its own 'main' (f90main.o) at link time, and that 'main' calls the Fortran program entry point 'MAIN_', which is why you also see the undefined reference to 'MAIN_'. Adding "-Mnomain" to the link line tells the compiler not to insert a 'main' function and will fix the problem. The other undefined reference errors occur because you're missing the GM library on your link line. Adding "-L/path/to/gm/library -lgm" should fix it. You might need to add "-lpthread" as well.

We don't have a preconfigured makefile, but I'll investigate whether we can add a PGI HPL guide to our support pages. Although I haven't worked with HPL enough to know what the optimal flag set is, '-fastsse' generally gives the best performance. Also, I'd remove "-Mconcur" since it auto-parallelizes loops for SMP systems, and HPL already gets its parallelism from MPI.
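
As a rough sketch of the changes described above (not a tested configuration), the relevant Make.<arch> lines could look like the following. The GMdir variable name is made up for illustration, and /path/to/gm/library is a placeholder for wherever GM is installed. The GM library is appended to MPlib rather than to the linker flags because, with static archives, the linker resolves symbols left to right, and in the failing command line libmpich.a comes last. Libraries that libmpich.a depends on are therefore safest listed after it.

```make
# Sketch only. GMdir is a hypothetical helper variable; replace the
# placeholder path with the real GM install location.
GMdir = /path/to/gm/library

# MPICH over GM needs the GM library (and possibly pthreads), listed
# after libmpich.a so its undefined gm_* symbols can be resolved.
MPdir = /usr/rels/mpich
MPinc = -I$(MPdir)/include
MPlib = $(MPdir)/lib/libmpich.a -L$(GMdir) -lgm -lpthread

# main() is already defined in HPL_pddriver.o, so tell pgf90 not to
# add its own Fortran main at link time.
LINKER = pgf90 -Mnomain

# -fastsse replaces "-fast -Mconcur" from the failing command line.
LINKFLAGS = -fastsse
```

If the GM library is only available as a shared object, the ordering matters less, but listing it after libmpich.a should still be harmless.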

Hope this helps,
Mat
peterp



Joined: 16 Apr 2005
Posts: 11

Posted: Tue Jun 07, 2005 6:42 pm

I love you guys...you'll get an acknowledgement on the first paper to
come out of using this cluster...

Everything compiled fine (lots of loops unrolled...) and a small test case
ran fine.

As a final bother, when you get the chance, can you look over the makefile
I include below and let me know if I need to change anything?
I have no idea what should go in $CCNOOPT so I improvised...

Best regards,
Peter
________________________Here's the Make.arch___________________
# ######################################################################
#
# ----------------------------------------------------------------------
# - shell --------------------------------------------------------------
# ----------------------------------------------------------------------
#
SHELL = /bin/sh
#
CD = cd
CP = cp
LN_S = ln -s
MKDIR = mkdir
RM = /bin/rm -f
TOUCH = touch
#
# ----------------------------------------------------------------------
# - Platform identifier ------------------------------------------------
# ----------------------------------------------------------------------
#
ARCH = Linux_OPT_FBLAS
#
# ----------------------------------------------------------------------
# - HPL Directory Structure / HPL library ------------------------------
# ----------------------------------------------------------------------
#
TOPdir = $(HOME)/Test.d/new_hpl/hpl
INCdir = $(TOPdir)/include
BINdir = $(TOPdir)/bin/$(ARCH)
LIBdir = $(TOPdir)/lib/$(ARCH)
#
HPLlib = $(LIBdir)/libhpl.a
#
# ----------------------------------------------------------------------
# - Message Passing library (MPI) --------------------------------------
# ----------------------------------------------------------------------
# MPinc tells the C compiler where to find the Message Passing library
# header files, MPlib is defined to be the name of the library to be
# used. The variable MPdir is only used for defining MPinc and MPlib.
#
MPdir = /usr/rels/mpich
MPinc = -I$(MPdir)/include
MPlib = $(MPdir)/lib/libmpich.a
#
# ----------------------------------------------------------------------
# - Linear Algebra library (BLAS or VSIPL) -----------------------------
# ----------------------------------------------------------------------
# LAinc tells the C compiler where to find the Linear Algebra library
# header files, LAlib is defined to be the name of the library to be
# used. The variable LAdir is only used for defining LAinc and LAlib.
#
LAdir = /usr/local/lib
LAinc =
LAlib = $(LAdir)/libf77blas.a $(LAdir)/libatlas.a
#
# ----------------------------------------------------------------------
# - F77 / C interface --------------------------------------------------
# ----------------------------------------------------------------------
# You can skip this section if and only if you are not planning to use
# a BLAS library featuring a Fortran 77 interface. Otherwise, it is
# necessary to fill out the F2CDEFS variable with the appropriate
# options. **One and only one** option should be chosen in **each** of
# the 3 following categories:
#
# 1) name space (How C calls a Fortran 77 routine)
#
# -DAdd_ : all lower case and a suffixed underscore (Suns,
# Intel, ...), [default]
# -DNoChange : all lower case (IBM RS6000),
# -DUpCase : all upper case (Cray),
# -DAdd__ : the FORTRAN compiler in use is f2c.
#
# 2) C and Fortran 77 integer mapping
#
# -DF77_INTEGER=int : Fortran 77 INTEGER is a C int, [default]
# -DF77_INTEGER=long : Fortran 77 INTEGER is a C long,
# -DF77_INTEGER=short : Fortran 77 INTEGER is a C short.
#
# 3) Fortran 77 string handling
#
# -DStringSunStyle : The string address is passed at the string loca-
# tion on the stack, and the string length is then
# passed as an F77_INTEGER after all explicit
# stack arguments, [default]
# -DStringStructPtr : The address of a structure is passed by a
# Fortran 77 string, and the structure is of the
# form: struct {char *cp; F77_INTEGER len;},
# -DStringStructVal : A structure is passed by value for each Fortran
# 77 string, and the structure is of the form:
# struct {char *cp; F77_INTEGER len;},
# -DStringCrayStyle : Special option for Cray machines, which uses
# Cray fcd (fortran character descriptor) for
# interoperation.
#
F2CDEFS =
#
# ----------------------------------------------------------------------
# - HPL includes / libraries / specifics -------------------------------
# ----------------------------------------------------------------------
#
HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)
HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib)
#
# - Compile time options -----------------------------------------------
#
# -DHPL_COPY_L force the copy of the panel L before bcast;
# -DHPL_CALL_CBLAS call the cblas interface;
# -DHPL_CALL_VSIPL call the vsip library;
# -DHPL_DETAILED_TIMING enable detailed timers;
#
# By default HPL will:
# *) not copy L before broadcast,
# *) call the BLAS Fortran 77 interface,
# *) not display detailed timing information.
#
HPL_OPTS = -fastsse -Minline=saxpy,sscal -Minfo -lpthread
#
# ----------------------------------------------------------------------
#
HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)
#
# ----------------------------------------------------------------------
# - Compilers / linkers - Optimization flags ---------------------------
# ----------------------------------------------------------------------
#
CC = pgcc
CCNOOPT = $(HPL_DEFS)
CCFLAGS = $(HPL_DEFS)
#
LINKER = pgf90 -Mnomain
LINKFLAGS = $(CCFLAGS) -L/opt/gm/lib64 -lgm
#
ARCHIVER = ar
ARFLAGS = r
RANLIB = echo
#
# ----------------------------------------------------------------------
mkcolg



Joined: 30 Jun 2004
Posts: 5952
Location: The Portland Group Inc.

Posted: Wed Jun 08, 2005 8:26 am

Thanks Peter, we appreciate the compliment.

Whenever you see a 'NOOPT' type make variable, it's usually because the authors are either working around a bug and need to compile a particular file without optimization, or don't want the compiler to optimize a file. In this case, 'CCNOOPT' is used to compile the file src/auxil/HPL_dlamch.c, which determines some machine-specific arithmetic constants. Compiler optimizations can reorder operations so that the code is no longer strictly compliant with the IEEE 754 floating point arithmetic standard. The authors most likely want this file to be strictly compliant, so I'd set 'CCNOOPT=-O0 -Kieee'. It should not affect the overall performance but will give you more accurate results.
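
A minimal sketch of how that could look in the posted Make.arch, assuming (as appears to be the case in the stock HPL makefiles) that CCNOOPT takes the place of CCFLAGS for that one file and so still needs the include paths from HPL_DEFS. In the posted file the optimization flags sit in HPL_OPTS, which feeds HPL_DEFS and therefore CCNOOPT as well, so the sketch moves them into CCFLAGS. It has not been tested against a real build.

```make
# Sketch only, not a tested configuration.
# Keep HPL_OPTS for the -DHPL_... options; leaving it empty selects
# the defaults listed in the comments of the posted file.
HPL_OPTS  =
HPL_DEFS  = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)

CC        = pgcc
# Used for HPL_dlamch.c only: no optimization, strict IEEE 754 arithmetic.
CCNOOPT   = $(HPL_DEFS) -O0 -Kieee
# Used for everything else: optimization flags taken from the posted file.
CCFLAGS   = $(HPL_DEFS) -fastsse -Minline=saxpy,sscal -Minfo

LINKER    = pgf90 -Mnomain
# -lpthread is a link-time option, so it moves here from HPL_OPTS.
LINKFLAGS = $(CCFLAGS) -L/opt/gm/lib64 -lgm -lpthread
```

Running make with -Minfo still in CCFLAGS gives a quick check: HPL_dlamch.c should be the one file compiled with -O0 -Kieee and without the optimization flags.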

- Mat
peterp



Joined: 16 Apr 2005
Posts: 11

Posted: Wed Jun 08, 2005 10:23 am

Thanks Mat, I'll try the suggestion...apart from that, can you suggest any other
flags that may be beneficial ?

Best regards,
Peter
All times are GMT - 7 Hours
Page 1 of 2