PGI User Forum


address set to 0x0 after F77 calls C
kstephens



Joined: 22 Jun 2005
Posts: 6

Posted: Mon Jul 02, 2007 3:26 pm    Post subject: address set to 0x0 after F77 calls C

I am using pgf77/pgcc 6.0-5 (64-bit target on x86_64 Linux) on a Xeon system running Red Hat EL4.

I have a 1980s F77 program that uses some C routines for I/O (it is a graphics library, CGS, used for another, much larger, application). An array is passed between the F77 routines. After an F77 routine calls a C routine, passing the array, the array's address is set to 0x0. However, the array's address is not set to 0x0 after the same F77 routine calls another F77 routine, passing the array.

The values of other local variables and arguments are not changed after calling the C routine.

I can avoid this by using -Msave OR compiling for a 32-bit target. I have the same problem using the Intel F77 and C compilers.
Also, I cannot use -i8 for the F77. Doing so causes the entire application to seg fault --- the program's data structure is explicitly written for 4-byte integers and is nearly 300 files.

My questions are:
-- what is it about calling a C routine that causes the F77 routine to "forget" or change the array's address?
-- how can I fix this WITHOUT resorting to the steps mentioned above? (I don't want to take a performance hit.)

I've fixed many of the 32-bit to 64-bit porting issues but I'm nearing my wits' end with this thing.

Thanks,
Kenny
mkcolg



Joined: 30 Jun 2004
Posts: 6146
Location: The Portland Group Inc.

Posted: Mon Jul 02, 2007 4:21 pm

Hi Kenny,
Quote:
what is it about calling a C routine that causes the F77 routine to "forget" or change the array's address?

My guess is that your C routine has some assumptions about the F77 data type sizes that are no longer correct for 64-bits. For example, associating a C "long" with a F77 INTEGER. In 32-bits, a C "long" is represented in 4 bytes, but is 8 bytes in 64-bits. Also, if your C routine is storing F77 addresses in an "int" data type, then this will be incorrect for 64-bits (it should be a "long").
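
To make the truncation concrete, here is a minimal sketch in C. It assumes an LP64 platform such as x86_64 Linux, where "int" is 4 bytes and "long" and pointers are 8 bytes. The helper name store_address_in_integer4 and the sample addresses are illustrative only and are not part of the CGS library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 64-bit address in a 4-byte slot (a C "int"
   or an F77 INTEGER*4, shown here as an unsigned bit pattern).
   Returns 1 if nothing was lost, 0 if the upper 32 bits were discarded. */
int store_address_in_integer4(uint64_t address, uint32_t *slot)
{
    assert(slot != NULL);
    *slot = (uint32_t)(address & 0xFFFFFFFFu);  /* keep only the low 32 bits */
    return (uint64_t)*slot == address;
}
```

A low address such as 0x601040 round-trips unchanged, while an address such as 0x00007fff12345678 comes back as 0x12345678. On the C side, intptr_t from <stdint.h> is another type that is wide enough to hold an address.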

Can you please post the prototype for the C function as well as an example of the call from F77 and the data types of the variables being passed?

- Mat
kstephens



Joined: 22 Jun 2005
Posts: 6

Posted: Tue Jul 03, 2007 9:53 am

You were correct about the F77 passing an address to an int and the C assuming a pointer to a long. I corrected it and avoided the seg fault.

But I'm curious why the address would get truncated to zero if it is converted from a long* to an int*. Wouldn't the lower 32 bits remain the same since they are both signed integers? Furthermore, other C routines assume long* and don't return the address as 0x0. FYI: Here is the C definition:
Code:

     void brcrea( long* iocb, char* iname, long* itype,
                  char* icl, long* ierr )
     {
          char name[32];
          int i;
          for ( i=0; i<31; i++ )
          {
               if ( iname[i] == ' ' )
                    break;
               name[i] = iname[i];
          }
          name[i] = '\0';
          iocb[0] = creat( name, 0644 );
          *ierr = SUCCESS;
          if ( iocb[0] <= 0 )
          {
               iocb[0] = -1;
               *ierr = FAILURE;
          }
     }

and an excerpt from the calling F77:
Code:

     subroutine gpoend(igcb)
     dimension igcb(300)
     . . .
     call brcrea( igcb(1), igcb(105), 0, ichar, ierr )
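
The corrected routine is not shown in the post. The following is only a sketch of what a version with 4-byte parameters might look like, assuming igcb and ierr are default INTEGER*4 on the F77 side. The name brcrea_int32 and the values of SUCCESS and FAILURE are assumptions, since their definitions do not appear above.

```c
#include <assert.h>
#include <fcntl.h>      /* creat */
#include <stddef.h>
#include <stdint.h>

/* Assumed status codes: the post does not show how SUCCESS and FAILURE
   are defined. */
#define SUCCESS 0
#define FAILURE 1

/* Sketch of brcrea with 4-byte parameters, so that each store touches
   exactly the INTEGER*4 element that the F77 caller passed. */
void brcrea_int32(int32_t *iocb, const char *iname, int32_t *itype,
                  char *icl, int32_t *ierr)
{
    char name[32];
    int i;

    assert(iocb != NULL && iname != NULL && ierr != NULL);
    (void)itype;   /* unused, as in the posted routine */
    (void)icl;

    /* Copy the blank-padded Fortran name up to the first space. */
    for (i = 0; i < 31; i++) {
        if (iname[i] == ' ')
            break;
        name[i] = iname[i];
    }
    name[i] = '\0';

    iocb[0] = (int32_t)creat(name, 0644);   /* 4-byte store */
    *ierr = SUCCESS;                        /* 4-byte store */
    if (iocb[0] <= 0) {
        iocb[0] = -1;
        *ierr = FAILURE;
    }
}
```

With int32_t parameters, the stores to iocb[0] and *ierr write 4 bytes each, so the element after igcb(1) and whatever sits next to ierr in the caller are left alone. With long* on a 64-bit target, the same two statements write 8 bytes each.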


This leads to another concern. The F77 calls a C routine to get the address of a variable (I'm assuming the authors didn't have %LOC or LOC when the code was written). Here is the C routine:
Code:

     long* lbloc( long* iword )
     {
          return( (long*)( (long)iword/4 ) );
     }

and an example call from the F77:
Code:
     subroutine gasbuf( igcb, ibuff, nbuff )
     dimension igcb(300), ibuff(300)
     . . .
     igcb(103) = lbloc(ibuff)
     ...

[Don't ask about the division by 4. It is taken into account in the code, because if I remove it, it crashes in serious fashion. Trust me, there's a lot of 'wackiness' in this code.]

Does F77 know to store the return value as an INTEGER*8?

The array igcb is INTEGER*4 but I've not yet encountered an address that exceeds the range of an int. In time I will convert it to INTEGER*8 to be more correct, but a large bulk of the code implicitly assumes the array is INTEGER*4.
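
As a rough check of when a value produced this way still fits in an INTEGER*4, here is a small sketch in C. It works on plain integers rather than real pointers so the arithmetic is easy to follow. The function names and sample addresses are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Word address as lbloc computes it: the byte address divided by 4. */
int64_t word_address(uint64_t byte_address)
{
    assert(byte_address <= (uint64_t)INT64_MAX);
    return (int64_t)(byte_address / 4);
}

/* Returns 1 if the value can be stored in an INTEGER*4 without loss. */
int fits_in_integer4(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

Because of the division by 4, any byte address below 8 GiB (0x200000000) yields a word address that fits. A low address such as 0x601040 gives 0x180410, which is fine. An address such as 0x00007fff12345678 does not fit, and stack addresses on x86_64 Linux typically lie in that upper range.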

Thanks,
Kenny
mkcolg



Joined: 30 Jun 2004
Posts: 6146
Location: The Portland Group Inc.

Posted: Tue Jul 03, 2007 3:10 pm

Hi Kenny,

Quote:
But I'm curious why the address would get truncated to zero if it is converted from a long* to an int*. Wouldn't the lower 32-bits remain the same since they are both signed integers? Furthermore, other C routines assume long* and don't return the address as 0x0.

In looking over what you've presented, I'm at a loss as well. In order for igcb's address in the caller routine, gpoend, to be set to zero, something in the C routine would need to be overwriting the stack. However, brcrea looks fairly benign. My next step would be to compile the program with "-g" and use the PGI debugger, pgdbg, to step through the code and pinpoint where igcb's address changes. Valgrind (http://www.valgrind.org) might be helpful as well.
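
One way a callee can overwrite part of the caller's stack while looking benign is an 8-byte store through a long* that really points at a 4-byte INTEGER*4 local. Whether that is what happens here is not established in this thread. The sketch below only simulates the effect in C, using a byte buffer as a stand-in for a stack frame. The layout (a 4-byte slot followed by 4 bytes that belong to something else) and the stored value 0 are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SIZE 16

/* Simulate "*p = value" through a long* on an LP64 system: an 8-byte store
   at the given offset into a byte buffer standing in for the caller's
   stack frame. memcpy keeps the demonstration well defined in C. */
void store_as_long(unsigned char *frame, size_t offset, int64_t value)
{
    assert(offset + sizeof value <= FRAME_SIZE);
    memcpy(frame + offset, &value, sizeof value);
}

/* Read back a 4-byte slot from the simulated frame. */
int32_t load_integer4(const unsigned char *frame, size_t offset)
{
    int32_t value;
    assert(offset + sizeof value <= FRAME_SIZE);
    memcpy(&value, frame + offset, sizeof value);
    return value;
}
```

If a 4-byte variable sits at offset 0 and some other 4-byte value sits at offset 4, then store_as_long(frame, 0, 0) zeroes both, even though the code only meant to set the first. If locals are given static storage instead of stack storage, the neighbouring bytes would belong to different data, which may change what gets overwritten.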

As for the second example, I would be concerned too. While I think it might "work", I'm concerned about storing addresses in an INTEGER*4 variable. While you know this already, you should really be using INTEGER*8 here.

Quote:
Does F77 know to store the return value as an INTEGER*8?

Sure, provided that you've explicitly declared the function to be INTEGER*8 or use "-i8", which changes the default kind. Ex. "INTEGER*8 lbloc"

Given the state of this program, do you really need to compile it in 64-bits?

- Mat
kstephens



Joined: 22 Jun 2005
Posts: 6

Posted: Mon Jul 09, 2007 8:35 am

Mat,
I've already debugged the code. As for when the address of igcb turns to 0x0, it occurs immediately upon return to the F77 routine. Its value does not change throughout the C routine.

Concerning your 64-bit question, I've been debating that with a colleague. Is there any advantage to a 64-bit executable vs. the corresponding 32-bit executable (except for greater range of memory addresses which I don't think is pertinent for this code) such as speed or accuracy or anything else?

Kenny
All times are GMT - 7 Hours
Page 1 of 2