Debugging Sather/pSather

Introduction

Currently, we don't have a native debugger that understands Sather types and function names. (There is a project under way at ICSI to patch gdb so that it "knows" about Sather.) In the meantime, we have to use a C debugger, which is surprisingly easy to do. Here is a short description of how this works.

Summary

Note: every option described here works for pSather and Sather, unless it explicitly states that it does not.

Environment variables

START_GDB If defined, gdb is started when a fatal error occurs (access to void, bound error, deadlock). You should compile your program with "-check all" to catch all possible errors. GDB will also come up if you 'QUIT' the program (see SIGNAL(2)). This works only for Sather programs.
DEBUG_PSATHER Works only for pSather: starts a gdb session for each cluster. You need a special .gdbinit file for this; see below for more information.

Compiler options

-debug Textual display of data structures and stack frames
-debug_graphical Graphical display of data structures and stack frames (does not work for pSather yet)
-debug_source step through Sather code (instead of C code)
-debug_C -pretty step through C code (Sather line numbers are embedded in C comments). "-pretty" improves the readability of the generated code.

Command Summary for -debug and -debug_graphical for use in gdb

Functions ending with a G are graphical versions of the standard function.
Function   Argument      Example       Purpose
PT/PTG     &FF           p PT(&FF)     Prints the stack trace. The graphical version shows the stack trace as a graphical object in which each line indicates a frame.
PT/PTG     pFF           p PT(pFF)     Use this form if you are in an unbox function, i.e. one that unboxes a value and calls the real function.
PF/PFG     &FF           p PF(&FF)     Prints the current frame. In the graphical version, each line indicates a local.
PO/POG     object name   p PO(self)    Prints the object, or graphically displays it, one attribute per line.

Summary of Graphical Commands for -debug_graphical

Single click on object component: Make referenced object visible
Double click on object tag: Hide object
Drag on object tag: Move the object
Third button click on object component: Display component details in lower window
Resume button: Transfer control back to gdb
Refresh button: Re-fetch and draw all currently visible objects

Compiling your program for debugging

You have several choices; each has its advantages and disadvantages.

-debug_source versus -debug_C

The first flag enables you to step through your Sather program and set breakpoints relative to the Sather source. The disadvantage is that you won't see the generated C source code, so you cannot inspect intermediate results, as you don't know the names of the locals involved. Moreover, sometimes you will get completely bogus lines (this happens, for example, if you use -g together with -pretty). It gets especially nasty if you use optimizations.

"-debug_C -pretty" on the other hand gives you in most cases the correct C location, and will inform you about the Sather source lines in C-comments. This is especially handy when you believe that the compiler produced bogus code.

-debug versus -debug_graphical

The first flag gives you the ability to print out Sather structures and function frames as text, and the other one will give you a graphical data structure browser. -debug slows the compilation process by about 5%, while -debug_graphical adds about 5% plus 1 minute (on my 4-processor Sparc 10). Both flags also add special classes that can be called from within your program to display data structures and function frames.

Both flags add a special function that comes in handy for setting breakpoints (see below). In both cases you will also get automatic backtraces whenever a fatal error occurs (bound error or void access), but only if you compiled the program with -check all.

-chk

This option ensures that all possible errors are caught by the program (such as array bound errors, attribute access to void, and, for pSather, deadlocks). You can also turn on checks for individual classes; please refer to the sacomp man page for more details.

Starting GDB

Starting the program under control of gdb

Sather

Start the program under gdb as you would do it for a C program. Example:
			gdb sather_program 
If you did not use -debug, but -debug_C you may have to add the code directory to get the C source files:
			dir sather_program.code 
You can set breakpoints, start the program and evaluate data structures.

pSather

If you run your program on only one cluster, you can use gdb the same way as for Sather, provided that your .gdbinit file has the following lines (they are only needed for Solaris):
			handle SIGWAITING nostop noprint
			handle SIGUSR1 nostop noprint
			handle SIGLWP nostop noprint

If you run more than one cluster, you should define the environment variable DEBUG_PSATHER and run X (and make sure that DISPLAY points to the correct screen). Your .gdbinit file should look as follows:
			handle SIGWAITING nostop noprint
			handle SIGUSR1 nostop noprint
			handle SIGLWP nostop noprint
			p debug_am=0
Now you can start your program directly (not under the control of gdb):
			sather_program
It will open up one xterm per cluster, and each xterm will be running gdb. Each cluster will be stopped in the middle of the initialization code. You can now set your breakpoints and restart each cluster with the 'cont' command (of course, you can add the breakpoints to your .gdbinit file and restart each cluster automatically by adding the cont command to the .gdbinit file). Unfortunately, gdb does not know about threads, so you cannot get a list of threads or see which thread waits for which lock.
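As a minimal sketch of the automated variant mentioned above, a .gdbinit along the following lines would set a breakpoint and restart each cluster without manual input. The breakpoint on sather_main is only an illustration, and the sketch assumes that the program's symbols are already known to gdb at the time the file is read, which may depend on how the gdb sessions are started.

```gdb
# Solaris thread signals: pass them through silently
handle SIGWAITING nostop noprint
handle SIGUSR1 nostop noprint
handle SIGLWP nostop noprint
p debug_am=0
# Illustrative breakpoint: stop at the start of the Sather main routine
break sather_main
# Resume the cluster, which was stopped in the initialization code
cont
```

Every cluster reads the same file, so each xterm should end up stopped at the same breakpoint.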

A sample gdb session

For those of you not familiar with standard gdb, here's a sample gdb session debugging the raw C, not using any of the tools mentioned in the rest of this document.
(gdb) info locals
Prints out all the locals currently visible.
(gdb) print self
Prints out the pointer to the current self, e.g.:
$16 = (struct FMULTIMAPSTRINT_struct *) 0x57ed8
(gdb) print *self
Prints out the value of self:
$17 = {header = {tag = 4, destroyed = 0}, hsize = 3, n_targets = 50,
  asize = 17, arr_part = {{t2 = 0x0, t1 = 0x0}}}
(gdb) print *self->arr_part@3
Prints out a string/array of three elements:
$15 = {"gar"}
(gdb) bt
Prints out the current contents of the stack.
(gdb) up
Goes up a level in the stack. You normally start out too low (in some
abort routine). Keep going up until you hit a Sather line.
(gdb) down
Goes down a stack level.
(gdb) break
Sets a breakpoint (easier to do from within Emacs).
(gdb) break sather_main
Sets a breakpoint at the Sather main.
(gdb) step
Goes into the next line/call.
(gdb) next
Goes through the next line/call.
(gdb) finish
Executes until the end of the current function.
(gdb) continue
Continues until the next breakpoint.

Running gdb under emacs

Emacs provides good support for debugging with gdb. Type M-x gdb in Emacs and supply the Sather executable. Going up the stack will take you to the Sather code as soon as you hit a Sather line. You can set breakpoints by hitting "C-x Space" in the Sather code. TAB does symbol name completion and ESC-? provides a list of completions (useful for finding mangled function and variable names). There is currently some work being done to provide name mangling under gdb.

Setting Breakpoints

GDB allows you to set breakpoints by specifying a file and line number, or by giving a function name.

In the first approach, if you compile with -debug_C you can use a C file name. If you compile with -debug, then you must use a Sather file name and line number, which is easily done with C-x Space in Emacs. However, this won't always work. In the case of either included or parametrized classes, a single Sather source line will correspond to multiple C source lines, one for every different class parametrization or every different way the code was included. Setting a breakpoint in the Sather source will arbitrarily choose *one* of these parametrizations or inclusions! Hence, you will appear to get erratic behavior, since the breakpoint will work in one context but not in another, i.e. for one parametrization but not the others. With such code, you are better off finding out the C function name, as described below, and setting the breakpoints on it. Alternatively, if the parametrized class calls into some other function that is not parametrized, you may set the breakpoint in the non-parametrized class and then go "up" to the parametrized code.

The second approach requires you to know the C function name. You can try to guess it: it starts with the class name, followed by an underscore and the function name, followed by all argument types and the return type. While entering the function name, you can press ESC-? to display a list of names that start with the characters you have already entered. There is in fact an easier way to get the function name, if you used -PO or -POG. If you call the function C() with a string, it will give you a list of all Sather functions that start with that name. Example:

	(gdb) p C("STR::") 
returns a list of all functions defined in STR (and that are used in your program). There is also the inverse function, called S(). It gives you the Sather name of a function, as in
	(gdb) p S(FSTR_create_FSTR) 
If you want to break at the very beginning of your code, you should break on sather_main. This is the C name of the main function in your MAIN class.
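Put together, finding a mangled name and breaking on it might look like the following sequence of gdb commands. This is only a sketch: it assumes that the program was compiled so that C() is available and that the class FSTR is used by the program, and FSTR_create_FSTR stands in for whichever name the listing actually shows. Because C() is a function inside your program, the program has to be running (stopped at a breakpoint) before gdb can call it.

```gdb
# Stop at the very beginning of the Sather code
break sather_main
run
# List the C names of the functions in FSTR that the program uses
p C("FSTR::")
# Break on one of the listed names (placeholder name)
break FSTR_create_FSTR
cont
```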

Viewing Commands

Backtrace

If you stop your program while it runs under gdb (either with a breakpoint, a signal or a watchpoint), you can get a list of the stack frames by typing
	(gdb) bt  
Unfortunately, this gives you only the C function names. To get Sather function names, you have to compile the program with -PO or -POG. In this case, you can type
	(gdb) p PT(&FF)
to get an ascii listing, and
	(gdb) p PTG(&FF) 
to start the graphical data browser (T stands for 'Trace', G for 'Graphically' and FF for 'Function Frame').

Several caveats

This assumes that the local variable FF has been initialized correctly. If you see the C code, you have to step until you see the code
		void *_local_frame...... 
Once you have passed this line, you are safe. If you see Sather code, you should be safe once you have stepped past the 'is' of the function definition. If you call one of these functions too early, your program gets a segmentation fault. However, the print code knows about this problem, and you can recover by 'cont'inuing your program in gdb.

If you are in an 'unbox' function, you have to say

		(gdb) p PT(pFF) 
or
		(gdb) p PTG(pFF) 
or, even better, single step until the actual function is called. You know that you are in an unbox function if you get a 'No symbol "FF" in current context.' error from gdb.
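If you type these commands often, gdb's define command lets you wrap them in user-defined commands, for example in your .gdbinit. The command names sat and usat below are made up for this sketch; the caveats about FF and pFF described above apply unchanged.

```gdb
# 'sat': Sather stack trace for an ordinary function (needs a valid FF)
define sat
  print PT(&FF)
end
document sat
Print the Sather stack trace using PT(&FF).
end
# 'usat': the same for an unbox function, where only pFF exists
define usat
  print PT(pFF)
end
document usat
Print the Sather stack trace using PT(pFF).
end
```

After this, typing sat at the gdb prompt is equivalent to typing p PT(&FF), and help sat shows the one-line description.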

Function Frames

If you need to know all locals and attributes of a function frame, you can either click on the corresponding function frame in the Graphical Data Browser, or type
	(gdb) p PF(&FF) 
The caveats mentioned above for PT apply here too.

Displaying Objects

Whenever you know the address of a reference object, you can display it either graphically or as a text dump. Use the following command in gdb:
	(gdb) p PO(object) 
or
	(gdb) p POG(object) 
If you want to display a value type, you have to know its class tag. You can get it by executing
	(gdb) p TAG("classname")
Now you can use
	(gdb) p POV(&object,tag)
This does not work for the graphical browser though (for various reasons).
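The two steps for value types can be chained with a gdb convenience variable, so that the tag does not have to be copied by hand. In this sketch, $tag is an arbitrary convenience-variable name, and "classname" and object are the same placeholders as in the commands above.

```gdb
# Look up the class tag once and remember it
set $tag = TAG("classname")
# Text dump of the value object: pass its address and the saved tag
p POV(&object, $tag)
```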

Variables that control object display

There are several variables that can be set to change the way objects are displayed. Most of them affect only the text display. Setting a variable in gdb:
set print_depth = 2 
Printing a variable in gdb:
 p print_depth 
In the table below note that the variable values follow the C convention, where 1 indicates true and 0 indicates false.
Variable               Default  Applies to  Description
print_pointer          0        PO          Prints the C pointer of reference objects
print_id               1        PO          Prints the ID of Sather reference objects
print_type             0        PO          Prints the Sather type name of attributes
print_c_type           0        PO          Prints the C type name of objects
print_attr             1        PO          Prints the Sather attribute name
print_c_attr           0        PO          Prints the C attribute name
print_real             1        PO          Prints the Sather type name of objects (same as print_type, unless the attribute is abstract. Setting both to one will print one type unless the attr type and object type are different).
print_c_real           1        PO          Prints real C types
print_void             0        PO/POG      Prints void elements of arrays
print_index            10       PO          The number of array elements to print
print_gdb              1        PO          Prints the command to display data in gdb. This is handy if your terminal supports cut and paste with a mouse.
print_depth            1        PO          Depth of the tree to be printed
print_str_len          80       PO          The number of characters to be printed for strings
print_func             20       PO/POG      Number of functions to print for PF/PFG
print_lines            15       POG         Number of lines to print per object (if an object has more attributes, you can open several windows per object, each one with print_lines lines)
print_declared_source  1        PO          Prints the file and line where an attribute, variable or function has been declared
print_class_source     0        PO          Prints the file and line number of the class definition of attributes and variables
print_short_source     1        PO/POG      Prints only the file name of Sather files, not their path. If off, it will print the full path as used by the Sather compiler.
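Several of these variables can be changed at once with a small user-defined gdb command. The sketch below uses gdb's "set var" form, which the gdb documentation recommends whenever a program variable's name could be mistaken for one of gdb's own "set" subcommands. The command name po_verbose and the chosen values are only examples. Since the variables live in your program, run the command after the program has been started.

```gdb
# 'po_verbose': print the object tree two levels deep, show the Sather
# type name of each attribute, and print up to 25 array elements
define po_verbose
  set var print_depth = 2
  set var print_type = 1
  set var print_index = 25
end
```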

Using the Graphical Interface

When using the graphical interface, control must alternate between gdb and the interface. As soon as a graphical command is issued (POG, PFG or PTG) control is transferred to the user interface. You can then manipulate anything within the user interface as described below.

Resume: When you are done, you must transfer control back to the program running under gdb by hitting the "Resume" button.

Refresh: The next time you come back to the user interface by calling one of the graphical functions, you can use the "Refresh" button to update the previously displayed objects. BEWARE! If some objects are no longer valid (often true of stack frames), things can and probably will break.

Quit: Terminates the GUI process.

The graphical interface shows each object in its own window, with one component per line:
Object Type      Invoked By   Components
Stack Trace      PTG          Function frames
Function Frame   PFG          Local variables
Sather Object    POG          Object attributes
At the top of the window is a "tag" which indicates the object type.

If a component is a value type, its value will be displayed in place. If the component is a reference pointer there are 2 cases:

The color of the stub or arrow indicates the type of the field, as described below. An object may be hidden by double clicking on the object "tag".

An object may be moved around by dragging on its "tag" with the middle mouse button. It is frequently useful to move around objects to get reasonable placement and display of arrows.

Displaying Large Objects

Large objects are displayed a few components at a time. At the bottom of the object there will be a piece of red text, "Next part of object". Clicking on this will reveal the next section of the object.

Viewing a component in detail

Clicking on any object component with the third mouse button will display details about the component in the lower window. The first two lines will indicate the component's name and value. The following lines indicate the

Color Scheme

Red indicates auxiliary information that is not part of the object. Bluish colors code for arguments and attributes, purplish colors for locals.

Arguments/Attributes
Value types                                   dark slate blue
Built-in                                      royal blue
Reference, bound routines, external objects   blue

Locals have a similar color scheme in reddish colors
Value types                                   medium violet red
Built-in                                      purple
Reference, bound routines, external objects   magenta


Calling PO/POG in Sather

If you compile your program with -PO or -POG, you can use a special Sather class named PO to change any of the variables defined above and to display objects and function frames.

To change variables, use for example

	PO::print_depth:=2 
Note that each cluster has its own copy of those variables, and if you want to change them for all clusters, you need a loop like
	loop
	   cl::=clusters!;
	   PO::print_depth(2)@cl;
	end;
To display objects textually or graphically, use
	PO::PO(object)   and   PO::POG(object)
(This works for value types too). If you want to use a special print_depth for just one object, you can say
	PO::PO(object,print_depth)
To print a function frame, use
	PO::PF   and   PO::PFG 
and a back trace can be displayed with
	PO::PT   and   PO::PTG 
If you prefer to get a string back, you can say
	s::=PO::PO(object) 
This works for PF and PT too.