W2601 Debugging First Steps

Within these castle walls be forged Mavens of Computer Science ...

— Merlin, The Coder

Prerequisites

W1251 Parameter Passing

Introduction

This tutorial is intended to familiarize you with using the LLVM debugger, lldb, a command-line debugger.

Research

Read Debugging
Read Call Stack

Experiment

What is Debugging?

A debugger is a utility program that allows you to run a program under development while controlling its execution and examining the internal values of variables. We think of a program running "inside" a debugger. The debugger allows us to control the execution of the program by pausing its execution and then resuming it. While paused, we can find out where we are in the program, what values variables have, reset the values of variables, etc. If a program crashes, the debugger can tell you exactly where the program crashed.

Note that by definition, a debugger cannot help us find compile-time errors, because these occur prior to the successful production of a program's binary representation. It is only after we successfully compile that we are able to use a debugger.
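To make the distinction concrete, here is a minimal sketch (the helper name element(at:in:) and the values are invented for this illustration and are not part of the tutorial's program). The commented-out line is the kind of mistake the compiler rejects, while the out-of-range index is only discovered when the program runs. The helper reports the problem rather than crashing, so the sketch can be run safely.

```swift
// Compile-time error: uncommenting the next line makes swiftc reject the
// program, so no executable exists for a debugger to run.
// let count: Int = "three"

// Hypothetical helper for this sketch only: returns nil instead of crashing
// when the index is outside the array.
func element(at index: Int, in values: [Int]) -> Int? {
    return values.indices.contains(index) ? values[index] : nil
}

let values = [10, 20, 30]
let index = values.count   // one past the last valid index; the compiler does not flag this

if let value = element(at: index, in: values) {
    print("Value:", value)
} else {
    print("Run-time problem: index", index, "is out of range")
}
```

Had the program used values[index] directly, it would have compiled without complaint and then crashed while running, which is exactly the kind of problem a debugger helps locate.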

All computer scientists should learn the basics of debugging and how to use a debugger. It will save you literally hours of time when finding and fixing problems in your programs. The few minutes of investment you put into learning how to use a debugger will pay off tremendously in a matter of weeks. Work smart!

Sample Program

Use emacs to create the following sample program. Save it as "main.swift".

func division(dividend:Int, divisor:Int) -> Int {
    let quotient =  dividend / divisor
    return quotient
}

// This program will print the quotient of a division operation
func main() {
    var dividend : Int = 0
    var divisor : Int = 0

    dividend = dividend + 114
    divisor = divisor * 2

    let quotient = division(dividend:dividend, divisor:divisor)
    print("The quotient is: ", quotient)
}

// Invoke main
main()

Put emacs to sleep; compile and execute the program.

swiftc main.swift
./main

What happens?

Preparing for Debugging

Programs normally have to be compiled with a special option to allow debugging to take place. For Swift, this is the -g option. Before proceeding, examine the size of your "main" executable file:

ls -l

NOTE: Those are lower case "L"s, not the number "1".

Write down the size of the file. Now, compile with the debugging option:

swiftc -g main.swift

Again, examine the size of the new file that was produced with the debugging option. Write down the size of this file and compare it to the version without the debugging information.

Observe, Ponder, and Journal: Section 1

Is the new file smaller or larger than the original?
Why?

The -g option causes the compiler to include information about the source file (main.swift) that is needed for debugging as part of the executable file. This causes the executable to be larger in size, and slightly slower, but allows for debugging. So when you run the debugger, you specify the executable file (not the source file) as the input to the debugger.

Starting the Debugger

Start the debugger by typing:

lldb main

Once in lldb, use the run command to start your program running. It will run until it completes, until it crashes, or until it reaches a breakpoint that you set (more on this later) -- and it will pause for input, of course. Once it finishes, you're still in lldb, so you can run it again from the beginning.

If your program requires command-line arguments, you can provide them after the run command. If you would normally run the program on the command line by entering prog1 100 test1.dat, then in the debugger you would enter run 100 test1.dat. Note, however, that the main.swift we are working with here does not need any command-line arguments, so just type:

run
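For a program that does take arguments, the following minimal sketch shows where they end up. It is not part of main.swift; describe(arguments:) is a made-up helper, and prog1, 100, and test1.dat are just the example names used above. CommandLine.arguments is the standard Swift way to read them.

```swift
// Sketch only: a stand-in for the hypothetical prog1 mentioned in the text.
// The first element of CommandLine.arguments is the program name;
// the remaining elements are the arguments supplied by the user.
func describe(arguments: [String]) -> String {
    let userArguments = arguments.dropFirst()
    if userArguments.isEmpty {
        return "No arguments given"
    }
    return "Arguments: " + userArguments.joined(separator: ", ")
}

print(describe(arguments: CommandLine.arguments))
```

Assuming this were compiled as prog1 and loaded into lldb, entering run 100 test1.dat should hand the program the same arguments as typing ./prog1 100 test1.dat in the shell.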

Where Am I? Where Did It Crash?

One of the most frustrating things about a crashing program is that it normally gives little useful information, often just a terse message such as 'segmentation fault'. Part of the reason is that, by default, the executable file doesn't include the information about the source code (such as the source file name and line number) needed to print a helpful error message. But when you run a program inside a debugger, you can easily see what the current line is when the program crashes. Type f to see the current and surrounding lines. How useful this is depends on where you are in the code. It may be significantly less useful if your program crashed deep inside a library, where you may only be able to view some cryptic text. But don't despair. We can walk up through the frames that invoked the procedure until we reach our own familiar code.

More usefully, you can see the list of function calls that led to this point in your program. Your program may have died deep inside a function that is called many times, and you need to know which sequence of nested function calls led to the failure. Type bt (short for backtrace) to show this list. Examine the list closely. The frame at the bottom is the one that began execution of your program. Each frame above it was created by the frame below, often through a function invocation. Thus, the frame listed at the top is the most recent, or "deepest".

Helpful Hint

This command, bt, is one of the most important and useful debugging commands you'll see in this lesson.

bt

When a program stops, you can examine local variables, view lines of code, etc. that are local to that function. If you need to move up to the function that called this one, you need to move up to the higher frame using the up command to debug there. The down command moves you back down a frame towards where you started. The up and down commands let you move up and down the calling stack (of nested function calls) so you can issue debug commands about the function that's "active" at each level.

Type up until you see code that you recognize:

up

Helpful Hint

If you need to enter the same command again and again, you can just press the RETURN key instead.

Observe, Ponder, and Journal: Section 2

What is the role of the frames beyond that which you recognize?

Displaying Variables and Expressions

Use the f command again to display the source lines around the current location. Note the name of the function, division, and the names of its parameters, dividend and divisor. We can examine the values of constants and variables with the print command (or just p for short). Try:

print dividend
p dividend
p divisor

Observe, Ponder, and Journal: Section 3

What happens if you try p division?
Why?

To see a list of all variables defined in the frame, you can use:

frame variable

Making your Program Pause: Breakpoints

One of the most fundamental things you want to do while debugging is to make the program pause at a particular line or at the start of a function. These locations in a program where execution pauses are called "breakpoints."

Helpful Hint

Breakpoints must be placed in your code at a location where code actually executes. (For example, they can't be placed on blank lines or comments.)

There are several different ways that we can specify the location for a breakpoint:

By function name: breakpoint set --name function_name
- Try it now with breakpoint set --name division
By line number: breakpoint set --line line_number
- Sometimes it is necessary to specify a file_name in addition to the line_number: breakpoint set --file file_name --line line_number
- Try it now with breakpoint set --line 3

To view a list of current breakpoints, we can enter:

breakpoint list

Let's set one more breakpoint with breakpoint set --name main.

Observe, Ponder, and Journal: Section 4

What happened? Was exactly one breakpoint added? If not, why not?
On what line number was the first breakpoint added for the definition of the main function? Why?

View the current list of breakpoints. To remove a breakpoint, pass its number from this list to the delete command: breakpoint delete breakpoint_number. Try it now by specifying the number of the breakpoint for the invocation of the main function. (It will contain text similar to: "main.swift:19") Note: In this case the breakpoint is disabled rather than deleted.

Viewing Source Code

Source code may be viewed with the list command. For example, we can specify a file name and a line number using the syntax list file_name:line_number. Try it now:

list main.swift:1

Controlling Execution After a Breakpoint

Let's re-run the program with the run command. lldb will ask for verification before proceeding. This time, rather than crash, the program stops at line 8. This is because we set a breakpoint in main, and line 8 is the first line inside main.

After you make your program pause, you may want to execute it line-by-line to see what it does next. There are two commands that make a program execute the next line and then pause again: next and step. The difference between these two is how they behave when the program reaches a function call. The step command steps into that function; in other words, you see the debugging session move into the called function. The next command steps over that function call, and you see the current line as the one after the function call. Both are useful, depending on what level of detail you need.
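The difference is easiest to see on a small program. The sketch below is hypothetical and separate from main.swift (square and total are names invented for this illustration). The comments describe where each command would be expected to pause, assuming a breakpoint on the first line inside total.

```swift
// Hypothetical example for this sketch; not part of main.swift.
func square(_ value: Int) -> Int {
    let result = value * value    // step from the call in total() pauses here
    return result
}

func total() -> Int {
    let first = square(3)         // paused here: step enters square, next runs it and moves on
    let second = square(4)        // next from the line above pauses here
    return first + second
}

print("Total:", total())
```

In short, step is the choice when the called function itself is under suspicion, and next is the choice when you trust the function and only care about its result.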

Continue executing the next command until the program crashes again. Pay close attention to the last line executed before the crash. Remember that you can press <ENTER> to repeat the last command issued.

Sometimes, after you've hit a breakpoint and are doing line-by-line execution, you want to resume normal execution until the next breakpoint is reached (or the program completes). The command to do this is continue. A useful variant is the finish command, which finishes executing the current function and then pauses.

Exiting the Debugger

The command exit or quit can be used to exit the debugger.

Key Concepts

lldb is a command-line debugger
A debugger is a utility program that allows you to run a program under development while controlling its execution and examining the internal values of variables.
Command-line arguments may be specified within lldb immediately after the run command
A breakpoint is a location in your program where execution pauses
A call stack is an ordered collection of all stack frames which are active (because they have not yet terminated)
A stack frame is the collection of all data on the stack associated with one subprogram call and generally includes the return address, argument variables passed on the stack, local variables, and saved copies of any registers that need to be restored. Included in the frame:
- return address: the location in the calling function at which execution should continue once the invocation of the callee is completed
- local data storage: memory for locally scoped variables
- arguments: memory for the arguments being passed from the caller to the callee
- current instance pointer: the object being referenced (if any)
- preserved registers: the original values of registers which may be clobbered by the callee
- enclosing subroutine context: a pointer to the enclosing stack frame in the event of a nested function
The top of the stack is pointed to by the processor's stack pointer and may change throughout the execution of a function
The frame pointer is a reference to the currently active frame and will not change during the execution of a function (unless another function is called, in which case it will be saved and later restored). The frame pointer provides a consistent reference to variables (e.g. arguments and locals) within the function.
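
The call stack and stack frame concepts above can be sketched as a toy model. This is only an illustration under simplifying assumptions: Frame and CallStack are invented types, and real frames are laid out in memory by the compiler and also hold return addresses and saved registers, which this model leaves out.

```swift
// A toy model of a call stack; the type and method names are hypothetical.
struct Frame {
    let function: String          // name of the subprogram this frame belongs to
    var locals: [String: Int]     // arguments and locally scoped variables
}

struct CallStack {
    var frames: [Frame] = []

    // Calling a function pushes a new frame on top of the stack.
    mutating func call(_ function: String, locals: [String: Int] = [:]) {
        frames.append(Frame(function: function, locals: locals))
    }

    // Returning from a function pops its frame off the stack.
    mutating func returnFromCurrent() {
        _ = frames.popLast()
    }

    // Lists the most recent frame first, similar in spirit to what bt shows.
    func backtrace() -> [String] {
        return frames.reversed().enumerated().map { index, frame in
            "frame #\(index): \(frame.function)"
        }
    }
}

var stack = CallStack()
stack.call("main")
stack.call("division", locals: ["dividend": 114, "divisor": 0])
for line in stack.backtrace() {
    print(line)
}
```

In this model, moving with up corresponds to looking at a higher-numbered entry in the list, that is, at the caller of the frame currently selected.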

Exercises

J2601 Create a journal and answer all questions in this experience. Be sure to include all sections of the journal, properly formatted.

References

This tutorial is adapted from LLDB (University of Virginia)
LLDB Command Summary Page
Stack Frames (CMU)