CS 50 Software Design and Implementation

Lecture 8

gdb – Debugging buggy code with the GNU debugger

In the this lecture, we will further demonstrate the use of gdb. People write buggy code. Many students just use printf to debug their code – this is a mistake. Why? Because it’s not a smart way to debug, nor can you find difficult bugs using printf. So to need a more powerful tool; that’s gdb. Start using it now.

In this lecture on debugging we draw from “Linux Programming” by Matthew and Stones in terms of the material and coding examples from their chapter on debugging. References and pointers to these books can be found on the course webpage.

Here is a quick reference for the most common and useful gdb commands

BTW, you can skip these notes if you write bug free code ;-)

Goals

We plan to learn the following from today’s lecture:

The essential commands you need to get by

You need to become familiar with a handful of commands; here is the top 10, OK 13:

help – list commands
run – start your program [with arglist if needed]
list – show lines of source
break – set a break point in your code
next – execute next line, stepping over function calls
step – execute next line, stepping into function calls
print – print the contents of variables and data structures
frame – select frame number, point to the stack you are interested in
backtrace – display program stack and control flow
command – execute GDB command-list every time breakpoint n is reached
cont – continue the execution of the code
quit – quit the debugger

Note, we use the next command in the notes below; next will “step over” functions rather than into them (e.g., typically you would not want to step into the fgets() function). If you want to step into one of your own functions in the control flow of your code then use the step command instead of next.

A case study in debugging buggy_sort.c using the gdb debugger

We use the intentionally buggy buggy_sort.c code from “Linux Programming” by Mathew and Stones. We use gdb in a series of debugging moves that includes set breakpoints, commands, and patching running code to verify fixes to three bugs we encounter. This is a simple example that builts on our use of gdb in the course.

We study three classes of bugs: uninitialized variables, boundary/edge condition bugs and incorrect coding. We find the bugs using different techniques, fix and verify the changes.

The original sort code buggy_sort0.c looks harmless at first glance but pesky bugs lurk. Running the compiled buggy_sort0.c shows nothing!

Let’s instrument the the code to print the main data structure before and after sort runs. To do this we add -DDEBUG code to look at data structures before and after the sort is called buggy_sort1.c . Now we can see from the print outs that the sort does not work. This is our first evidence that things are not hunky-dory, an english term for OK.

We use gdb to set a breakpoint at sort and the step through. The first thing we notice is that the sort fuction does not process the list and fall out of the for loop first time through. Taking a look at the for loop statement more closely we see that it relies on a variable “s” which is not initialized. Note, s could have any value if not initialized. Theefore, could do different things on different machines. We fix this bug by initialising s=1 in buggy_sort2.c

We know that bugs happend at boundary conditions and the there is only one main data structure. Bugs lurk at edge cases. We make the arrays much larger and discover a segment fault lurking. We make the change to the main data structure in buggy_sort2.c. We use gdb to find the second bug and fixed it - for (j = 0; j < n - 1; j++) - n changed n-1. We make the changes to buggy_sort3.c

The final bug is more tricky. We run the code again and note that the sort almost works. We set a break point in sort and look at the results each time the break point is set. It is clear that the loop terminates early. We find the fix and instead of recompiling we use patching of the code. Found and patched the second fix for // n–; commented this line out. buggy_sort3.c

The good code (well, working code) is buggy_sort4.c

We are going to compile and run the code and look at the output:

First bug (buggy_sort1.c): Uninitialized variables

We add conditional code to the buggy_sort1.c so it prints out the result of calling sort(). To get the compiler to compile the conditional code we pass the -DEDUG flag in the compiliation line below. Also, notice we compile with the debugger using the “-g” flag, you can use -ggdb to.


We put the debug code to view the structure needing sorted before and after sort is called:

[atc@dhcp-212-213 l17] mygcc -g -DDEBUG -o bug buggy_sort1.c
[atc@dhcp-212-213 l17] ./bug
Test of Bubble Sort (bad code) Compiled: May  8 2008 at 01:54:58
This is line 63 of file buggy_sort2.c
TEST: Before sort array[0] = {Bill, 62}
TEST: Before sort array[1] = {Hill, 60}
TEST: Before sort array[2] = {Barack, 42}
TEST: Before sort array[3] = {John, 105}
TEST: Before sort array[4] = {W., 1}
TEST: After sort array[0] = {Barack, 42}
TEST: After sort array[1] = {W., 1}
TEST: After sort array[2] = {Hill, 60}
TEST: After sort array[3] = {Bill, 62}
TEST: After sort array[4] = {John, 105}

Clearly the sort does not work.

Let’s run gdb and check the flow.

$ mygcc -g -DDEBUG -o bug buggy_sort1.c
$ gdb bug

(gdb) list sort
18// Can you find the bugs?
19
20// FIRST BUG: Fix for unitialized variable s -- set to 1
21
22void sort(item *a, int n)
23{
24      int i = 0, j = 0;
25      int s = 1; // fix bug S now initialized to non zero
26
27      for(; i < n && s != 0; i++) {
(gdb) break sort
Breakpoint 1 at 0x40050f: file buggy_sort1.c, line 24.
(gdb) run

Breakpoint 1, sort (a=0x600ba0, n=5) at buggy_sort1.c:22
22      int i = 0, j = 0;
(gdb) n
25      for(; i < n && s != 0; i++) {
(gdb) n
  37}
(gdb) print s
$1 = 0
# s is uninitialized needs to be s = 1 in code

Second bug (buggy_sort2.c): Increase array size to see if bugs at boundary/edges

Bugs happen at boundaries. Writing one byte beyond a byte array will not be detected by the runtime system but if we increase the size of the allocation we might highlight bugs.

Reading a byte beyond a byte array in this example lets the code run somewhat operationally normally. Errors occur at boundary conditions - i.e., at the edge of the data structure. So lets increase the size of the data structure such that an illegal access will show up. We change data to a 4K array buffer.

We “define SIZE 4096” in the code below.

Any access beyond this boundary will show up at execution.


// File: buggy_sort2.c
//
// This is a buggy bubble sort with minor modifications from ‘‘Linux Programming’’
// by Mathew and Stones
// We will debug this code in class using the gdb debugger

// No let’s force the bug out into the open - but how?

#include <stdio.h>

// Reading a byte beyond a byte array in this example lets the code
// run somewhat operationally normally. Errors occur at boundary
// conditions - i.e., at the edge of the data structure. So lets
// increase the size of the data structure such that an illegal
// access will show up. Let’s change data to a 4K array buffer.
// Any access beyond this boundary will show up.

#define SIZE 4096

typedef struct {
     char data[SIZE];
     int key;
} item;

item array[] = {
     {"Bill", 62},
     {"Hill", 60},
     {"Barack", 42},
     {"John", 105},
     {"W.", 1},
};


[campbell@spruce cs23] mygcc -o debug -DDEBUG buggy_sort1.c

[campbell@spruce cs23] ./debug

Test of Bubble Sort (bad code) Compiled: May  8 2008 at 02:11:05
This is line 55 of file buggy_sort2.c
TEST: Before sort array[0] = {Bill, 62}
TEST: Before sort array[1] = {Hill, 60}
TEST: Before sort array[2] = {Barcak, 42}
TEST: Before sort array[3] = {John, 105}
TEST: Before sort array[4] = {W., 1}
Segmentation fault

We have a segment fault - we have the bug out in the open!

buggy_sort2.c: segment fault, say what!

We use code inspection and instrumentation and gdb to find the bugs.

We have a seg. fault. But where in the code?


First compile with the debugger using the ‘‘-g’’ flag, you can use -ggdb to.

mygcc -g -o bug -DDEBUG buggy_sort2.c

Now run the debugger

[campbell@spruce cs23] gdb bug
GNU gdb Red Hat Linux (6.6-45.fc8rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Using host libthread_db library "/lib64/libthread_db.so.1".
(gdb) help

List of classes of commands:

aliases -- Aliases of other commands
breakpoints -- Making program stop at certain points
data -- Examining data
files -- Specifying and examining files
internals -- Maintenance commands
obscure -- Obscure features
running -- Running the program
stack -- Examining the stack
status -- Status inquiries
support -- Support facilities
tracepoints -- Tracing of program execution without stopping the program
user-defined -- User-defined commands

Now we run the code in the debugger and the seg fault provides glues.

(gdb) run
Starting program: /net/nusers/campbell/public_html/cs23/a.out
Test of Bubble Sort (bad code) Compiled: Feb 15 2011 at 12:47:11
This is line 65 of file buggy_sort2.c
TEST: Before sort array[0] = {Bill, 62}
TEST: Before sort array[1] = {Hill, 60}
TEST: Before sort array[2] = {Barack, 42}
TEST: Before sort array[3] = {Dicky, 105}
TEST: Before sort array[4] = {W., 1}

Program received signal SIGSEGV, Segmentation fault.
0x0000000000400588 in sort (a=0x600d40, n=5) at buggy_sort2.c:43
43                       if(a[j].key > a[j+1].key) {
(gdb)

The seg. fault happens around line 43 in the file  buggy_sort2

backtrace

The backtrace command helps debug fault segments. It provides the path through the code prior to the point when the segment fault occurs.


(gdb) list 43
  38      int s = 1;  // fix for uninitialized s in code
  39
  40      for(; i < n && s != 0; i++) {
  41               s = 0;
  42              for(j = 0; j < n; j++) {
  43                       if(a[j].key > a[j+1].key) {
  44                               item t = a[j];
  45                               a[j] = a[j+1];
  46                               a[j+1] = t;
  47                               s++;

(gdb) backtrace
#0  0x0000000000400588 in sort (a=0x600d40, n=5) at buggy_sort2.c:43
#1  0x0000000000400815 in main () at buggy_sort2.c:71

# we could move around using frame if needed.

(gdb) info frame
Stack level 0, frame at 0x7fffffffe390:
 rip = 0x400588 in sort (buggy_sort2.c:43); saved rip 0x400815
 called by frame at 0x7fffffffe3c0
 source language c.
 Arglist at 0x7fffffffe380, args: a=0x600d40, n=5
 Locals at 0x7fffffffe380, Previous frame’s sp is 0x7fffffffe390
 Saved registers:
  rbx at 0x7fffffffe378, rbp at 0x7fffffffe380, rip at 0x7fffffffe388


# Note, as in the introduction to C notes we use the frame command.
# If you get a segfault and try and look at variables you might get
# something like ‘‘No symbol  in current context’’. In this case
# you need to set the frame to the right value to point to the
# right stack frame. An example, is show below.

(gdb) print s
$1 = 3
(gdb) frame 1
#1  0x0000000000400815 in main () at buggy_sort2.c:71
  71    sort(array,5);
(gdb) print array[0]
$2 = {data = ‘‘Hill’’, ’\000’ <repeats 4091 times>, key = 60}
(gdb) print s
No symbol ‘‘s’’ in current context.
(gdb) info frame
Stack level 1, frame at 0x7fffffffe3c0:
 rip = 0x400815 in main (buggy_sort2.c:71); saved rip 0x311241ee5d
 caller of frame at 0x7fffffffe390
 source language c.
 Arglist at 0x7fffffffe3b0, args:
 Locals at 0x7fffffffe3b0, Previous frame’s sp is 0x7fffffffe3c0
 Saved registers:
  rbx at 0x7fffffffe3a8, rbp at 0x7fffffffe3b0, rip at 0x7fffffffe3b8
(gdb) #need to switch back to the right frame stack
(gdb) frame 0
#0  0x0000000000400588 in sort (a=0x600d40, n=5) at buggy_sort2.c:43
  43                       if(a[j].key > a[j+1].key) {
(gdb) print n
$3 = 5
(gdb) print j
$4 = 4
(gdb)

This is a short program to the backtrace is very short. We have not called many functions. The sort is called with sort (a=0x600ca0, n=5) at buggy_sort2.c:41. If the code had a more complex trace we would be able to dig into the details and find out the control flow of the code which would be use in getting to the errors.


The offending line is:

43                             if(a[j].key > a[j+1].key) {

(gdb) print j
$1 = 4

So j = 4 !

But the array is one 0-4 elements large. But j+1 = 5! making an access
to a[5] which clearly is a access memory beyond the array. Because we
increased the size of the data element to 1024 bytes then this access exceeds
the data structure array by a significant amount and not just a byte, as
would be the case with the first version of
buggy sort.

The offending line is in the loop:

42                    for(j = 0; j < n; j++) {

The fix is:

42                    for(j = 0; j < n - 1; j++) {

you can print the structure for example

print a[0]@5
2 = {{data = "Hill", ’\0’ <repeats 4091 times>, key = 60}, {data = "Barcak", ’\0’
<repeats 4089 times>, key = 42}, {data = "Bill", ’\0’ <repeats 4091 times>, key = 62}, {
data = "W.", ’\0’ <repeats 4093 times>, key = 1}, {data = "John", ’\0’ <repeats
4091 times>, key = 105}}

Third Bug (buggy_sort3.c): Superfluous, but troublesome line in the code

OK. Let’s but the fix in above into buggy_sort3.c and run it again. Hopefully, that was the last pesky bug...




$ mygcc -g -DDEBUG -o bug buggy_sort3.c
$ ./bug
Test of Bubble Sort (bad code) Compiled: Apr 17 2012 at 12:09:14
This is line 65 of file buggy_sort3.c
TEST: Before sort array[0] = {Bill, 62}
TEST: Before sort array[1] = {Hill, 60}
TEST: Before sort array[2] = {Barack, 42}
TEST: Before sort array[3] = {John, 105}
TEST: Before sort array[4] = {W., 1}
TEST: After sort array[0] = {Barack, 42}
TEST: After sort array[1] = {W., 1}
TEST: After sort array[2] = {Hill, 60}
TEST: After sort array[3] = {Bill, 62}
TEST: After sort array[4] = {John, 105}

# nope did not work. Let’s run the debugger again

$ gdb bug

(gdb) help breakpoint
Making program stop at certain points.

List of commands:

awatch -- Set a watchpoint for an expression
break -- Set breakpoint at specified line or function
catch -- Set catchpoints to catch events
clear -- Clear breakpoint at specified line or function
commands -- Set commands to be executed when a breakpoint is hit
condition -- Specify breakpoint number N to break only if COND is true
delete -- Delete some breakpoints or auto-display expressions
delete breakpoints -- Delete some breakpoints or auto-display expressions

Lets set a break point at line 44 and look at variables.

(gdb) list 44
39void sort(item *a, int n)
40{
  41      int i = 0, j = 0;
  42      int s = 1;
  43
  44      for(; i < n && s != 0; i++) {
  45               s = 0;
  46              for(j = 0; j < n - 1; j++) {
  47                       if(a[j].key > a[j+1].key) {
  48                               item t = a[j];
(gdb)


Let’s set a breakpoint

45                     s = 0;

(gdb) break 45
Breakpoint 1 at 0x8048412: file buggy_sort3.c, line 45.
(gdb) run
Starting program: /net/nusers/campbell/public_html/cs50/bug
Test of Bubble Sort (bad code) Compiled: Feb  5 2010 at 02:03:00
This is line 63 of file buggy_sort3.c
TEST: Before sort array[0] = {Bill, 62}
TEST: Before sort array[1] = {Hill, 60}
TEST: Before sort array[2] = {Barack, 42}
TEST: Before sort array[3] = {John, 105}
TEST: Before sort array[4] = {W., 1}

Breakpoint 1, sort (a=0x80499a0, n=5) at buggy_sort3.c:45
45                s = 0;
Missing separate debuginfos, use: debuginfo-install glibc.i686
(gdb) cont
Continuing.

Breakpoint 1, sort (a=0x80499a0, n=4) at buggy_sort3.c:45
45                s = 0;
(gdb) cont
Continuing.

Breakpoint 1, sort (a=0x80499a0, n=3) at buggy_sort3.c:45
45                s = 0;
(gdb) cont
Continuing.
TEST: After sort array[0] = {Barack, 42}
TEST: After sort array[1] = {W., 1}
TEST: After sort array[2] = {Hill, 60}
TEST: After sort array[3] = {Bill, 62}
TEST: After sort array[4] = {John, 105}

Program exited normally.
(gdb)

The code completes without a seg fault now but it still does not work. The table is not ordered by the sort. But what we notice is that n=3 before it exits and the next breakpoint is not hit. Looking more closely at the code we see that the code does not execute the outer loop n times where n is 5. n is set to 3 along the way which is incorrect. Closer inspection of the code we can see that line 52 that is culprit (i.e., n–;). The input parameter n dictates how many times the inner and outter loop execute but n should not be decremented in this code because variables i and j are used to drive these loops, respectively:


(gdb) list 54
  49                               a[j] = a[j+1];
  50                               a[j+1] = t;
  51                               s++;
  52                       }
  53               }
  54              n--;
  55       }
  56}



We need to comment out this buggy line. Then the code will run.

Patching code

We can use patching to change the code without recompilation. This is very cool. It is like hot wiring cars.


$ gdb bug
GNU gdb Fedora (6.8-24.fc9)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(gdb) break 54
Breakpoint 1 at 0x8048563: file buggy_sort3.c, line 54.
(gdb) command 1
Type commands for when breakpoint 1 is hit, one per line.
End with a line saying just "end".
>set variable n=n+1
>cont
>end
(gdb)

(gdb) run
Starting program: /net/nusers/campbell/public_html/cs23/bug
Test of Bubble Sort (bad code) Compiled: Feb  5 2010 at 02:03:00
This is line 63 of file buggy_sort3.c
TEST: Before sort array[0] = {Bill, 62}
TEST: Before sort array[1] = {Hill, 60}
TEST: Before sort array[2] = {Barack, 42}
TEST: Before sort array[3] = {John, 105}
TEST: Before sort array[4] = {W., 1}

Breakpoint 1, sort (a=0x80499a0, n=5) at buggy_sort3.c:54
54               n--;

Breakpoint 1, sort (a=0x80499a0, n=5) at buggy_sort3.c:54
54               n--;

Breakpoint 1, sort (a=0x80499a0, n=5) at buggy_sort3.c:54
54               n--;

Breakpoint 1, sort (a=0x80499a0, n=5) at buggy_sort3.c:54
54               n--;

Breakpoint 1, sort (a=0x80499a0, n=5) at buggy_sort3.c:54
54               n--;
TEST: After sort array[0] = {W., 1}
TEST: After sort array[1] = {Barack, 42}
TEST: After sort array[2] = {Hill, 60}
TEST: After sort array[3] = {Bill, 62}
TEST: After sort array[4] = {John, 105}

Program exited normally.
Missing separate debuginfos, use: debuginfo-install glibc.i686
(gdb)


Our buggy sort is no longer buggy - now works great!

We put the fix in buggy_sort4.c

Other useful gdb commands are:

step - single step through a piece of code line by line

next - step “over” a routine that you do not want to step into (by using step)

We strongly recommend that you play with these various versions of buggy_sort and go through the notes and reproduce what we did above in a step by step manner; then you can apply these simple techniques in nailing your pesky buys.