CS31 Weekly Lab: Week 12 and 13

pthreads, cpuinfo, top, debugging pthreads programs

Create a week12 subdirectory in your weeklylab subdirectory and copy over some files:

    cd cs31/weeklylab
    pwd
    mkdir week12
    ls
    cd week12
    pwd
    cp ~newhall/public/cs31/week12/* .
    ls
    Makefile  deadlock.c  racecond.c  synch.c
Then type make to build the executable files.

example pthread synchronization primitives
The file synch.c contains some examples using pthreads mutex and barrier synchronization.

CPU information on machines

top (and htop) and threads
top and htop are Unix utilities that list information about processes and threads and how they are using resources like memory and CPU.

If you run top with no command line options, then it displays per-process statistics. If you run top with -H, top will display statistics for individual threads (if you run the synch program for a large number of threads, you can see them show up in top):

    top -H

htop has a slightly different display of these data, and includes per-CPU summary data. You can also sort the data on different fields using the function key (F6). Try running htop and try sorting the output on different fields:

    htop

Selecting what top displays

Top can display a lot of different information about running processes and threads. Start top again:
    top -H
While top is running, you can change what information it displays by typing f, and you should see something like this:
Fields Management for window 1:Def, whose current sort field is %CPU
Navigate with Up/Dn, Right selects for move then <Enter> or Left commits,
'd' or <Space> toggles display, 's' sets sort.  Use 'q' or <Esc> to end!

* PID     = Process Id          
* USER    = Effective User Name 
* PR      = Priority            
* NI      = Nice Value          
* VIRT    = Virtual Image (KiB) 
* RES     = Resident Size (KiB) 
* SHR     = Shared Memory (KiB) 
* S       = Process Status      
* %CPU    = CPU Usage           
* %MEM    = Memory Usage (RES)  
* TIME+   = CPU Time, hundredths
* COMMAND = Command Name/Line   
  PPID    = Parent Process pid  
  UID     = Effective User Id   
  RUID    = Real User Id        
...
  P       = Last Used Cpu (SMP) 
...
  nsUTS   = UTS namespace Inode 

The starred items are the fields currently displayed. To select different items or units for top to display, use the arrow keys to move to an item, then type d or space to toggle it. For example, to get top to print information about the last CPU each process or thread ran on, select the P option and hit return, and the top window will now have a new column P that lists this information:
 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+   P COMMAND          
 8249 newhall   20   0  153m 4488  472 R   47  0.0   0:02.78 6 gol              
 8250 newhall   20   0  153m 4488  472 R   47  0.0   0:02.76 2 gol              
 8236 newhall   20   0  153m 4488  472 R   46  0.0   0:02.77 6 gol              
 8237 newhall   20   0  153m 4488  472 S   46  0.0   0:02.77 4 gol              
 8243 newhall   20   0  153m 4488  472 R   46  0.0   0:02.76 1 gol              
 8239 newhall   20   0  153m 4488  472 S   46  0.0   0:02.76 7 gol              
 8240 newhall   20   0  153m 4488  472 R   46  0.0   0:02.76 5 gol              
 8244 newhall   20   0  153m 4488  472 R   46  0.0   0:02.72 2 gol              
 8251 newhall   20   0  153m 4488  472 R   46  0.0   0:02.78 4 gol              
Type q to exit top.

Let's run synch with a bunch of threads, and then top -H in another window to see what we can see.

debugging pthreads programs
Debugging threaded programs can be tricky because there are multiple streams of execution. In general, try to debug with as few threads as possible, and if you use printf, print out a thread id and call fflush(stdout) after each call so that the output appears immediately. You can also put printf calls inside conditional statements so that only one of the threads prints out information (or only some of the threads, or only some of the information, ...). For example, if each thread is passed a logical thread id value on start-up and stores it in a local variable named my_tid, then you could make logical thread 1 the only thread that prints debug output:
if(my_tid == 1) {
   printf("Tid:%d: value of count is now %d my i is %d\n", my_tid,count,i); 
   fflush(stdout);
}
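
To make the "passed a logical thread id on start-up" part concrete, here is a minimal sketch of one way to do it. The names tid_args, DEBUG_TID, debug_lines, and run_with_debug_thread are invented for this example, and count is a local variable here just to keep the sketch small. The detail to notice is that each thread gets a pointer to its own array slot; passing the address of the loop variable instead would let the value change before the thread reads it.

```c
#include <pthread.h>
#include <stdio.h>

#define MAX_THREADS 64
#define DEBUG_TID 1           /* hypothetical: the one logical thread that prints */

static int debug_lines = 0;   /* written only by the DEBUG_TID thread */

static void *worker(void *arg) {
    int my_tid = *(int *)arg;   /* copy the logical id into a local variable */
    int count = 0;
    for (int i = 0; i < 3; i++) {
        count += i;
        if (my_tid == DEBUG_TID) {
            printf("Tid:%d: value of count is now %d my i is %d\n",
                   my_tid, count, i);
            fflush(stdout);
            debug_lines++;
        }
    }
    return NULL;
}

/* Starts nthreads workers with logical ids 0..nthreads-1 and returns the
 * number of debug lines printed, or -1 if nthreads is out of range. */
int run_with_debug_thread(int nthreads) {
    pthread_t threads[MAX_THREADS];
    int tid_args[MAX_THREADS];
    if (nthreads < 1 || nthreads > MAX_THREADS) return -1;
    debug_lines = 0;
    for (int t = 0; t < nthreads; t++) {
        tid_args[t] = t;   /* each thread gets its own slot, not &t */
        pthread_create(&threads[t], NULL, worker, &tid_args[t]);
    }
    for (int t = 0; t < nthreads; t++) {
        pthread_join(threads[t], NULL);
    }
    return debug_lines;    /* safe to read: all threads have been joined */
}
```

With four threads, only logical thread 1 prints, so the output stays readable no matter how the threads are scheduled.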
gdb and pthreads
We are not going to look at this together in lab, but if you want to try using gdb to debug your pthread code, here is some general information about it and an example you can try out. More information about gdb and pthreads can be found here:
gdb and pthreads

gdb has support for debugging threaded programs. One thing to keep in mind as you debug pthreaded programs on our system is that there are at least three different identifiers for the same thread as you run it in gdb:

  1. the pthread library's id for the thread (its pthread_t value)
  2. the operating system's id for the thread (its LWP id value). This is used in part for the OS to keep track of this thread for scheduling purposes.
  3. the gdb id for the thread: this is the id you should use when specifying gdb commands for a single thread.
The correspondence between the threads can differ from one OS and pthread library implementation to another, but on our systems there is a one-to-one-to-one correspondence between a pthread id, an LWP id, and a gdb thread id.
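
If you want to see the first two ids from inside a program, the sketch below is one way to do it on Linux. It is an illustration only: struct id_pair, report_ids, and spawn_and_report are made-up names, the gettid system call is Linux-specific, and printing a pthread_t as an unsigned long assumes it is an integer type, which holds on Linux but is not guaranteed elsewhere. The gdb thread id exists only inside gdb, so a program cannot print it.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: the two ids a thread can find out about itself. */
struct id_pair {
    pthread_t pthread_id;   /* the pthread library's id */
    pid_t lwp_id;           /* the OS's id (the LWP number gdb shows) */
};

static void *report_ids(void *p) {
    struct id_pair *ids = p;
    ids->pthread_id = pthread_self();
    ids->lwp_id = (pid_t)syscall(SYS_gettid);   /* Linux-specific */
    printf("pthread_id %lu  LWP %d\n",
           (unsigned long)ids->pthread_id, (int)ids->lwp_id);
    fflush(stdout);
    return NULL;
}

/* Creates one thread, lets it record its ids in *out, and joins it.
 * Returns 0 if the id from pthread_create matches pthread_self(), else -1. */
int spawn_and_report(struct id_pair *out) {
    pthread_t t;
    if (pthread_create(&t, NULL, report_ids, out) != 0) return -1;
    pthread_join(t, NULL);
    return pthread_equal(t, out->pthread_id) ? 0 : -1;
}
```

On Linux the main thread's LWP id is the same as the process id, so the LWP id recorded by the new thread will differ from what getpid() returns.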

A few gdb thread-specific commands:

  set print thread-events   # prints out thread start and exit events
  info threads              # list all existing threads in program 
                            # the gdb threadno is the first value listed
                            # the thread that hit the break point is *'ed 
  thread threadno           # switch to thread threadno's context
                            # (see its stack when type where, for example)
  break [where] thread [threadno] # set a breakpoint at [where] just for 
                                  # thread threadno
                            
  thread apply [threadno|all] command  # apply the gdb command to all or a subset of threads

Basically, in gdb you use the following prefix to apply a particular gdb command to all threads or to just a subset of threads (ex. 2-5), identified by their gdb thread ids:

    thread apply [thread_id | all]  command

This doesn't seem to work for setting breakpoints on a single thread, so use the other way:

    break line_no thread thread_no

The default behavior of gdb when a thread hits a breakpoint is that all threads are suspended wherever they happen to be until the user types cont. You can change this default behavior to have threads that are not at a breakpoint continue executing while you debug the ones that hit their breakpoints (but it is hard to think of scenarios where doing this would make debugging easier, so I'd say probably stick with the default).


A simple example run

Let's try running racecond in gdb. We will set a breakpoint for all threads in worker_loop and then set a breakpoint at line 76 just for thread 3. Remember that gdb's thread number 3 may not correspond to a logical thread number in your program (i.e. myid may not be 3 for gdb thread 3):
$ gdb ./racecond
(gdb) delete all
(gdb) break worker_loop
(gdb) run 5
(gdb) info threads
(gdb) break 76 thread 3     # sets the breakpoint just for thread 3
(gdb) display myid
(gdb) cont ...

Here is some more output from using gdb on the racecond program that shows how to use some of the thread commands and what their output might look like:
% gdb ./racecond
  ...
(gdb) set print thread-events on
(gdb) run 5

Starting program: /home/newhall/public/cs31/week12/racecond 5
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

[New Thread 0x7ffff77fd700 (LWP 17471)]
hello I'm thread 0 with pthread_id 140737345738496
# (LWP 17471) means Light Weight Process with id number 17471.
# An LWP is a thread the OS knows about: 17471 is the OS's id number for
# the thread, and 140737345738496 is the pthread library's id number for it.

[New Thread 0x7ffff6ffc700 (LWP 17472)]
hello I'm thread 1 with pthread_id 140737337345792
[New Thread 0x7ffff67fa700 (LWP 17473)]
hello I'm thread 2 with pthread_id 140737328948992
[New Thread 0x7ffff5ff9700 (LWP 17474)]
hello I'm thread 3 with pthread_id 140737320556288
[New Thread 0x7ffff57f8700 (LWP 17475)]
hello I'm thread 4 with pthread_id 140737312163584
[Thread 0x7ffff6ffc700 (LWP 17472) exited]
[Thread 0x7ffff77fd700 (LWP 17471) exited]
[Thread 0x7ffff67fa700 (LWP 17473) exited]
[Thread 0x7ffff57f8700 (LWP 17475) exited]
count = 141335712
[Thread 0x7ffff5ff9700 (LWP 17474) exited]
[Inferior 1 (process 17451) exited normally]


(gdb) break worker_loop
(gdb) run 3

(gdb) break 76     # sets the breakpoint for every thread 

Breakpoint 2, worker_loop (arg=0x602030) at racecond.c:76
76	      count += i; 

(gdb) info threads     # the starred one is the current thread
  Id   Target Id         Frame 
  4    Thread 0x7ffff67fa700 (LWP 17587) "racecond" worker_loop (arg=0x602038)
    at racecond.c:68
  3    Thread 0x7ffff6ffc700 (LWP 17549) "racecond" __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
* 2    Thread 0x7ffff77fd700 (LWP 17548) "racecond" worker_loop (arg=0x602030)
    at racecond.c:76
  1    Thread 0x7ffff7fcd700 (LWP 17539) "racecond" 0x00007ffff7bc6148 in 
    pthread_join (threadid=140737345738496, thread_return=0x0) at pthread_join.c:89

# thread 2 is the current thread, where will show thread 2's stack trace:
(gdb) where
#0  worker_loop (arg=0x602030) at racecond.c:76
#1  0x00007ffff7bc4e9a in start_thread (arg=0x7ffff77fd700)
    at pthread_create.c:308
#2  0x00007ffff78f1dbd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#3  0x0000000000000000 in ?? ()

# switch to thread three's context
(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff6ffc700 (LWP 17549))]
#0  __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
132	../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.

# get thread 3's stack trace
(gdb) where
#0  __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
#1  0x00007ffff7bc7065 in _L_lock_858 ()
   from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007ffff7bc6eba in __pthread_mutex_lock (mutex=0x6010c0)
    at pthread_mutex_lock.c:61
#3  0x0000000000400aa2 in worker_loop (arg=0x602034) at racecond.c:75
#4  0x00007ffff7bc4e9a in start_thread (arg=0x7ffff6ffc700)
    at pthread_create.c:308
#5  0x00007ffff78f1dbd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

# move into stack frame 3 of thread 3
(gdb) frame 3
#3  0x0000000000400aa2 in worker_loop (arg=0x602034) at racecond.c:75
75	      pthread_mutex_lock(&my_mutex);

(gdb) print my_mutex