Your browser doesn't support the features required by impress.js, so you are presented with a simplified version of this presentation.

For the best experience please use the latest Chrome, Safari or Firefox browser.

uprobes, kprobes SystemTap tracepoint

tracing user space programs

swiss army knife

debugging

when gdb is not enough?

Example program

#include <stdio.h>

int count = 0;

void f() {
    count++;
}

int main() {
    for (int i=0; i<10000; i++) f();
    return count;
}

Counting events with gdb ...

[maciek@pc loop]$ gdb loop 
GNU gdb (GDB) Fedora 7.9.1-19.fc22
Reading symbols from loop...done.
(gdb) set $i=0
(gdb) b f
Breakpoint 1 at 0x40058a: file loop.c, line 10.
(gdb) comm
>silent
>set $i=$i+1
>c
>end
(gdb) b loop.c:11
Breakpoint 2 at 0x4005e9: file loop.c, line 11.
(gdb) comm
>silent
>print $i
>c
>end
(gdb) r
$1 = 10000
[Inferior 1 (process 4576) exited with code 020]

It works but ...

80us no debugger 3s gdb 560ms gdb tracepoint 12ms stap 5ms stap sdt

Why so slow?

Attach to process

Place a breakpoint

Breakpoint hit

Get registers

Remove breakpoint

Execute commands

Restore breakpoint

Overall cost

Any alternatives?

Adding tracing code

systemtap

tracing

Counting events with stap ...

global i

probe process("loop").function("f") {i++}

probe process("loop").function("main").return {
  printf("%d vs %d\n", i, $return);
  delete i;
}
[maciek@pc loop]$ sudo stap loop.stp -c ./loop
10000 vs 10000

How does stap work?

stp script parsing

translate and compile

probe placement

connect STDOUT

probe point hit

Systemtap benefits

Examples

Measure execution time

response process_path_request(const request& req) {
    json j = json::parse(req.data);

    auto arr = j["path"];
    if (!arr.is_array()) {
       return {400, "application/javascript", "{error:'invalid data'}"};
    }

    std::vector<point> points;
    for (auto& el : arr) {
        if (!el.is_array() || el.size() != 2) {
          return {400, "application/javascript", "{error:'invalid data'}"};
        }
        double lat = el[0], lon = el[1];
        points.emplace_back(spat2xy(lat, lon, ZOOM));
    }

    std::ostringstream bytes;
    cairo_render_path(points, bytes);
    return {200, "image/png", bytes.str()};
}
matches specific line in code
time since last timestamp
global last_timestamp;

probe process("gis_server").function("process_path_request") {
    last_timestamp = gettimeofday_us();
}

probe process("gis_server").statement("*@gis_server.cpp:72") {
    d = gettimeofday_us() - last_timestamp;
    last_timestamp = gettimeofday_us();
    printf("1: %d\n", d);
}

probe process("gis_server").statement("*@gis_server.cpp:86") {
    d = gettimeofday_us() - last_timestamp;
    last_timestamp = gettimeofday_us();
    printf("2: %d\n", d);
}

probe process("gis_server").function("process_path_request").return {
    d = gettimeofday_us() - last_timestamp;
    last_timestamp = gettimeofday_us();
    printf("3: %d\n", d);
}
[maciek@pc exec_time_server]$ sudo stap time.stp -c ./gis_server  100
1: 390
2: 550
3: 31750

Result data (more samples)

Finding call time longer then usual ...

every function if source file
at least X samples
Y% above normal
global times
global mincount = 100      # training data for 100 samples
global note_percent = 250  # report above 250% change

function check(t) {
   if (@count(times[ppfunc()]) >= mincount &&
       t >= @max(times[ppfunc()]) * note_percent / 100)
   {
     printf("function %s well over max time (%d vs %d)\n",
            ppfunc(), t, @max(times[ppfunc()]))
   }

   times[ppfunc()] <<< t
}

probe process("gis_server").function("*@gis_server.cpp").return {
   check(gettimeofday_us() - @entry(gettimeofday_us()))
}
[maciek@pc server]$ sudo stap fntimes.stp &
[maciek@pc server]$ for i in {1..110}; do ./gis_server $i; done
[maciek@pc server]$ ./gis_server 10000
function cairo_render_path well over max time (1162531 vs 49436)
function process_path_request well over max time (1277363 vs 51414)
function main well over max time (1359476 vs 52736)
[maciek@pc server]$

Tracing function calls

Call graph

matches all functions in process
change indent on entry to and on return from function
probe process("ls").function("*").call {
    printf("%s -> %s\n", thread_indent(1), probefunc())
}

probe process("ls").function("*").return {
    printf("%s <- %s\n", thread_indent(-1), probefunc())
}
[maciek@pc call_graph]$ sudo stap ls_call_graph.stp -c ls
     0 ls(2430): -> _start
    25 ls(2430):  -> __libc_csu_init
    31 ls(2430):   -> _init
    36 ls(2430):   <- __libc_csu_init
    40 ls(2430):   -> frame_dummy
    45 ls(2430):    -> register_tm_clones
    49 ls(2430):    <- __libc_csu_init
    52 ls(2430):   <- __libc_csu_init
    55 ls(2430):  <- 0x3073e2068f
    59 ls(2430):  -> main
    63 ls(2430):   -> set_program_name
    71 ls(2430):   <- main
    85 ls(2430):   -> atexit
    90 ls(2430):   <- main
   106 ls(2430):   -> human_options
   ...

Debugging state machine

int main() {
    ManProcessor c;

    std::string line;
    while (std::getline(std::cin, line)) {
        c.process(line.c_str());
    }
}
class ManProcessor {
    std::unique_ptr<ManProcessorState> state;

public:
    ManProcessor() : state(new IdleState()) {
    }

    void process(const char* line) {
        state->process(*this, line);
    }

    void setState(ManProcessorState* state) {
        this->state.reset(state);
    }
};
// ...

void ChapterState::process(ManProcessor& context, const char* line) {
    if (std::regex_match(line, std::regex("^[ \t]+[-].*")))
        context.setState(new OptionState());
    else if (std::regex_match(line, std::regex("^.+$")))
        context.setState(new ParagraphState());
}

// ...

Trace state changes

Function name from previous stack frame
probe process("man_state").function("ManProcessor::setState") {
    printf("| %s\n", usymname(ustack(1)));
}

Running the example ...

[maciek@pc state_machine]$ sudo -E stap state.stp  | c++filt &
[3] 31614
[maciek@pc state_machine]$ man ls | head -n 20 | ./man_state
| IdleState::process(ManProcessor&, char const*)
| ChapterState::process(ManProcessor&, char const*)
| ParagraphState::process(ManProcessor&, char const*)
| ChapterState::process(ManProcessor&, char const*)
| ParagraphState::process(ManProcessor&, char const*)
| ChapterState::process(ManProcessor&, char const*)
| ParagraphState::process(ManProcessor&, char const*)
| ChapterState::process(ManProcessor&, char const*)
| ParagraphState::process(ManProcessor&, char const*)
| ChapterState::process(ManProcessor&, char const*)
| OptionState::process(ManProcessor&, char const*)
| ChapterState::process(ManProcessor&, char const*)

Trace input data

captures functions matching pattern
user space function argument
probe process("man_state").function("*State::process") {
    printf(". %s\n", user_string($line));
}

Running the example ...

[maciek@pc state_machine]$ sudo -E stap state.stp  | c++filt &
[3] 31614
[maciek@pc state_machine]$ man ls | head -n 20 | ./man_state
. LS(1)
. 
. NAME
| IdleState::process(ManProcessor&, char const*)
.        ls - list directory contents
| ChapterState::process(ManProcessor&, char const*)
. 
| ParagraphState::process(ManProcessor&, char const*)
. SYNOPSIS
| ChapterState::process(ManProcessor&, char const*)
.        ls [OPTION]... [FILE]...

Who is stealing my core?

global cores, core0, core1, core2, core3, names, tasks

probe timer.profile {
  tid = tid();
  tasks[tid] = execname();
  names[tid] = pid2execname(pid());

  if (cpu() == 0) core0[tid]++;
  if (cpu() == 1) core1[tid]++;
  if (cpu() == 2) core2[tid]++;
  if (cpu() == 3) core3[tid]++;
  cores[tid]++;
}
probe end {
  printf("\n %5s %20s %20s %8s %8s %8s %8s\n", "TID", "process",
         "task", "core0", "core1", "core2", "core3")

  foreach(t in names) {
    sum = cores[t]
    printf(" %5d %20s %20s, %8s %8s %8s %8s\n", t, names[t], tasks[t],
      percent(core0[t], sum), percent(core1[t], sum),
      percent(core2[t], sum), percent(core3[t], sum)
    );
  }

  printf("\n");
}
[maciek@pc cpu_core_usage]$ sudo stap proc_cpu4_thread.stp 
^C
   TID       process             task   core0    core1    core2    core3
     0                      swapper/1,  41.34%   12.61%     5.5%   40.98%
 31610        stapio           stapio,  100.0%     0.0%     0.0%     0.0%
    73       kswapd0          kswapd0,    0.0%     0.0%   100.0%     0.0%
 26953        stapio           stapio,    0.0%     0.0%   100.0%     0.0%
  2973          Xorg             Xorg,    0.0%   83.33%   16.66%     0.0%
 18766   gnome-shell      gnome-shell,    0.0%     0.0%    6.55%   93.44%
 18871   gnome-shell            gdbus,    0.0%   100.0%     0.0%     0.0%
 17630       firefox       DOM Worker,  28.57%     0.0%   71.42%     0.0%
 17605       firefox        JS Helper,  100.0%     0.0%     0.0%     0.0%
  4305      ruby-mri     utils.rb:162,  100.0%     0.0%     0.0%     0.0%
    15   ksoftirqd/1      ksoftirqd/1,    0.0%   100.0%     0.0%     0.0%
 17640       firefox       DOM Worker,    0.0%     0.0%     0.0%   100.0%
     7     rcu_sched        rcu_sched,    0.0%    50.0%     0.0%    50.0%
 24063 kworker/u16:0    kworker/u16:0,    0.0%   33.33%   66.66%     0.0%
     9       rcuos/0          rcuos/0,    0.0%     0.0%   100.0%     0.0%
     1       systemd          systemd,    0.0%     0.0%   100.0%     0.0%
   751 systemd-login  systemd-logind,     0.0%     0.0%     0.0%   100.0%
 23764   kworker/2:3      kworker/2:3,    0.0%     0.0%   100.0%     0.0%
   757    irqbalance       irqbalance,    0.0%     0.0%   100.0%     0.0%

Debug locks contention

probe process("/usr/lib64/libpthread.so.0").function("pthread_mutex_init").return {
    uaddr = &$mutex->__data->__lock
    lock_names[uaddr] = usymfileline(ustack(1));
}

probe syscall.futex.return {
  if (target() != pid()) next;
  if (($op & ~(FUTEX_PRIVATE_FLAG|FUTEX_CLOCK_REALTIME)) != FUTEX_WAIT) next
  elapsed = gettimeofday_us() - @entry(gettimeofday_us())
  lock_waits[$uaddr]  <<< elapsed
}

probe end {
  foreach (lock in lock_names) {
    printf ("%s[%x] lock contended %d times, %d avg us\n",
            lock_names[lock], lock, @count(lock_waits[lock]),
            @avg(lock_waits[lock]))

    print(@hist_log(lock_waits[lock]));
  }
}
[maciek@pc futex2]$ sudo stap -v futex.stp -d ./test1 -c ./test1
test1.cpp:8[6124c0] lock contended 95 times, 37467 avg us
 value |-------------------------------------------------- count
   256 |                                                    0
   512 |@                                                   1
  1024 |                                                    0
  2048 |@@                                                  2
  4096 |@@@@@@@                                             7
  8192 |@@@@@@@@@@@@@@                                     14
 16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                   32
 32768 |@@@@@@@@@@@@@@@                                    15
 65536 |@@@@@@@@@@@@@@@@@@@@@@@@                           24
131072 |                                                    0

net_socket.cpp:4[7ffff0014aa0] lock contended 4 times, 1512 avg us
value |-------------------------------------------------- count
  256 |                                                   0
  512 |@@                                                 2
 1024 |@                                                  1
 2048 |@                                                  1
 4096 |                                                   0

voice_manager.cpp:4[7ffff0014ad0] lock contended 93 times, 56146 avg us
...

Probes handlers in C

Guru mode

function test:long(addr:long) %{
  char buffer[100];
  long a = strncpy_from_user(buffer, (char*)STAP_ARG_addr, 100);
  char *substr = strnstr(buffer, "stap sucks", 100);
  if (substr != NULL) {
    long b = copy_to_user(
       ((char*)STAP_ARG_addr)+(substr+5-&buffer[0]),"rocks", 5);
  }
%}

probe syscall.write {
  if (isinstr(user_string($buf), "stap sucks")) {
    test($buf);
  }
}
[maciek@pc sys_write]$ echo "I think that stap sucks"
I think that stap rocks
[maciek@pc sys_write]$ history | tail -2
 1123  echo "I think that stap rocks"
 1124  history | tail -2
[maciek@pc sys_write]$ vim tmp
stap sucks
~ 
~
~
~
~
:wq
[maciek@pc sys_write]$ cat tmp
stap rocks

Thank you

Use a spacebar or arrow keys to navigate