Message-ID: <ZLk88+EYaWeXA3Gm@kernel.org>
Date: Thu, 20 Jul 2023 10:56:03 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Namhyung Kim <namhyung@...nel.org>, Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Clark Williams <williams@...hat.com>,
Kate Carcia <kcarcia@...hat.com>, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org,
Masami Hiramatsu <mhiramat@...nel.org>
Subject: Re: [PATCHES/RFC 1/5] perf bench uprobe + BPF skel
On Wed, Jul 19, 2023 at 03:41:54PM -0700, Ian Rogers wrote:
> On Wed, Jul 19, 2023 at 1:49 PM Arnaldo Carvalho de Melo
> <acme@...nel.org> wrote:
> >
> > Hi,
> >
> > This adds a 'perf bench' to test the overhead of uprobes + BPF
> > programs. For now it has just a few simple tests, but I plan to make
> > it possible to specify the functions to attach the uprobe + BPF
> > program to, other BPF operations dealing with maps, etc.
> >
> > This is how it looks now:
> >
> > [root@...e ~]# perf bench uprobe all
> > # Running uprobe/baseline benchmark...
> > # Executed 1,000 usleep(1000) calls
> > Total time: 1,053,963 usecs
> >
> > 1,053.963 usecs/op
> >
> > # Running uprobe/empty benchmark...
> > # Executed 1,000 usleep(1000) calls
> > Total time: 1,056,293 usecs +2,330 to baseline
> >
> > 1,056.293 usecs/op 2.330 usecs/op to baseline
> >
> > # Running uprobe/trace_printk benchmark...
> > # Executed 1,000 usleep(1000) calls
> > Total time: 1,056,977 usecs +3,014 to baseline +684 to previous
> >
> > 1,056.977 usecs/op 3.014 usecs/op to baseline 0.684 usecs/op to previous
> >
> > [root@...e ~]
> >
> > I put it here:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf-bench-uprobe
> >
> > git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf-bench-uprobe
> >
> > Further ideas, problems?
>
> No problems. Perhaps it would be interesting to measure the uprobe
> overhead compared to say the overhead attaching to the nanosleep
> syscall?

Can you rephrase your question?

The test is comparing the overhead of attaching to the clock_nanosleep
syscall:

[root@...e ~]# strace -c ~/bin/perf bench uprobe baseline
# Running 'uprobe/baseline' benchmark:
# Executed 1,000 usleep(1000) calls
Total time: 1,077,139 usecs
1,077.139 usecs/op
==7056==LeakSanitizer has encountered a fatal error.
==7056==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
==7056==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ------------------
52.87 0.002973 2 1000 clock_nanosleep
22.55 0.001268 3 370 mmap
8.87 0.000499 4 106 read
5.42 0.000305 4 62 munmap
2.42 0.000136 3 38 openat
1.69 0.000095 1 48 mprotect
1.28 0.000072 1 57 close
1.19 0.000067 3 18 open
0.98 0.000055 1 40 1 newfstatat
0.44 0.000025 0 30 pread64
0.44 0.000025 6 4 getdents64
0.32 0.000018 18 1 readlink
0.28 0.000016 2 8 write
0.23 0.000013 1 9 4 prctl
0.21 0.000012 6 2 2 access
0.12 0.000007 0 8 madvise
0.11 0.000006 1 4 clock_gettime
0.11 0.000006 1 4 prlimit64
0.07 0.000004 1 3 rt_sigaction
0.07 0.000004 1 4 sigaltstack
0.07 0.000004 4 1 sched_getaffinity
0.05 0.000003 0 6 getpid
0.04 0.000002 0 3 rt_sigprocmask
0.04 0.000002 1 2 1 arch_prctl
0.04 0.000002 1 2 futex
0.04 0.000002 2 1 set_robust_list
0.02 0.000001 1 1 set_tid_address
0.02 0.000001 1 1 rseq
0.00 0.000000 0 1 brk
0.00 0.000000 0 14 sched_yield
0.00 0.000000 0 1 clone
0.00 0.000000 0 1 execve
0.00 0.000000 0 1 wait4
0.00 0.000000 0 1 gettid
------ ----------- ----------- --------- --------- ------------------
100.00 0.005623 3 1852 8 total
[root@...e ~]#