[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220330212402.GA2719@kbox>
Date: Wed, 30 Mar 2022 14:24:02 -0700
From: Beau Belgrave <beaub@...ux.microsoft.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Song Liu <song@...nel.org>, rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
linux-trace-devel <linux-trace-devel@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
bpf <bpf@...r.kernel.org>, netdev <netdev@...r.kernel.org>,
linux-arch <linux-arch@...r.kernel.org>
Subject: Re: [PATCH] tracing/user_events: Add eBPF interface for user_event
created events
On Wed, Mar 30, 2022 at 03:57:26PM -0400, Mathieu Desnoyers wrote:
> ----- On Mar 30, 2022, at 3:15 PM, Beau Belgrave beaub@...ux.microsoft.com wrote:
>
> > On Wed, Mar 30, 2022 at 11:22:32AM -0700, Alexei Starovoitov wrote:
> >> On Wed, Mar 30, 2022 at 9:34 AM Beau Belgrave <beaub@...ux.microsoft.com> wrote:
> >> > > >
> >> > > > But you are fine with uprobe costs? uprobes appear to be much more costly
> >> > > > than a syscall approach on the hardware I've run on.
> >>
> >> Care to share the numbers?
> >> uprobe over USDT is a single trap.
> >> Not much slower compared to syscall with kpti.
> >>
> >
> > Sure, these are the numbers we have from a production device.
> >
> > They are captured via perf via PERF_COUNT_HW_CPU_CYCLES.
> > It's running a 20K loop emitting 4 bytes of data out.
> > Each 4 byte event time is recorded via perf.
> > At the end we have the total time and the max seen.
> >
> > null numbers represent a 20K loop with just perf start/stop ioctl costs.
> >
> > null: min=2863, avg=2953, max=30815
> > uprobe: min=10994, avg=11376, max=146682
> > uevent: min=7043, avg=7320, max=95396
> > lttng: min=6270, avg=6508, max=41951
> >
> > These costs include the data getting into a buffer, so they represent
> > what we would see in production vs the trap cost alone. For uprobe this
> > means we created a uprobe and attached it via tracefs to get the above
> > numbers.
>
> [...]
>
> I assume here that by "lttng" you specifically refer to lttng-ust (LTTng's
> user-space tracer), am I correct ?
>
Yes, this is ust.
> By removing the "null" baseline overhead, my rough calculations are that the
> average overhead for lttng-ust in your results is (in cpu cycles):
>
> 6270-2863 = 3555
>
> So I'm unsure what is the frequency of your CPU, but guessing around 3.5GHz
> this is in the area of 1 microsecond. On an Intel CPU, this is much larger
> than what I would expect.
>
> Can you share your test program, hardware characteristics, kernel version,
> glibc version, and whether the program is compiled as a 32-bit or 64-bit
> binary ?
>
This is how we are measuring:
#define PERF_START() \
do \
{ \
ioctl(perf_fd, PERF_EVENT_IOC_RESET, 0); \
ioctl(perf_fd, PERF_EVENT_IOC_ENABLE, 0); \
} while (0)
#define PERF_END(__cycles) \
do \
{ \
ioctl(perf_fd, PERF_EVENT_IOC_DISABLE, 0); \
read(perf_fd, &__cycles, sizeof(__cycles)); \
} while (0)
struct perf_event_attr pe;
long long min, max, total;
int i, perf_fd;
memset(&pe, 0, sizeof(pe));
pe.type = PERF_TYPE_HARDWARE;
pe.size = sizeof(pe);
pe.config = PERF_COUNT_HW_CPU_CYCLES;
pe.disabled = 1;
pe.exclude_hv = 1;
perf_fd = perf_event_open(&pe, 0, -1, -1, 0);
min = max = total = 0;
for (i = 0; i < 20000; ++i)
{
long long cycles;
PERF_START();
probe();
PERF_END(cycles);
if (i == 0 || cycles < min)
min = cycles;
if (cycles > max)
max = cycles;
total += cycles;
}
probe() here could be a call to writev or to the lttng trace call.
> Can you confirm that lttng-ust is not calling one getcpu system call per
> event ? This might be the case if run a 32-bit x86 binary and have a
> glibc < 2.35, or a kernel too old to provide CONFIG_RSEQ or don't have
> CONFIG_RSEQ=y in your kernel configuration. You can validate this by
> running your lttng-ust test program with a system call tracer.
>
We don't have CONFIG_RSEQ, so that is likely the cause. LTTng is always
going to be the fastest thing out there. It's pure user mode :)
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
Thanks,
-Beau
Powered by blists - more mailing lists