Message-ID: <CAEf4BzacLW14OmuW1nV4s=cMcSqcrSAO2Y_0XY3jT2efGNfmuw@mail.gmail.com>
Date: Wed, 23 Oct 2024 10:38:54 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Peter Zijlstra <peterz@...radead.org>, Will Deacon <will@...nel.org>,
Catalin Marinas <catalin.marinas@....com>, Mark Rutland <mark.rutland@....com>
Cc: Linux trace kernel <linux-trace-kernel@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
Jiri Olsa <jolsa@...nel.org>, Oleg Nesterov <oleg@...hat.com>,
Masami Hiramatsu <mhiramat@...nel.org>, Liao Chang <liaochang1@...wei.com>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
open list <linux-kernel@...r.kernel.org>,
"linux-perf-use." <linux-perf-users@...r.kernel.org>, Kernel Team <kernel-team@...a.com>
Subject: Re: The state of uprobes work and logistics

Ok, 7 days have passed, let's see how we are doing here...
On Wed, Oct 16, 2024 at 12:35 PM Andrii Nakryiko
<andrii.nakryiko@...il.com> wrote:
>
> Hello,
>
> I wanted to provide a bit of context about, and tie together, a few
> separate work streams (across a few separate kernel trees) all
> revolving around uprobe improvements, as there are a bunch of them and
> I'm sure it's hard to keep track of all of them. And hopefully I can
> also get Peter's and the ARM maintainers' input on some specific
> questions I asked below. Thank you in advance!
>
> In short, in the last few months there has been a lot of activity around
> fixing and improving uprobes. All this is the result of increased and
> more varied use of uprobes/uretprobe in production settings. Uprobe
> performance is **very** important, and yes, we do have real use cases
> that go to millions per second uprobe/uretprobe triggering throughput,
> unfortunately. So any small bit of performance and scalability
> improvement is helpful. No, this isn't just some nerdy perf
> optimization work (I've been asked this a few times, so I thought I'd
> emphasize this again).
>
> So, we've already landed a bunch of work, mainly (not an exhaustive list):
>
> - various clean ups, API improvements, and bug fixes from Oleg
> Nesterov ([0], [1]). This simplified internal APIs and was a
> prerequisite of the rest of the work;
> - changes to refcounting and RCU-ifying of uprobe lifetime from me
> ([2]). This improved single-threaded performance somewhat, but mainly
> significantly improved scalability in the presence of multiple CPUs
> triggering lots of uprobes;
> - ARM64-specific optimization of uprobe emulation of NOP instruction
> by Liao Chang ([3]). This change alone gives a 2x (!) speed up for
> USDT tracing use cases *on ARM64* (we already have this optimization
> on x86-64);
> - slightly earlier work by Jiri Olsa ([4]) added the uretprobe()
> syscall, giving +30% speed ups.
>
> And there are a few more outstanding changes:
>
> - Jiri Olsa's uprobe "session" support ([5]). This is less
> performance focused, but important functionality by itself. But I'm
> calling this out here because the first two patches are pure uprobe
> internal changes, and I believe they should go into tip/perf/core to
> avoid conflicts with the rest of pending uprobe changes.
>
> Peter, do you mind applying those two and creating a stable tag for
> bpf-next to pull? We'll apply the rest of Jiri's series to
> bpf-next/master.
Jiri has reposted the patches, this time CC'ing Peter, heh :). It would
be great to apply those two patches and get a stable tag. This is
blocking the landing of uprobe sessions in bpf-next, and my remaining
patches will most probably be based on top of Jiri's uprobe changes as
well. Peter, please take another look, thank you.
>
> - Liao Chang's ARM64-specific STP instruction emulation support
> ([6]). This one will give a 2x (!) improvement for the common case of
> an STP instruction being the first instruction in a traced user
> function (similar to NOP for USDTs).
>
> ARM64 maintainers (cc'ed Catalin, Will, and Mark), can you guys please
> take another look? This one was a bit more controversial, but
> hopefully there is a way to massage it to be acceptable and not
> introduce unnecessary slowdowns (there were some concerns about memory
> ordering/visibility, which hopefully don't apply to uprobe cases).
> It's an important improvement, I'd really appreciate it if we can make
> progress here, thank you!
>
Ping. ARM64 folks, can you please take a look and reply? Thank you.
> - my speculative VMA-to-uprobe lookup series ([7]). This makes entry
> uprobe scalability scale linearly with the number of CPUs (the
> ultimate goal of uprobe scalability work).
>
> I think it's ready to go in. It has an **implicit** dependency on
> Christian Brauner's recent change for FMODE_BACKING, for which he
> provided a stable tag. Peter, do you have any remaining concerns, or
> can this also be merged soon?
No changes, still ready to go in. Might need a rebase if Jiri's
patches are applied.
>
> - another patch set of mine, switching uretprobe fast path to SRCU
> (with timeout) ([8]). This makes return uprobes (uretprobes) linearly
> scalable in the common case (again, the ultimate scalability goal).
>
> I haven't gotten much feedback here, would love to get some objective
> review here. This is an important counterpart to the speculative
> VMA-to-uprobe lookup series. Both are needed in practice.
>
This is the only thing that has progressed, thank you. I'll apply the
suggested state changes, but I intend to postpone the delayed_uprobe_lock
rework to a separate follow-up patch set. Just a heads up.
> - patch set dropping unnecessary siglock usage in uprobe by Liao
> Chang ([9]). This one removes yet another lock, for a less common case
> (at least on x86-64) of single-stepped uprobe (where the probed
> instruction can't be emulated).
>
> This one needs a rebase, but it was already acked by Oleg. Liao,
> please prioritize the rebase and send v4 ASAP, so this is not lost.
>
This was rebased and acked by Masami. Seems to be ready to be applied.
>
> As you can see, lots of stuff needs to be landed and most of it is in
> good shape already. I'd love to hear thoughts of relevant people
> called out above, thank you!
>
>
> [0] https://lore.kernel.org/linux-trace-kernel/20240729134444.GA12293@redhat.com/
> [1] https://lore.kernel.org/linux-trace-kernel/20240929144201.GA9429@redhat.com/
> [2] https://lore.kernel.org/linux-trace-kernel/20240903174603.3554182-1-andrii@kernel.org/
> [3] https://lore.kernel.org/linux-trace-kernel/20240909071114.1150053-1-liaochang1@huawei.com/
> [4] https://lore.kernel.org/linux-trace-kernel/20240523121149.575616-1-jolsa@kernel.org/
> [5] https://lore.kernel.org/bpf/20241015091050.3731669-1-jolsa@kernel.org/
> [6] https://lore.kernel.org/linux-trace-kernel/20240910060407.1427716-1-liaochang1@huawei.com/
> [7] https://lore.kernel.org/linux-trace-kernel/20241010205644.3831427-1-andrii@kernel.org/
> [8] https://lore.kernel.org/linux-trace-kernel/20241008002556.2332835-1-andrii@kernel.org/
> [9] https://lore.kernel.org/linux-trace-kernel/20240815014629.2685155-1-liaochang1@huawei.com/
>
> -- Andrii