Message-ID: <CAEf4BzacLW14OmuW1nV4s=cMcSqcrSAO2Y_0XY3jT2efGNfmuw@mail.gmail.com>
Date: Wed, 23 Oct 2024 10:38:54 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Peter Zijlstra <peterz@...radead.org>, Will Deacon <will@...nel.org>, 
	Catalin Marinas <catalin.marinas@....com>, Mark Rutland <mark.rutland@....com>
Cc: Linux trace kernel <linux-trace-kernel@...r.kernel.org>, bpf <bpf@...r.kernel.org>, 
	Jiri Olsa <jolsa@...nel.org>, Oleg Nesterov <oleg@...hat.com>, 
	Masami Hiramatsu <mhiramat@...nel.org>, Liao Chang <liaochang1@...wei.com>, 
	linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>, 
	open list <linux-kernel@...r.kernel.org>, 
	"linux-perf-use." <linux-perf-users@...r.kernel.org>, Kernel Team <kernel-team@...a.com>
Subject: Re: The state of uprobes work and logistics

Ok, 7 days have passed, let's see how we are doing here...

On Wed, Oct 16, 2024 at 12:35 PM Andrii Nakryiko
<andrii.nakryiko@...il.com> wrote:
>
> Hello,
>
> I wanted to provide a bit of context about, and tie together, a few
> separate work streams (across a few separate kernel trees) all
> revolving around uprobe improvements, as there are a bunch of them and
> I'm sure it's hard to keep track of all of them. And hopefully I can
> also get Peter's and the ARM maintainers' input on some specific
> questions I asked below. Thank you in advance!
>
> In short, in the last few months there has been a lot of activity around
> fixing and improving uprobes. All this is the result of increased and
> more varied use of uprobes/uretprobes in production settings. Uprobe
> performance is **very** important, and yes, we do have real use cases
> that reach millions of uprobe/uretprobe triggerings per second,
> unfortunately. So any small bit of performance and scalability
> improvement is helpful. No, this isn't just some nerdy perf
> optimization work (I've been asked this a few times, so I thought I'd
> emphasize this again).
>
> So, we've already landed a bunch of work, mainly (not an exhaustive list):
>
>   - various clean ups, API improvements, and bug fixes from Oleg
> Nesterov ([0], [1]). This simplified internal APIs and was a
> prerequisite of the rest of the work;
>   - changes to refcounting and RCU-ifying of uprobe lifetime from me
> ([2]). This improved single-threaded performance somewhat, but mainly
> significantly improved scalability in the presence of multiple CPUs
> triggering lots of uprobes;
>   - ARM64-specific optimization of uprobe emulation of the NOP
> instruction by Liao Chang ([3]). This change alone gives a 2x (!) speed
> up for USDT tracing use cases *on ARM64* (we already have this
> optimization on x86-64);
>   - slightly earlier work by Jiri Olsa ([4]) to add the uretprobe()
> syscall, giving +30% speed ups.
>
> And there are a few more outstanding changes:
>
>   - Jiri Olsa's uprobe "session" support ([5]). This is less
> performance focused, but important functionality by itself. But I'm
> calling this out here because the first two patches are pure uprobe
> internal changes, and I believe they should go into tip/perf/core to
> avoid conflicts with the rest of pending uprobe changes.
>
> Peter, do you mind applying those two and creating a stable tag for
> bpf-next to pull? We'll apply the rest of Jiri's series to
> bpf-next/master.

Jiri has reposted the patches, this time CC'ing Peter, heh :). It would be
great to apply those two patches and get a stable tag. This is
blocking the landing of uprobe sessions in bpf-next and also my
remaining patches will be based on top of Jiri's uprobe changes, most
probably. Peter, please take another look, thank you.

>
>   - Liao Chang's ARM64-specific STP instruction emulation support
> ([6]). This one will give a 2x (!) improvement for the common case of
> an STP instruction being the first instruction in a traced user
> function (similar to NOP for USDTs).
>
> ARM64 maintainers (cc'ed Catalin, Will, and Mark), can you guys please
> take another look? This one was a bit more controversial, but
> hopefully there is a way to massage it to be acceptable and not
> introduce unnecessary slowdowns (there were some concerns about memory
> ordering/visibility, which hopefully don't apply to uprobe cases).
> It's an important improvement, I'd really appreciate it if we can make
> progress here, thank you!
>

Ping. ARM64 folks, can you please take a look and reply? Thank you.

>   - my speculative VMA-to-uprobe lookup series ([7]). This makes entry
> uprobes scale linearly with the number of CPUs (the ultimate goal of
> the uprobe scalability work).
>
> I think it's ready to go in. It has an **implicit** dependency on
> Christian Brauner's recent change for FMODE_BACKING, for which he
> provided a stable tag. Peter, do you have any remaining concerns, or
> can this also be merged soon?

No changes, still ready to go in. Might need a rebase if Jiri's
patches are applied.

>
>   - another patch set of mine, switching uretprobe fast path to SRCU
> (with timeout) ([8]). This makes return uprobes (uretprobes) linearly
> scalable in the common case (again, the ultimate scalability goal).
>
> I haven't gotten much feedback here and would love to get some
> objective review. This is an important counterpart to the speculative
> VMA-to-uprobe lookup series. Both are needed in practice.
>

The only thing that has progressed, thank you. I'll apply the suggested
state changes, but I intend to postpone the delayed_uprobe_lock rework to
a separate follow-up patch set. Just a heads up.

>   - a patch set dropping unnecessary siglock usage in uprobes by Liao
> Chang ([9]). This one removes yet another lock, for the less common case
> (at least on x86-64) of a single-stepped uprobe (where the probed
> instruction can't be emulated).
>
> This one needs a rebase, but it was already acked by Oleg. Liao,
> please prioritize the rebase and send v4 ASAP, so this is not lost.
>

This was rebased and acked by Masami. Seems to be ready to be applied.

>
> As you can see, lots of stuff needs to be landed and most of it is in
> good shape already. I'd love to hear the thoughts of the relevant people
> called out above, thank you!
>
>
>   [0] https://lore.kernel.org/linux-trace-kernel/20240729134444.GA12293@redhat.com/
>   [1] https://lore.kernel.org/linux-trace-kernel/20240929144201.GA9429@redhat.com/
>   [2] https://lore.kernel.org/linux-trace-kernel/20240903174603.3554182-1-andrii@kernel.org/
>   [3] https://lore.kernel.org/linux-trace-kernel/20240909071114.1150053-1-liaochang1@huawei.com/
>   [4] https://lore.kernel.org/linux-trace-kernel/20240523121149.575616-1-jolsa@kernel.org/
>   [5] https://lore.kernel.org/bpf/20241015091050.3731669-1-jolsa@kernel.org/
>   [6] https://lore.kernel.org/linux-trace-kernel/20240910060407.1427716-1-liaochang1@huawei.com/
>   [7] https://lore.kernel.org/linux-trace-kernel/20241010205644.3831427-1-andrii@kernel.org/
>   [8] https://lore.kernel.org/linux-trace-kernel/20241008002556.2332835-1-andrii@kernel.org/
>   [9] https://lore.kernel.org/linux-trace-kernel/20240815014629.2685155-1-liaochang1@huawei.com/
>
> -- Andrii
