[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240704091559.GS11386@noisy.programming.kicks-ass.net>
Date: Thu, 4 Jul 2024 11:15:59 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: Andrii Nakryiko <andrii@...nel.org>, linux-trace-kernel@...r.kernel.org,
rostedt@...dmis.org, mhiramat@...nel.org, oleg@...hat.com,
mingo@...hat.com, bpf@...r.kernel.org, jolsa@...nel.org,
paulmck@...nel.org, clm@...a.com,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 00/12] uprobes: add batched register/unregister APIs
and per-CPU RW semaphore
On Wed, Jul 03, 2024 at 02:33:06PM -0700, Andrii Nakryiko wrote:
> 2. More tactically, RCU protection seems like the best way forward. We
> got hung up on SRCU vs RCU Tasks Trace. Thanks to Paul, we also
> clarified that RCU Tasks Trace has nothing to do with Tasks Rude
> flavor (whatever that is, I have no idea).
>
> Now, RCU Tasks Trace were specifically designed for least overhead
> hotpath (reader side) performance, at the expense of slowing down much
> rarer writers. My microbenchmarking does show at least 5% difference.
> Both flavors can handle sleepable uprobes waiting for page faults.
> Tasks Trace flavor is already used for tracing in the BPF realm,
> including for sleepable uprobes and works well. It's not going away.
I need to look into this new RCU flavour and why it exists -- for
example, why can't SRCU be improved to gain the same benefits. This is
what we've always done, improve SRCU.
> Now, you keep pushing for SRCU instead of RCU Tasks Trace, but I
> haven't seen a single argument why. Please provide that, or let's
> stick to RCU Tasks Trace, because uprobe's use case is an ideal case
> of what Tasks Trace flavor was designed for.
Because I actually know SRCU, and because it provides a local scope.
It isolates the unregister waiters from other random users. I'm not
going to use this funky new flavour until I truly understand it.
Also, we actually want two scopes here, there is no reason for the
consumer unreg to wait for the retprobe stuff.
> 3. Regardless of RCU flavor, due to RCU protection, we have to add
> batched register/unregister APIs, so we can amortize sync_rcu cost
> during deregistration. Can we please agree on that as well? This is
> the main goal of this patch set and I'd like to land it before working
> further on changing and improving the rest of the locking schema.
See my patch here:
https://lkml.kernel.org/r/20240704084524.GC28838@noisy.programming.kicks-ass.net
I don't think it needs to be more complicated than that.
> I won't be happy about it, but just to move things forward, I can drop
> a) custom refcounting and/or b) percpu RW semaphore. Both are
> beneficial but not essential for batched APIs work. But if you force
> me to do that, please state clearly your reasons/arguments.
The reason I'm pushing RCU here is because AFAICT uprobes doesn't
actually need the stronger serialisation that rwlock (any flavour)
provide. It is a prime candidate for RCU, and I think you'll find plenty
papers / articles (by both Paul and others) that show that RCU scales
better.
As a bonus, you avoid that horrific write side cost that per-cpu rwsem
has.
The reason I'm not keen on that refcount thing was initially because I
did not understand the justification for it, but worse, once I did read
your justification, your very own numbers convinced me that the refcount
is fundamentally problematic, in any way shape or form.
> No one had yet pointed out why refcounting is broken
Your very own numbers point out that refcounting is a problem here.
> and why percpu RW semaphore is bad.
Literature and history show us that RCU -- where possible -- is
always better than any reader-writer locking scheme.
> 4. Another tactical thing, but an important one. Refcounting schema
> for uprobes. I've replied already, but I think refcounting is
> unavoidable for uretprobes,
I think we can fix that, I replied here:
https://lkml.kernel.org/r/20240704083152.GQ11386@noisy.programming.kicks-ass.net
> and current refcounting schema is
> problematic for batched APIs due to race between finding uprobe and
> there still being a possibility we'd need to undo all that and retry
> again.
Right, I've not looked too deeply at that, because I've not seen a
reason to actually change that. I can go think about it if you want, but
meh.
Powered by blists - more mailing lists