[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4Bzb7gQEKk26B4-KXS_ht8LyCRA7SdThc7ZS05gOEuNZjrQ@mail.gmail.com>
Date: Tue, 20 Aug 2024 11:01:13 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Andrii Nakryiko <andrii@...nel.org>, linux-trace-kernel@...r.kernel.org,
peterz@...radead.org, rostedt@...dmis.org, mhiramat@...nel.org,
bpf@...r.kernel.org, linux-kernel@...r.kernel.org, jolsa@...nel.org,
paulmck@...nel.org, willy@...radead.org, surenb@...gle.com,
akpm@...ux-foundation.org, linux-mm@...ck.org
Subject: Re: [PATCH RFC v3 09/13] uprobes: SRCU-protect uretprobe lifetime
(with timeout)
On Tue, Aug 20, 2024 at 8:06 AM Oleg Nesterov <oleg@...hat.com> wrote:
>
> On 08/19, Andrii Nakryiko wrote:
> >
> > On Mon, Aug 19, 2024 at 6:41 AM Oleg Nesterov <oleg@...hat.com> wrote:
> > >
> > > On 08/12, Andrii Nakryiko wrote:
> > > >
> > > > Avoid taking refcount on uprobe in prepare_uretprobe(), instead take
> > > > uretprobe-specific SRCU lock and keep it active as kernel transfers
> > > > control back to user space.
> > > ...
> > > > include/linux/uprobes.h | 49 ++++++-
> > > > kernel/events/uprobes.c | 294 ++++++++++++++++++++++++++++++++++------
> > > > 2 files changed, 301 insertions(+), 42 deletions(-)
> > >
> > > Oh. To be honest I don't like this patch.
> > >
> > > I would like to know what other reviewers think, but to me it adds too many
> > > complications that I can't even fully understand...
> >
> > Which parts? The atomic xchg() and cmpxchg() parts? What exactly do
> > you feel like you don't fully understand?
>
> Heh, everything looks too complex for me ;)
Well, the best code is no code. But I'm not doing this just for fun,
so I'm happy with the simplest solution *that works*.
>
> Say, hprobe_expire(). It is also called by ri_timer() in softirq context,
> right? And it does
>
> /* We lost the race, undo our refcount bump. It can drop to zero. */
> put_uprobe(uprobe);
>
> How so? If the refcount goes to zero, put_uprobe() does mutex_lock(),
> but it must not sleep in softirq context.
>
Now we are talking about specific issues, thank you! It's hard to
discuss "I don't like".
Yes, I think you are right and it is certainly a problem with
put_uprobe() that it can't be called from softirq (just having to
remember that is error-prone, as is evidenced by me forgetting about
this issue).
It's easy enough to solve. We can either schedule work from timer
thread (and that will solve this particular issue only), or we can
teach put_uprobe() to schedule work item if it drops refcount to zero
from softirq and other restricted contexts.
I vote for making put_uprobe() flexible in this regard, add
work_struct to uprobe, and schedule all this to be done in the work
queue callback. WDYT?
>
> Or, prepare_uretprobe() which does
>
> rcu_assign_pointer(utask->return_instances, ri);
>
> if (!timer_pending(&utask->ri_timer))
> mod_timer(&utask->ri_timer, ...);
>
> Suppose that the timer was pending and it was fired right before
> rcu_assign_pointer(). What guarantees that prepare_uretprobe() will see
> timer_pending() == false?
>
> rcu_assign_pointer()->smp_store_release() is a one-way barrier. This
> timer_pending() check may appear to happen before rcu_assign_pointer()
> completes.
>
> In this (yes, theoretical) case ri_timer() can miss the new return_instance,
> while prepare_uretprobe() can miss the necessary mod_timer(). I think this
> needs another mb() in between.
>
Ok, that's fair. I felt like this pattern might be a bit problematic,
but I also felt like it's good to have to ensure that we do
occasionally have a race between timer callback and uretprobe, even if
uretprobe returns very quickly. (and I did confirm we get those races
and they seem to be handled fine, i.e., I saw uprobes being "expired"
into refcounted ones from ri_timer)
But the really simple way to solve this is to do unconditional
mod_timer(), so I can do just that to keep this less tricky. Would you
be ok with that?
>
> And I can't convince myself hprobe_expire() is correct... OK, I don't
> fully understand the logic, but why data_race(READ_ONCE(hprobe->leased)) ?
> READ_ONCE() should be enough in this case?
you mean why data_race() annotation? To appease KCSAN, given that we
modify hprobe->leased with xchg/cmpxchg, but read here with
READ_ONCE(). Maybe I'm overthinking it, not sure. There is a reason
why this is an RFC ;)
>
>
> > > As I have already mentioned in the previous discussions, we can simply kill
> > > utask->active_uprobe. And utask->auprobe.
> >
> > I don't have anything against that, in principle, but let's benchmark
> > and test that thoroughly. I'm a bit uneasy about the possibility that
> > some arch-specific code will do container_of() on this arch_uprobe in
> > order to get to uprobe,
>
> Well, struct uprobe is not "exported", the arch-specific code can't do this.
>
Ah, good point, that's great. But as I said, uretprobe is actually
*the common* use case, not single-stepped uprobe. Still not very happy
about that memcpy() and assumption that it's cheap, but that's minor.
But no matter what we do for single-stepped one, uretprobe needs some
solution. (and if that solution works for uretprobe, why wouldn't it
work for single-step?...)
> Oleg.
>
Powered by blists - more mailing lists