lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzbiDi8UOryvP_OHwbfF5VEBtGNzTnFRwuJ8PN+rBRxi+g@mail.gmail.com>
Date: Wed, 13 Mar 2024 10:01:00 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Andrii Nakryiko <andrii@...nel.org>, linux-trace-kernel@...r.kernel.org, 
	rostedt@...dmis.org, mhiramat@...nel.org, bpf@...r.kernel.org, 
	mathieu.desnoyers@...icios.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next 3/3] uprobes: add speculative lockless
 system-wide uprobe filter check

On Wed, Mar 13, 2024 at 6:20 AM Oleg Nesterov <oleg@...hat.com> wrote:
>
> I forgot everything about this code, plus it has changed a lot since
> I looked at it many years ago, but ...
>
> I think this change is fine but the changelog looks a bit confusing
> (overcomplicated) to me.

It's a new piece of code and logic, so I tried to do my due diligence
and argue why I think it's fine. I'll drop the overcomplicated
explanation, as I agree with you that it's inherently racy even
without my changes (and use-after-free safety is provided with
uprobe->register_rwsem independent from all this).

>
> On 03/12, Andrii Nakryiko wrote:
> >
> > This patch adds a speculative check before grabbing that rwlock. If
> > nr_systemwide is non-zero, lock is skipped and event is passed through.
> > From examining existing logic it looks correct and safe to do. If
> > nr_systemwide is being modified under rwlock in parallel, we have to
> > consider basically just one important race condition: the case when
> > nr_systemwide is dropped from one to zero (from
> > trace_uprobe_filter_remove()) under filter->rwlock, but
> > uprobe_perf_filter() raced and saw it as >0.
>
> Unless I am totally confused, there is nothing new. Even without
> this change trace_uprobe_filter_remove() can clear nr_systemwide
> right after uprobe_perf_filter() drops filter->rwlock.
>
> And of course, trace_uprobe_filter_add() can change nr_systemwide
> from 0 to 1. In this case uprobe_perf_func() can "wrongly" return
> UPROBE_HANDLER_REMOVE but we can't avoid this and afaics this is
> fine even if handler_chain() does unapply_uprobe(), uprobe_perf_open()
> will do uprobe_apply() after that, we can rely on ->register_rwsem.
>

yep, agreed

> > In case we speculatively read nr_systemwide as zero, while it was
> > incremented in parallel, we'll proceed to grabbing filter->rwlock and
> > re-doing the check, this time in lock-protected and non-racy way.
>
> See above...
>
>
> So I think uprobe_perf_filter() needs filter->rwlock only to iterate
> the list, it can check nr_systemwide lockless and this means that you
> can also remove the same check in __uprobe_perf_filter(), other callers
> trace_uprobe_filter_add/remove check it themselves.
>

makes sense, will do

>
> > --- a/kernel/trace/trace_uprobe.c
> > +++ b/kernel/trace/trace_uprobe.c
> > @@ -1351,6 +1351,10 @@ static bool uprobe_perf_filter(struct uprobe_consumer *uc,
> >       tu = container_of(uc, struct trace_uprobe, consumer);
> >       filter = tu->tp.event->filter;
> >
> > +     /* speculative check */
> > +     if (READ_ONCE(filter->nr_systemwide))
> > +             return true;
> > +
> >       read_lock(&filter->rwlock);
> >       ret = __uprobe_perf_filter(filter, mm);
> >       read_unlock(&filter->rwlock);
>
> ACK,
>
> but see above. I think the changelog should be simplified and the
> filter->nr_systemwide check in __uprobe_perf_filter() should be
> removed. But I won't insist and perhaps I missed something...
>

I think you are right, I'll move the check

> Oleg.
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ