linux-kernel - Re: [PATCH 1/3] uprobes: allow put_uprobe() from non-sleepable softirq context

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAEf4BzZ7=NFAUB_GzAt1SCO=LnCFSbqX_NThtjrs8EfkjBUr7A@mail.gmail.com>
Date: Tue, 17 Sep 2024 10:19:52 +0200
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Andrii Nakryiko <andrii@...nel.org>, linux-trace-kernel@...r.kernel.org, 
	peterz@...radead.org, rostedt@...dmis.org, mhiramat@...nel.org, 
	bpf@...r.kernel.org, linux-kernel@...r.kernel.org, jolsa@...nel.org, 
	paulmck@...nel.org
Subject: Re: [PATCH 1/3] uprobes: allow put_uprobe() from non-sleepable
 softirq context

On Sun, Sep 15, 2024 at 4:49 PM Oleg Nesterov <oleg@...hat.com> wrote:
>
> On 09/09, Andrii Nakryiko wrote:
> >
> > Currently put_uprobe() might trigger mutex_lock()/mutex_unlock(), which
> > makes it unsuitable to be called from more restricted context like softirq.
> >
> > Let's make put_uprobe() agnostic to the context in which it is called,
> > and use work queue to defer the mutex-protected clean up steps.
>
> ...
>
> > +static void uprobe_free_deferred(struct work_struct *work)
> > +{
> > +     struct uprobe *uprobe = container_of(work, struct uprobe, work);
> > +
> > +     /*
> > +      * If application munmap(exec_vma) before uprobe_unregister()
> > +      * gets called, we don't get a chance to remove uprobe from
> > +      * delayed_uprobe_list from remove_breakpoint(). Do it here.
> > +      */
> > +     mutex_lock(&delayed_uprobe_lock);
> > +     delayed_uprobe_remove(uprobe, NULL);
> > +     mutex_unlock(&delayed_uprobe_lock);
> > +
> > +     kfree(uprobe);
> > +}
> > +
> >  static void uprobe_free_rcu(struct rcu_head *rcu)
> >  {
> >       struct uprobe *uprobe = container_of(rcu, struct uprobe, rcu);
> >
> > -     kfree(uprobe);
> > +     INIT_WORK(&uprobe->work, uprobe_free_deferred);
> > +     schedule_work(&uprobe->work);
> >  }
>
> This is still wrong afaics...
>
> If put_uprobe() can be called from softirq (after the next patch), then
> put_uprobe() and all other users of uprobes_treelock should use
> write_lock_bh/read_lock_bh to avoid the deadlock.

Ok, I see the problem, that's unfortunate.

I see three ways to handle that:

1) keep put_uprobe() as is, and instead do schedule_work() from the
timer thread to postpone put_uprobe(). (but I'm not a big fan of this)
2) move uprobes_treelock part of put_uprobe() into rcu callback, I
think it has no bearing on correctness, uprobe_is_active() is there
already to handle races between putting uprobe and removing it from
uprobes_tree (I prefer this one over #1 )
3) you might like this the most ;) I think I can simplify
hprobes_expire() from patch #2 to not need put_uprobe() at all, if I
protect uprobe lifetime with non-sleepable
rcu_read_lock()/rcu_read_unlock() and perform try_get_uprobe() as the
very last step after cmpxchg() succeeded.

I'm leaning towards #3, but #2 seems fine to me as well.

>
> To be honest... I simply can't force myself to even try to read 2/3 ;) I'll
> try to do this later, but I am sure I will never like it, sorry.

This might sound rude, but the goal here is not to make you like it :)
The goal is to improve performance with minimal complexity. And I'm
very open to any alternative proposals as to how to make uretprobes
RCU-protected to avoid refcounting in the hot path.

I think #3 proposal above will make it a bit more palatable (but there
is still locklessness, cmpxchg, etc, I see no way around that,
unfortunately).

>
> Oleg.
>