lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzaycyKxX7KFim0_C+FobPSDcfahUGA4mgfiz_d81noSGQ@mail.gmail.com>
Date: Mon, 15 Jul 2024 10:34:22 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...nel.org, andrii@...nel.org, oleg@...hat.com, 
	linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org, 
	rostedt@...dmis.org, mhiramat@...nel.org, jolsa@...nel.org, clm@...a.com, 
	paulmck@...nel.org, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH v2 11/11] perf/uprobe: Add uretprobe timer

On Mon, Jul 15, 2024 at 4:41 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Fri, Jul 12, 2024 at 02:43:52PM -0700, Andrii Nakryiko wrote:
> > + bpf
> >
> > On Thu, Jul 11, 2024 at 4:07 AM Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > In order to put a bound on the uretprobe_srcu critical section, add a
> > > timer to uprobe_task. Upon every RI added or removed the timer is
> > > pushed forward to now + 1s. If the timer were ever to fire, it would
> > > convert the SRCU 'reference' to a refcount reference if possible.
> > >
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> > > ---
> > >  include/linux/uprobes.h |    8 +++++
> > >  kernel/events/uprobes.c |   67 ++++++++++++++++++++++++++++++++++++++++++++----
> > >  2 files changed, 69 insertions(+), 6 deletions(-)
> > >
> > > --- a/include/linux/uprobes.h
> > > +++ b/include/linux/uprobes.h
> > > @@ -15,6 +15,7 @@
> > >  #include <linux/rbtree.h>
> > >  #include <linux/types.h>
> > >  #include <linux/wait.h>
> > > +#include <linux/timer.h>
> > >
> > >  struct vm_area_struct;
> > >  struct mm_struct;
> > > @@ -79,6 +80,10 @@ struct uprobe_task {
> > >         struct return_instance          *return_instances;
> > >         unsigned int                    depth;
> > >         unsigned int                    active_srcu_idx;
> > > +
> > > +       struct timer_list               ri_timer;
> > > +       struct callback_head            ri_task_work;
> > > +       struct task_struct              *task;
> > >  };
> > >
> > >  struct return_instance {
> > > @@ -86,7 +91,8 @@ struct return_instance {
> > >         unsigned long           func;
> > >         unsigned long           stack;          /* stack pointer */
> > >         unsigned long           orig_ret_vaddr; /* original return address */
> > > -       bool                    chained;        /* true, if instance is nested */
> > > +       u8                      chained;        /* true, if instance is nested */
> > > +       u8                      has_ref;
> >
> > Why bool -> u8 switch? You don't touch chained, so why change its
> > type? And for has_ref you interchangeably use 0 and true for the same
> > field. Let's stick to bool as there is nothing wrong with it?
>
> sizeof(_Bool) is implementation defined. It is 1 for x86_64, but there
> are platforms where it ends up begin 4 (some PowerPC ABIs among others.
> I didn't want to grow this structure for no reason.

There are tons of bools in the kernel, surprised that we (kernel
makefiles) don't do anything on PowerPC to keep it consistent with the
rest of the world. Oh well, it just kind of looks off when there is a
mix of 0 and true used for the same field.

>
> > >         int                     srcu_idx;
> > >
> > >         struct return_instance  *next;          /* keep as stack */
> >
> > [...]
> >
> > > @@ -1822,13 +1864,20 @@ static int dup_utask(struct task_struct
> > >                         return -ENOMEM;
> > >
> > >                 *n = *o;
> > > -               __srcu_clone_read_lock(&uretprobes_srcu, n->srcu_idx);
> > > +               if (n->uprobe) {
> > > +                       if (n->has_ref)
> > > +                               get_uprobe(n->uprobe);
> > > +                       else
> > > +                               __srcu_clone_read_lock(&uretprobes_srcu, n->srcu_idx);
> > > +               }
> > >                 n->next = NULL;
> > >
> > >                 *p = n;
> > >                 p = &n->next;
> > >                 n_utask->depth++;
> > >         }
> > > +       if (n_utask->return_instances)
> > > +               mod_timer(&n_utask->ri_timer, jiffies + HZ);
> >
> > let's add #define for HZ, so it's adjusted in just one place (instead
> > of 3 as it is right now)
>
> Can do I suppose.

thanks!

>
> > Also, we can have up to 64 levels of uretprobe nesting, so,
> > technically, the user can cause a delay of 64 seconds in total. Maybe
> > let's use something smaller than a full second? After all, if the
> > user-space function has high latency, then this refcount congestion is
> > much less of a problem. I'd set it to something like 50-100 ms for
> > starters.
>
> Before you know it we'll have a sysctl :/ But sure, we can do something
> shorter.

:) let's hope we won't need sysctl (I don't think we will, FWIW)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ