Message-Id: <20240808064353.7470f6bfab89bd28dbcdebe0@kernel.org>
Date: Thu, 8 Aug 2024 06:43:53 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: Jiri Olsa <olsajiri@...il.com>, Andrii Nakryiko <andrii@...nel.org>,
 linux-trace-kernel@...r.kernel.org, rostedt@...dmis.org,
 mhiramat@...nel.org, peterz@...radead.org, oleg@...hat.com,
 bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] uprobes: get rid of bogus trace_uprobe hit counter

On Tue, 6 Aug 2024 10:26:25 -0700
Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:

> On Tue, Aug 6, 2024 at 12:37 AM Jiri Olsa <olsajiri@...il.com> wrote:
> >
> > On Mon, Aug 05, 2024 at 01:28:03PM -0700, Andrii Nakryiko wrote:
> > > The trace_uprobe->nhit counter is not incremented atomically, so its value
> > > is bogus in practice. Worse, the increment is a pretty big uprobe
> > > scalability problem due to heavy cache line bouncing between the CPUs
> > > triggering the same uprobe.
> >
> > so you're seeing that in the benchmark, right? I'm curious how bad
> > the numbers are
> >
> 
> Yes. Once we get rid of all the uprobe/uretprobe/mm locks (ongoing
> work), this counter is the last limiter to linear scalability.
> 
> With the counter in place, I was topping out at about 12 mln/s of
> uprobe triggerings (I think it was 32 CPUs, but I don't remember
> exactly now). About 30% of CPU cycles were spent in this increment.
> 
> But those 30% don't paint the full picture. Once the counter is
> removed, the same uprobe throughput jumps to 62 mln/s or so. So we
> definitely have to do something about it.
> 
> > >
> > > Drop it and emit an obviously unrealistic value in its stead in the
> > > uprobe_profile seq file.
> > >
> > > The alternative would be allocating a per-CPU counter, but I'm not sure
> > > it's justified.
> > >
> > > Signed-off-by: Andrii Nakryiko <andrii@...nel.org>
> > > ---
> > >  kernel/trace/trace_uprobe.c | 4 +---
> > >  1 file changed, 1 insertion(+), 3 deletions(-)
> > >
> > > diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> > > index 52e76a73fa7c..5d38207db479 100644
> > > --- a/kernel/trace/trace_uprobe.c
> > > +++ b/kernel/trace/trace_uprobe.c
> > > @@ -62,7 +62,6 @@ struct trace_uprobe {
> > >       struct uprobe                   *uprobe;
> > >       unsigned long                   offset;
> > >       unsigned long                   ref_ctr_offset;
> > > -     unsigned long                   nhit;
> > >       struct trace_probe              tp;
> > >  };
> > >
> > > @@ -821,7 +820,7 @@ static int probes_profile_seq_show(struct seq_file *m, void *v)
> > >
> > >       tu = to_trace_uprobe(ev);
> > >       seq_printf(m, "  %s %-44s %15lu\n", tu->filename,
> > > -                     trace_probe_name(&tu->tp), tu->nhit);
> > > +                trace_probe_name(&tu->tp), ULONG_MAX);
> >
> > seems harsh.. would it be that bad to create per cpu counter for that?
> 
> Well, consider this patch a conversation starter. There are two
> reasons why I'm removing the counter instead of doing a per-CPU one:
> 
>   - it's less work to send out a patch pointing out the problem (but
> the solution might change)
>   - this counter was never correct in the presence of multiple
> threads, so I'm not sure how useful it is.
> 
> Yes, I think we can do per-CPU counters, but do we want to pay the
> memory price? That's what I want to get from Masami, Steven, or Peter
> (whoever cares enough).

I would like to make it a per-cpu counter *and* make it kconfig optional.
Or just remove it together with the file (but that changes the user
interface without leaving an option).

For kprobes, the profile file is useful because it shows the "missed"
counter, which tells the user whether the trace data has dropped any
events. But if the uprobe profile only shows the number of hits, we can
just use a histogram trigger (e.g. 'hist:keys=common_pid' on the uprobe
event) instead if needed.

Thank you,

> 
> >
> > jirka
> >
> > >       return 0;
> > >  }
> > >
> > > @@ -1507,7 +1506,6 @@ static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs)
> > >       int ret = 0;
> > >
> > >       tu = container_of(con, struct trace_uprobe, consumer);
> > > -     tu->nhit++;
> > >
> > >       udd.tu = tu;
> > >       udd.bp_addr = instruction_pointer(regs);
> > > --
> > > 2.43.5
> > >


-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>
