Message-ID: <20200828084411.GP1362448@hirez.programming.kicks-ass.net>
Date: Fri, 28 Aug 2020 10:44:11 +0200
From: peterz@...radead.org
To: Masami Hiramatsu <mhiramat@...nel.org>
Cc: linux-kernel@...r.kernel.org, Eddy_Wu@...ndmicro.com,
x86@...nel.org, davem@...emloft.net, rostedt@...dmis.org,
naveen.n.rao@...ux.ibm.com, anil.s.keshavamurthy@...el.com,
linux-arch@...r.kernel.org, cameron@...dycamel.com,
oleg@...hat.com, will@...nel.org, paulmck@...nel.org
Subject: Re: [RFC][PATCH 3/7] kprobes: Remove kretprobe hash
On Fri, Aug 28, 2020 at 03:00:59AM +0900, Masami Hiramatsu wrote:
> On Thu, 27 Aug 2020 18:12:40 +0200
> Peter Zijlstra <peterz@...radead.org> wrote:
>
> > +static void invalidate_rp_inst(struct task_struct *t, struct kretprobe *rp)
> > +{
> > + struct invl_rp_ipi iri = {
> > + .task = t,
> > + .rp = rp,
> > + .done = false
> > + };
> > +
> > + for (;;) {
> > + if (try_invoke_on_locked_down_task(t, __invalidate_rp_inst, rp))
> > + return;
> > +
> > + smp_call_function_single(task_cpu(t), __invalidate_rp_ipi, &iri, 1);
> > + if (iri.done)
> > + return;
> > + }
>
> Hmm, what about making a status placeholder and pointing to it from
> each instance to tell whether it is still valid?
>
> struct kretprobe_holder {
> atomic_t refcnt;
> struct kretprobe *rp;
> };
>
> struct kretprobe {
> ...
> struct kretprobe_holder *rph; // allocate at register
> ...
> };
>
> struct kretprobe_instance {
> ...
> struct kretprobe_holder *rph; // free if refcnt == 0
> ...
> };
>
> cleanup_rp_inst(struct kretprobe *rp)
> {
> rp->rph->rp = NULL;
> }
>
> kretprobe_trampoline_handler()
> {
> ...
> rp = READ_ONCE(ri->rph->rp);
> if (likely(rp)) {
> // call rp->handler
> } else
> call_rcu(&ri->rcu, free_rp_inst_rcu);
> ...
> }
>
> free_rp_inst_rcu()
> {
> if (!atomic_dec_return(&ri->rph->refcnt))
> kfree(ri->rph);
> kfree(ri);
> }
>
> This increases the size of kretprobe_instance a bit, but makes things simpler
> (and still keeps it lockless; the atomic op is in the RCU callback).
Yes, much better.
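
To make the lifetime rules of the holder idea concrete, here is a minimal userspace sketch using C11 atomics. It is not the kernel implementation. It assumes that the probe itself holds one reference on the holder in addition to one per instance, it frees every instance right after the return handler instead of recycling it, and it frees immediately where the quoted sketch defers to an RCU callback. The helper names (holder_put, probe_register, instance_alloc, probe_unregister, trampoline_handler, holder_demo) and the hits counter are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct kretprobe;

/* Shared by a probe and all of its in-flight instances. */
struct kretprobe_holder {
	atomic_int refcnt;             /* assumption: probe + each instance hold one ref */
	struct kretprobe *_Atomic rp;  /* set to NULL when the probe is unregistered */
};

struct kretprobe {
	int hits;                      /* hypothetical stand-in for calling rp->handler */
	struct kretprobe_holder *rph;
};

struct kretprobe_instance {
	struct kretprobe_holder *rph;
};

/* Drop one reference; the last user frees the holder. */
static void holder_put(struct kretprobe_holder *rph)
{
	/* atomic_fetch_sub() returns the previous value, so 1 means "last ref". */
	if (atomic_fetch_sub(&rph->refcnt, 1) == 1)
		free(rph);
}

static bool probe_register(struct kretprobe *rp)
{
	struct kretprobe_holder *rph = malloc(sizeof(*rph));

	if (!rph)
		return false;
	atomic_init(&rph->refcnt, 1);  /* the probe's own reference */
	atomic_init(&rph->rp, rp);
	rp->hits = 0;
	rp->rph = rph;
	return true;
}

static struct kretprobe_instance *instance_alloc(struct kretprobe *rp)
{
	struct kretprobe_instance *ri = malloc(sizeof(*ri));

	if (!ri)
		return NULL;
	ri->rph = rp->rph;
	atomic_fetch_add(&ri->rph->refcnt, 1);
	return ri;
}

/* Invalidate every outstanding instance with a single store. */
static void probe_unregister(struct kretprobe *rp)
{
	atomic_store(&rp->rph->rp, (struct kretprobe *)NULL);
	holder_put(rp->rph);
	rp->rph = NULL;
}

/* Returns true if the handler ran, false if the probe was already gone. */
static bool trampoline_handler(struct kretprobe_instance *ri)
{
	struct kretprobe *rp = atomic_load(&ri->rph->rp);
	bool ran = false;

	if (rp) {
		rp->hits++;
		ran = true;
	}
	/* The quoted sketch defers this to an RCU callback; here it is immediate. */
	holder_put(ri->rph);
	free(ri);
	return ran;
}

/* Returns the number of handler invocations, or -1 on failure. */
static int holder_demo(void)
{
	struct kretprobe rp;
	struct kretprobe_instance *a, *b;
	bool first, second;

	if (!probe_register(&rp))
		return -1;
	a = instance_alloc(&rp);
	b = instance_alloc(&rp);
	if (!a || !b)
		return -1;

	first = trampoline_handler(a);   /* probe still registered: handler runs */
	probe_unregister(&rp);
	second = trampoline_handler(b);  /* probe gone: skipped, last ref frees holder */

	return (first && !second) ? rp.hits : -1;
}
```

The point of the design is that unregistering never has to chase down instances on other tasks: one store to the holder makes all of them stale, and whichever user drops the last reference frees the holder.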
Although I'd _love_ to get rid of rp->data_size, then we can simplify
all of this even more. I was thinking we could then have a single global
freelist thing and add some per-cpu cache to it (say 4-8 entries) to
avoid the worst contention.
And then make function-graph use this, instead of the other way around
:-)
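
A rough userspace sketch of the "global freelist plus small per-cpu cache" idea, assuming fixed-size entries (which is presumably what dropping rp->data_size would allow). Everything here is hypothetical: the names, NR_CPUS, the cache size of 8, and the spinlock around the global list. It also assumes that only the owning CPU ever touches its own cache, which a kernel version would have to guarantee by other means.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS    4   /* hypothetical CPU count for the sketch */
#define CACHE_SIZE 8   /* upper end of the "4-8 entries" suggested above */

struct node {
	struct node *next;
};

/* Global freelist, protected by a simple spinlock (assumption). */
static struct node *global_head;
static atomic_flag global_lock = ATOMIC_FLAG_INIT;
static int global_ops;  /* counts trips to the contended global list */

/* Per-cpu cache; only the owning CPU is assumed to touch it. */
struct cpu_cache {
	struct node *slot[CACHE_SIZE];
	int n;
};
static struct cpu_cache caches[NR_CPUS];

static void global_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&global_lock, memory_order_acquire))
		;  /* spin */
}

static void global_lock_release(void)
{
	atomic_flag_clear_explicit(&global_lock, memory_order_release);
}

static void global_push(struct node *n)
{
	global_lock_acquire();
	n->next = global_head;
	global_head = n;
	global_ops++;
	global_lock_release();
}

static struct node *global_pop(void)
{
	struct node *n;

	global_lock_acquire();
	n = global_head;
	if (n)
		global_head = n->next;
	global_ops++;
	global_lock_release();
	return n;
}

/* Fast path: take from the local cache; slow path: fall back to the global list. */
static struct node *fl_alloc(int cpu)
{
	struct cpu_cache *c = &caches[cpu];

	if (c->n > 0)
		return c->slot[--c->n];
	return global_pop();
}

/* Fast path: park in the local cache; slow path: spill to the global list. */
static void fl_free(int cpu, struct node *n)
{
	struct cpu_cache *c = &caches[cpu];

	if (c->n < CACHE_SIZE) {
		c->slot[c->n++] = n;
		return;
	}
	global_push(n);
}

/* Returns how often 100 alloc/free pairs on CPU 0 hit the global list. */
static int freelist_demo(void)
{
	static struct node pool[16];

	/* Reset state so the demo can be called more than once. */
	global_head = NULL;
	caches[0].n = 0;
	for (size_t i = 0; i < 16; i++)
		global_push(&pool[i]);
	global_ops = 0;

	for (int i = 0; i < 100; i++) {
		struct node *n = fl_alloc(0);

		assert(n != NULL);
		fl_free(0, n);
	}
	return global_ops;
}
```

In this sketch only the first allocation reaches the global list; every later alloc/free pair is served from the local cache, which is the contention the per-cpu layer is meant to avoid.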