Message-ID: <20150616204711.0e6ea1d7@grimm.local.home>
Date: Tue, 16 Jun 2015 20:47:11 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Alexei Starovoitov <ast@...mgrid.com>
Cc: Daniel Wagner <wagi@...om.org>, paulmck@...ux.vnet.ibm.com,
Daniel Wagner <daniel.wagner@...-carit.de>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: call_rcu from trace_preempt
On Tue, 16 Jun 2015 17:33:24 -0700
Alexei Starovoitov <ast@...mgrid.com> wrote:
> On 6/16/15 10:37 AM, Steven Rostedt wrote:
> >>> + kfree(l);
> >> >
> >> >that's not right, since such thread defeats rcu protection of lookup.
> >> >We need either kfree_rcu/call_rcu or synchronize_rcu.
> >> >Obviously the former is preferred that's why I'm still digging into it.
> >> >Probably a thread that does kfree_rcu would be ok, but we shouldn't
> >> >be doing it unconditionally. For all networking programs and 99%
> >> >of tracing programs the existing code is fine and I don't want to
> >> >slow it down to tackle the corner case.
> >> >Extra spin_lock just to add it to the list is also quite costly.
> > Use an irq_work() handler to do the kfree_rcu(), and use llist (lockless
> > list) to add items to the list.
>
> have been studying irq_work and llist... it will work, but it's quite
> costly too. Every kfree_rcu will be replaced with irq_work_queue(),
> which is irq_work_claim() with one lock_cmpxchg plus another
> lock_cmpxchg in llist_add, plus another lock_cmpxchg for our own llist
> of 'to be kfree_rcu-ed htab elements'. That's a lot.
> There must be a better solution. Need to explore more.
Do what I do in tracing. Use a bit (per cpu?) test.
Add the element to the list (that will be a cmpxchg, but I'm not sure
you can avoid it), then check the bit to see if the irq work has already
been activated. If not, then activate the irq work and set the bit.
Then you will not have any more cmpxchg in the fast path.
In your irq work handler, you clear the bit, process all the entries
until they are empty, check if the bit is set again, and repeat.
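A rough userspace sketch of the scheme described above, using C11 atomics. This is not kernel code. Everything named here (struct elem, defer_free(), queue_work(), work_handler(), demo()) is hypothetical: a counter stands in for irq_work_queue(), free() stands in for kfree_rcu(), and the flag models a single CPU's "work pending" bit. The plain test-then-set on the flag assumes the flag is per cpu, as suggested above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element waiting to be freed. */
struct elem {
	struct elem *next;
	int val;
};

static _Atomic(struct elem *) pending_list; /* lockless LIFO of elements */
static atomic_bool work_pending;            /* the "bit": work already activated? */
static int work_queued;                     /* how many times work was activated */
static int freed;                           /* how many elements were released */

/* Stand-in for irq_work_queue(): only counts activations. */
static void queue_work(void)
{
	work_queued++;
}

/* Producer side: one cmpxchg loop to push, then a cheap flag test. */
static void defer_free(struct elem *e)
{
	struct elem *old = atomic_load(&pending_list);

	assert(e != NULL);
	do {
		e->next = old;
	} while (!atomic_compare_exchange_weak(&pending_list, &old, e));

	/* Activate the work only if it is not already pending. */
	if (!atomic_load(&work_pending)) {
		atomic_store(&work_pending, true);
		queue_work();
	}
}

/* Handler side: clear the flag, drain the list, repeat if the flag was set again. */
static void work_handler(void)
{
	do {
		atomic_store(&work_pending, false);

		/* Detach the whole list in one atomic exchange. */
		struct elem *e = atomic_exchange(&pending_list, NULL);
		while (e) {
			struct elem *next = e->next;
			free(e); /* stand-in for kfree_rcu() */
			freed++;
			e = next;
		}
	} while (atomic_load(&work_pending));
}

/* Single-threaded walk-through: three deferred frees, one handler run. */
static int demo(void)
{
	for (int i = 0; i < 3; i++) {
		struct elem *e = malloc(sizeof(*e));
		assert(e != NULL);
		e->val = i;
		defer_free(e);
	}
	work_handler();
	return freed;
}
```

In this walk-through only the first defer_free() call activates the work; the next two see the flag already set and pay only for the list push. After work_handler() runs, all three elements have been released and the flag is clear again.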
I haven't looked at the thread before I was added to the Cc, so I'm
answering this out of context.
-- Steve