Message-ID: <20171220083121.4aaa8a2a@redhat.com>
Date: Wed, 20 Dec 2017 08:31:21 +0100
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: Rao Shoaib <rao.shoaib@...cle.com>
Cc: linux-kernel@...r.kernel.org, paulmck@...ux.vnet.ibm.com,
linux-mm@...ck.org, brouer@...hat.com
Subject: Re: [PATCH] kfree_rcu() should use the new kfree_bulk() interface
for freeing rcu structures
On Tue, 19 Dec 2017 13:20:43 -0800 Rao Shoaib <rao.shoaib@...cle.com> wrote:
> On 12/19/2017 12:41 PM, Jesper Dangaard Brouer wrote:
> > On Tue, 19 Dec 2017 09:52:27 -0800 rao.shoaib@...cle.com wrote:
> >
> >> +/* Main RCU function that is called to free RCU structures */
> >> +static void
> >> +__rcu_bulk_free(struct rcu_head *head, rcu_callback_t func, int cpu, bool lazy)
> >> +{
> >> + unsigned long offset;
> >> + void *ptr;
> >> + struct rcu_bulk_free *rbf;
> >> + struct rcu_bulk_free_container *rbfc = NULL;
> >> +
> >> + rbf = this_cpu_ptr(&cpu_rbf);
> >> +
> >> + if (unlikely(!rbf->rbf_init)) {
> >> + spin_lock_init(&rbf->rbf_lock);
> >> + rbf->rbf_cpu = smp_processor_id();
> >> + rbf->rbf_init = true;
> >> + }
> >> +
> >> +	/* hold lock to protect against other CPUs */
> >> + spin_lock_bh(&rbf->rbf_lock);
> >
> > I'm not sure this will be faster. Having to take a cross-CPU lock here
> > (+ BH-disable) could cause scaling issues. Hopefully this lock will
> > not be used intensively by other CPUs, right?
> >
[...]
>
> As Paul has pointed out, the lock is a per-CPU lock. The only reason
> for another CPU to access this lock is if the RCU callbacks run on a
> different CPU; there is nothing the code can do to avoid that, but it
> should be rare anyway.
(looping in Paul's comment)
On Tue, 19 Dec 2017 12:56:29 -0800
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> Isn't this lock in a per-CPU object? It -might- go cross-CPU in response
> to CPU-hotplug operations, but that should be rare.
Point taken. If this lock is very unlikely to be taken on another CPU,
then I withdraw my performance concerns (the cacheline can hopefully
stay in Modified(M) state on this CPU, and all other CPUs will have it
in Invalid(I) state under the MESI cache coherence protocol[1]).
The lock's atomic operation still has some overhead; _later_ we could
get fancy and use a seqlock (include/linux/seqlock.h) to remove that.
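To be clear, a seqlock only removes the atomics on the read side; the
write side still serializes on an embedded spinlock, so this would only
pay off if the hot path could be made read-mostly. A minimal sketch of
the seqlock API, again with made-up names:

#include <linux/seqlock.h>

static DEFINE_SEQLOCK(my_seqlock);
static unsigned long my_state;

/* Writer: still takes the seqlock's embedded spinlock. */
static void my_update(unsigned long val)
{
	write_seqlock(&my_seqlock);
	my_state = val;
	write_sequnlock(&my_seqlock);
}

/* Reader: no atomics; just retry if a writer raced with us. */
static unsigned long my_read(void)
{
	unsigned int seq;
	unsigned long val;

	do {
		seq = read_seqbegin(&my_seqlock);
		val = my_state;
	} while (read_seqretry(&my_seqlock, seq));

	return val;
}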
[1] https://en.wikipedia.org/wiki/MESI_protocol
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer