Message-ID: <20110524192451.GH2266@linux.vnet.ibm.com>
Date: Tue, 24 May 2011 12:24:51 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: David Miller <davem@...emloft.net>, netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH] net: use synchronize_rcu_expedited()

On Tue, May 24, 2011 at 05:52:44PM +0200, Eric Dumazet wrote:
> On Tuesday, 24 May 2011 at 08:44 -0700, Paul E. McKenney wrote:
> > On Tue, May 24, 2011 at 11:07:32AM +0200, Eric Dumazet wrote:
> > > synchronize_rcu() is very slow in various situations (HZ=100,
> > > CONFIG_NO_HZ=y, CONFIG_PREEMPT=n)
> > >
> > > Extract from my (mostly idle) 8-core machine:
> > >
> > > synchronize_rcu() in 99985 us
> > > synchronize_rcu() in 79982 us
> > > synchronize_rcu() in 87612 us
> > > synchronize_rcu() in 79827 us
> > > synchronize_rcu() in 109860 us
> > > synchronize_rcu() in 98039 us
> > > synchronize_rcu() in 89841 us
> > > synchronize_rcu() in 79842 us
> > > synchronize_rcu() in 80151 us
> > > synchronize_rcu() in 119833 us
> > > synchronize_rcu() in 99858 us
> > > synchronize_rcu() in 73999 us
> > > synchronize_rcu() in 79855 us
> > > synchronize_rcu() in 79853 us
> > >
> > >
> > > When we hold the RTNL mutex, we would rather spend some CPU cycles than
> > > block other processes waiting on this mutex for too long.
> > >
> > > We also want to set up and dismantle network features as fast as possible
> > > at boot/shutdown time.
> > >
> > > This patch makes synchronize_net() call the expedited version if RTNL is
> > > locked.
> > >
> > > The typical synchronize_rcu_expedited() delay is about 20 us on my machine.
> > >
> > > synchronize_rcu_expedited() in 18 us
> > > synchronize_rcu_expedited() in 18 us
> > > synchronize_rcu_expedited() in 18 us
> > > synchronize_rcu_expedited() in 18 us
> > > synchronize_rcu_expedited() in 20 us
> > > synchronize_rcu_expedited() in 16 us
> > > synchronize_rcu_expedited() in 20 us
> > > synchronize_rcu_expedited() in 18 us
> > > synchronize_rcu_expedited() in 18 us
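The decision described in the patch text above can be sketched in plain C. This is a userspace model only, not the kernel change itself: `pick_grace_period()`, `enum grace_period_kind` and the `rtnl_locked` parameter are hypothetical names introduced for illustration, standing in for the RTNL state check and the two RCU primitives named in the thread.

```c
#include <stdbool.h>

/* Which grace-period primitive the model would call. */
enum grace_period_kind {
	GP_NORMAL,	/* stands in for synchronize_rcu() */
	GP_EXPEDITED	/* stands in for synchronize_rcu_expedited() */
};

/*
 * Hypothetical model of the choice made by synchronize_net() in the
 * patch description: while RTNL is held, spend extra CPU cycles on the
 * expedited grace period so that waiters on the mutex are not blocked
 * for long; otherwise use the normal, cheaper grace period.
 */
enum grace_period_kind pick_grace_period(bool rtnl_locked)
{
	return rtnl_locked ? GP_EXPEDITED : GP_NORMAL;
}
```

With the timings quoted above, the expedited branch would cost roughly 20 us per call instead of roughly 80 to 120 ms.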
> >
> > Cool!!!
> >
> > Just out of curiosity, how many CPUs does your system have?
>
> 16 (2x4x2) [ processor.max_cstate=1 ]
>
> I am now trying to optimize rcu_barrier(); do you have an idea for getting
> an expedited version of it as well?
>
> In the following trace we can see 3 groups, spaced one jiffy apart
> (HZ=100, so 10 ms).
>
> Maybe we can avoid sending a call_rcu() to a CPU that has no pending RCU
> work?

Might make sense, though most of the gains would need to come from
kicking the grace-period machinery hard in order to make it go faster.
Interesting -- I will give this some thought.

Thanx, Paul

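The suggestion quoted above (skip CPUs that have no pending RCU work) can be illustrated with a small userspace model. It is a sketch under stated assumptions, not kernel code: `barrier_targets()` and the `pending[]` array of per-CPU callback counts are hypothetical and do not correspond to real kernel interfaces.

```c
#include <stddef.h>

/*
 * Hypothetical userspace model, not kernel code: pending[i] is the
 * number of RCU callbacks queued on CPU i. The indices of the CPUs that
 * would still need a barrier callback are written to targets[] (which
 * must have room for ncpus entries); the return value is their count.
 */
size_t barrier_targets(const unsigned int *pending, size_t ncpus,
		       size_t *targets)
{
	size_t n = 0;

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		if (pending[cpu] > 0)	/* CPUs with nothing queued are skipped */
			targets[n++] = cpu;
	}
	return n;
}
```

As the reply notes, skipping CPUs this way would only trim part of the delay; most of the gain would have to come from making the grace period itself complete faster.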
> [ 835.189996] cpu0 synchronize_rcu_expedited() in 30 us
> -> begin rcu_barrier() immediately
> [ 835.259702] cpu15 rcu_barrier_callback()
> [ 835.259705] cpu14 rcu_barrier_callback()
> [ 835.259708] cpu7 rcu_barrier_callback()
> [ 835.259711] cpu12 rcu_barrier_callback()
> [ 835.259714] cpu8 rcu_barrier_callback()
> [ 835.259716] cpu1 rcu_barrier_callback()
> [ 835.259719] cpu0 rcu_barrier_callback()
>
> [ 835.269691] cpu13 rcu_barrier_callback()
> [ 835.269695] cpu11 rcu_barrier_callback()
> [ 835.269698] cpu5 rcu_barrier_callback()
> [ 835.269700] cpu6 rcu_barrier_callback()
> [ 835.269702] cpu10 rcu_barrier_callback()
> [ 835.269705] cpu3 rcu_barrier_callback()
> [ 835.269707] cpu2 rcu_barrier_callback()
>
> [ 835.279687] cpu4 rcu_barrier_callback()
> [ 835.279689] cpu9 rcu_barrier_callback()
> [ 835.279744] cpu0 rcu_barrier() in 89499 us
>
> Thanks
>
>
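The "3 groups, one jiffy apart" reading of the trace can be checked mechanically. The helper below is a minimal sketch: `count_groups()` and its `gap_us` threshold are hypothetical names chosen for illustration, and the timestamps are assumed to be sorted and expressed in microseconds.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count bursts in a sorted list of timestamps (microseconds). A new
 * group starts whenever two consecutive timestamps are more than
 * gap_us apart. Returns 0 for an empty list.
 */
size_t count_groups(const uint64_t *ts_us, size_t n, uint64_t gap_us)
{
	size_t groups;

	if (n == 0)
		return 0;

	groups = 1;
	for (size_t i = 1; i < n; i++) {
		if (ts_us[i] - ts_us[i - 1] > gap_us)
			groups++;
	}
	return groups;
}
```

Fed the sixteen rcu_barrier_callback() timestamps from the trace with a 1000 us threshold, it reports three groups; the gaps between groups are just under 10000 us, which matches one jiffy at HZ=100.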