netdev - Re: [PATCH] net: use synchronize_rcu


Message-ID: <1306252364.3026.63.camel@edumazet-laptop>
Date:	Tue, 24 May 2011 17:52:44 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	paulmck@...ux.vnet.ibm.com
Cc:	David Miller <davem@...emloft.net>, netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH] net: use synchronize_rcu_expedited()

On Tuesday, 24 May 2011 at 08:44 -0700, Paul E. McKenney wrote:
> On Tue, May 24, 2011 at 11:07:32AM +0200, Eric Dumazet wrote:
> > synchronize_rcu() is very slow in various situations (HZ=100,
> > CONFIG_NO_HZ=y, CONFIG_PREEMPT=n)
> > 
> > Extract from my (mostly idle) 8 core machine :
> > 
> >  synchronize_rcu() in 99985 us
> >  synchronize_rcu() in 79982 us
> >  synchronize_rcu() in 87612 us
> >  synchronize_rcu() in 79827 us
> >  synchronize_rcu() in 109860 us
> >  synchronize_rcu() in 98039 us
> >  synchronize_rcu() in 89841 us
> >  synchronize_rcu() in 79842 us
> >  synchronize_rcu() in 80151 us
> >  synchronize_rcu() in 119833 us
> >  synchronize_rcu() in 99858 us
> >  synchronize_rcu() in 73999 us
> >  synchronize_rcu() in 79855 us
> >  synchronize_rcu() in 79853 us
> > 
> > 
> > When we hold the RTNL mutex, we are willing to spend some CPU cycles,
> > but we do not want to block other processes waiting for this mutex for
> > too long.
> > 
> > We also want to set up and dismantle network features as fast as
> > possible at boot/shutdown time.
> > 
> > This patch makes synchronize_net() call the expedited version if RTNL is
> > locked.
> > 
> > synchronize_rcu_expedited() typical delay is about 20 us on my machine.
> > 
> >  synchronize_rcu_expedited() in 18 us
> >  synchronize_rcu_expedited() in 18 us
> >  synchronize_rcu_expedited() in 18 us
> >  synchronize_rcu_expedited() in 18 us
> >  synchronize_rcu_expedited() in 20 us
> >  synchronize_rcu_expedited() in 16 us
> >  synchronize_rcu_expedited() in 20 us
> >  synchronize_rcu_expedited() in 18 us
> >  synchronize_rcu_expedited() in 18 us
> 
> Cool!!!
> 
> Just out of curiosity, how many CPUs does your system have?

16 (2x4x2)  [ processor.max_cstate=1 ]

I am now trying to optimize rcu_barrier(). Do you have an idea for
getting an expedited version of it as well?

In the following trace we can see three groups of callbacks, spaced one
jiffy apart (10 ms at HZ=100).

Maybe we could avoid queueing a call_rcu() on a CPU that has no pending
RCU work?
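
To make the idea concrete, here is a minimal user-space sketch of the
selection step only. It is not kernel code and not a proposed patch:
struct cpu_model, pending_cbs and barrier_targets() are hypothetical
names invented for illustration, and the model assumes a CPU with no
queued callbacks has nothing the barrier needs to wait for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU model: number of RCU callbacks already queued. */
struct cpu_model {
	unsigned int pending_cbs;
};

/*
 * Mark the CPUs that would need a barrier callback: only those that
 * already have callbacks pending. Returns how many CPUs were marked,
 * i.e. how many callbacks the barrier would have to wait for.
 */
static size_t barrier_targets(const struct cpu_model *cpus, size_t n,
			      bool *target)
{
	size_t count = 0;

	assert(cpus != NULL && target != NULL);

	for (size_t i = 0; i < n; i++) {
		target[i] = cpus[i].pending_cbs > 0;
		if (target[i])
			count++;
	}
	return count;
}
```

In this model, a 16-CPU machine where only two CPUs have callbacks
queued would get two barrier callbacks instead of sixteen. Whether
skipping idle CPUs is actually safe in the kernel depends on details
this sketch does not cover.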

[  835.189996] cpu0 synchronize_rcu_expedited() in 30 us 
   -> begin rcu_barrier() immediately
[  835.259702] cpu15 rcu_barrier_callback()
[  835.259705] cpu14 rcu_barrier_callback()
[  835.259708] cpu7 rcu_barrier_callback()
[  835.259711] cpu12 rcu_barrier_callback()
[  835.259714] cpu8 rcu_barrier_callback()
[  835.259716] cpu1 rcu_barrier_callback()
[  835.259719] cpu0 rcu_barrier_callback()

[  835.269691] cpu13 rcu_barrier_callback()
[  835.269695] cpu11 rcu_barrier_callback()
[  835.269698] cpu5 rcu_barrier_callback()
[  835.269700] cpu6 rcu_barrier_callback()
[  835.269702] cpu10 rcu_barrier_callback()
[  835.269705] cpu3 rcu_barrier_callback()
[  835.269707] cpu2 rcu_barrier_callback()

[  835.279687] cpu4 rcu_barrier_callback()
[  835.279689] cpu9 rcu_barrier_callback()
[  835.279744] cpu0 rcu_barrier() in 89499 us
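
The grouping can be checked with a little arithmetic. The sketch below
is a stand-alone C fragment, not kernel code: jiffy_us() and
count_groups() are hypothetical helpers, and trace_us holds the
rcu_barrier_callback() timestamps from the trace above as microsecond
offsets from second 835. The half-jiffy threshold used to split groups
is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Callback timestamps from the trace, in microseconds after 835 s. */
static const long trace_us[] = {
	259702, 259705, 259708, 259711, 259714, 259716, 259719,
	269691, 269695, 269698, 269700, 269702, 269705, 269707,
	279687, 279689,
};

/* Length of one jiffy in microseconds for a given HZ (ticks per second). */
static long jiffy_us(long hz)
{
	assert(hz > 0);
	return 1000000L / hz;
}

/*
 * Count groups in a sorted list of timestamps: a new group starts
 * whenever the gap to the previous timestamp exceeds gap_us.
 */
static size_t count_groups(const long *ts_us, size_t n, long gap_us)
{
	size_t groups = n ? 1 : 0;

	for (size_t i = 1; i < n; i++)
		if (ts_us[i] - ts_us[i - 1] > gap_us)
			groups++;
	return groups;
}
```

With HZ=100, jiffy_us(100) is 10000 us, and calling
count_groups(trace_us, 16, jiffy_us(100) / 2) splits the 16 callbacks
into 3 groups. The gaps between groups are 9972 us and 9980 us, so each
group appears to complete on a separate timer tick.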

Thanks

