Message-ID: <7B76F9D75FD26D716624004B@nimrod.local>
Date: Sun, 08 May 2011 13:18:55 +0100
From: Alex Bligh <alex@...x.org.uk>
To: Alex Bligh <alex@...x.org.uk>,
Eric Dumazet <eric.dumazet@...il.com>
cc: netdev@...r.kernel.org,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Alex Bligh <alex@...x.org.uk>
Subject: Re: Scalability of interface creation and deletion
--On 8 May 2011 10:35:02 +0100 Alex Bligh <alex@...x.org.uk> wrote:
> I suspect this may just mean an rcu reader holds the rcu_read_lock
> for a jiffies related time. Though I'm having difficulty seeing
> what that might be on a system where the net is in essence idle.
Having read the RCU docs, I see this can't be right: blocking is not
legal inside an rcu_read_lock() critical section.
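To make sure I am reading that right, this is the sort of read-side
critical section I mean (a minimal sketch with hypothetical names, not
code from my setup): nothing between rcu_read_lock() and
rcu_read_unlock() may sleep, so a reader cannot be what holds things
up for a jiffies-scale time.

  #include <linux/rcupdate.h>

  /* Hypothetical example of data protected by RCU. */
  struct foo {
          int val;
  };

  static struct foo __rcu *global_foo;

  static int read_foo_val(void)
  {
          struct foo *p;
          int val = 0;

          rcu_read_lock();                  /* begin read-side critical section */
          p = rcu_dereference(global_foo);  /* fetch the RCU-protected pointer */
          if (p)
                  val = p->val;             /* no blocking allowed in here */
          rcu_read_unlock();                /* end read-side critical section */

          return val;
  }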
The system concerned is an 8 cpu system but I get comparable
results on a 2 cpu system.
I am guessing that when synchronize_sched() is called, all cores but
the CPU on which it is executing are idle (at least on the vast
majority of calls), as the machine itself is idle.
As I understand it, RCU synchronization (in the absence of lots of
callbacks etc.) is meant to wait until it knows that all RCU read-side
critical sections which were running on entry have been exited. It
exploits the fact that read-side critical sections cannot block by
waiting either for a context switch on each CPU, or for that CPU to be
in the idle state or running user code (both of which are also
incompatible with a read-side critical section).
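For concreteness, here is roughly the writer-side pattern that the
grace period exists to make safe (a hedged sketch reusing the made-up
names from the snippet above; I use synchronize_sched() only because
that is the call whose latency I am measuring):

  #include <linux/rcupdate.h>
  #include <linux/slab.h>
  #include <linux/spinlock.h>

  static DEFINE_SPINLOCK(foo_lock);

  static void update_foo(int new_val)
  {
          struct foo *new_p, *old_p;

          new_p = kmalloc(sizeof(*new_p), GFP_KERNEL);
          if (!new_p)
                  return;
          new_p->val = new_val;

          spin_lock(&foo_lock);
          old_p = rcu_dereference_protected(global_foo,
                                            lockdep_is_held(&foo_lock));
          rcu_assign_pointer(global_foo, new_p);  /* publish the new version */
          spin_unlock(&foo_lock);

          /* Wait for all pre-existing readers (and, for
           * synchronize_sched(), all preempt-disabled regions) to
           * finish before freeing the old version. */
          synchronize_sched();
          kfree(old_p);
  }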
The fact that increasing HZ masks the problem seems to imply that
synchronize_sched() is waiting when it shouldn't be, as it suggests it
is waiting for a context switch. But surely it shouldn't be waiting
for a context switch if all the other CPU cores are idle? It knows
that it (the caller) doesn't hold an rcu_read_lock, and presumably it
can see that the other CPUs are in the idle state, in which case
surely it should return immediately? The distribution of latency in
synchronize_sched() looks like this:
20-49 us 110 instances (27.500%)
50-99 us 45 instances (11.250%)
5000-9999 us 5 instances (1.250%)
10000-19999 us 33 instances (8.250%)
20000-49999 us 4 instances (1.000%)
50000-99999 us 191 instances (47.750%)
100000-199999 us 12 instances (3.000%)
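(For reference, the numbers above come from timing individual calls;
the sketch below shows one way to do that with ktime, and is
illustrative rather than the exact instrumentation I used.)

  #include <linux/ktime.h>
  #include <linux/printk.h>
  #include <linux/rcupdate.h>

  /* Illustrative helper: time one synchronize_sched() call in
   * microseconds and log it; a histogram like the one above can be
   * built from values logged this way. */
  static void timed_synchronize_sched(void)
  {
          ktime_t start = ktime_get();
          s64 delta_us;

          synchronize_sched();

          delta_us = ktime_us_delta(ktime_get(), start);
          pr_info("synchronize_sched() took %lld us\n", delta_us);
  }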
--
Alex Bligh