Message-ID: <20090225170825.GC6797@linux.vnet.ibm.com>
Date: Wed, 25 Feb 2009 09:08:25 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: linux-kernel@...r.kernel.org, vegard.nossum@...il.com,
stable@...nel.org, akpm@...ux-foundation.org, npiggin@...e.de,
penberg@...helsinki.fi
Subject: Re: [PATCH] v4 Teach RCU that idle task is not quiscent state at
boot
On Wed, Feb 25, 2009 at 05:00:24PM +0100, Ingo Molnar wrote:
>
> * Paul E. McKenney <paulmck@...ux.vnet.ibm.com> wrote:
>
> > +/* Internal to kernel, but needed by rcupreempt.h. */
> > +extern int rcu_idle_cpu_truthful;
>
> The name sucks a bit ;-) 'truthful' is an emotionally laden
> statement and distracts from the technical purpose when reading
> it ;)
>
> Same for:
>
> > +extern void rcu_idle_now_means_idle(void);

I must confess that I was in fact a bit annoyed to learn that idle_cpu()
was telling me that a decidedly active CPU was idle. And if you think -these-
names are emotionally laden, you should have seen my first choices. ;-)

Nevertheless, point well taken. How about the following names instead?

	extern int rcu_scheduler_active;
	extern void rcu_scheduler_starting(void);

> Also, i'm wondering, is there really no way to avoid this quirk.
> We almost got away without it for a long time.

A bit scary, isn't it? ;-)

Looks to me that some RCU usage finally found its way into the early
boot code path -- until that happened, no one needed to care what RCU
mistakenly thought was happening with grace periods at early boot.

I list some alternatives below.

> This one:
>
> >  void rcu_check_callbacks(int cpu, int user)
> >  {
> >          if (user ||
> > -            (idle_cpu(cpu) && !in_softirq() &&
> > -             hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
> > +            (idle_cpu(cpu) && rcu_idle_cpu_truthful &&
> > +             !in_softirq() && hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
>
> Is a hotpath called very often ...

The "if" statement is indeed on the hotpath, but the additional check
of rcu_idle_cpu_truthful is reached only if we didn't interrupt from
user-mode execution and if the CPU is idle. So the only time anything
is delayed by this extra check is when we take an interrupt from idle
state, and then that interrupt handler is itself interrupted by the
scheduling-clock interrupt.

And of course, in the CONFIG_NO_HZ case, this code path is normally
disabled entirely for an idle CPU.

This therefore will not result in significant overhead, despite being
in a hotpath.
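
The short-circuit argument above can be illustrated with a small user-space sketch. This is not the kernel code: the `*_stub` variables, `read_flag()`, `flag_reads`, and `in_quiescent_state()` are hypothetical stand-ins introduced only to show when the flag is actually consulted, and the hardirq test is simplified to a nesting depth.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state so the sketch builds in user space. */
static bool cpu_idle_stub;           /* what idle_cpu(cpu) would report */
static bool in_softirq_stub;         /* what in_softirq() would report */
static unsigned hardirq_depth_stub;  /* simplified hardirq nesting depth */
static int rcu_idle_cpu_truthful;    /* the flag added by the patch */
static int flag_reads;               /* counts how often the flag is read */

/* Wrap the flag read so the sketch can count how often it happens. */
static bool read_flag(void)
{
	flag_reads++;
	return rcu_idle_cpu_truthful != 0;
}

/*
 * Same shape as the patched condition: because || and && short-circuit,
 * read_flag() runs only when user is false and the CPU reports idle.
 */
static bool in_quiescent_state(bool user)
{
	return user ||
	       (cpu_idle_stub && read_flag() &&
		!in_softirq_stub && hardirq_depth_stub <= 1);
}
```

With `user` true, or with `cpu_idle_stub` false, `flag_reads` stays unchanged; it increments only on the idle, non-user path.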

OK, alternatives...

o	Reverse the roles of the idle and init threads during startup,
	so that there is initially no idle thread.

	However, there appears to be a fair amount of code that assumes
	that there is always an idle thread.

o	As above, but create both the init and idle threads early so
	that there always is an idle thread that happens not to be
	running during boot.

	This would work, but seems to me to be uglier than the flag.

o	Stop using idle_cpu() in rcu_check_callbacks(), instead keeping
	a per-CPU "cpu_is_idle" variable that is set upon entry to the
	various idle() loops and cleared upon exit. It would be OK to
	take interrupts while "cpu_is_idle" is set.

	The disadvantage here is that there are quite a few idle loops,
	and it would be necessary to change them all. Missing one or
	two could result in indefinite grace periods on the affected
	systems.

o	Drop idle as a quiescent state, as is already the case for
	rcupreempt.

	This would result in indefinite grace-period delays for
	rcuclassic, but would actually work for rcutree. Except that
	it would cause rcutree to IPI each and every idle CPU for
	every grace period if !CONFIG_NO_HZ. I expect that this
	overhead would far exceed that of the extra flag check in
	rcu_check_callbacks().

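
For comparison, the third alternative above (a per-CPU "cpu_is_idle" variable) might look roughly like the following user-space sketch. Everything here is an assumption made for illustration: a plain array stands in for real per-CPU storage, and `NR_CPUS_SKETCH`, `idle_enter()`, `idle_exit()`, and `rcu_cpu_quiescent()` are hypothetical names, not existing kernel interfaces. The softirq and hardirq-nesting checks are omitted for brevity.

```c
#include <stdbool.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the sketch */

/* Stand-in for a per-CPU variable; starts false, so boot is never "idle". */
static bool cpu_is_idle[NR_CPUS_SKETCH];

/* Every idle loop would need to call this on entry... */
static void idle_enter(int cpu)
{
	cpu_is_idle[cpu] = true;
}

/* ...and this on exit. Interrupts taken in between leave the flag set. */
static void idle_exit(int cpu)
{
	cpu_is_idle[cpu] = false;
}

/* Quiescent-state test that no longer depends on which task is running. */
static bool rcu_cpu_quiescent(int cpu, bool user)
{
	return user || cpu_is_idle[cpu];
}
```

The sketch also shows the drawback named above: an idle loop that never calls `idle_enter()` leaves its CPU looking busy indefinitely.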
So I still like the flag check. Any alternatives that I am missing?
Thanx, Paul