[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130709132359.GF16780@linux.vnet.ibm.com>
Date: Tue, 9 Jul 2013 06:23:59 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
dipankar@...ibm.com, akpm@...ux-foundation.org,
mathieu.desnoyers@...ymtl.ca, josh@...htriplett.org,
niv@...ibm.com, tglx@...utronix.de, rostedt@...dmis.org,
dhowells@...hat.com, edumazet@...gle.com, darren@...art.com,
fweisbec@...il.com, sbw@....edu
Subject: Re: [PATCH RFC nohz_full 2/7] nohz_full: Add rcu_dyntick data for
scalable detection of all-idle state
On Tue, Jul 09, 2013 at 11:37:28AM +0200, Peter Zijlstra wrote:
> On Mon, Jul 08, 2013 at 06:30:01PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> >
> > This commit adds fields to the rcu_dyntick structure that are used to
> > detect idle CPUs. These new fields differ from the existing ones in
> > that the existing ones consider a CPU executing in user mode to be idle,
> > where the new ones consider CPUs executing in user mode to be busy.
> > The handling of these new fields is otherwise quite similar to that for
> > the exiting fields. This commit also adds the initialization required
> > for these fields.
> >
> > So, why is usermode execution treated differently, with RCU considering
> > it a quiescent state equivalent to idle, while in contrast the new
> > full-system idle state detection considers usermode execution to be
> > non-idle?
> >
> > It turns out that although one of RCU's quiescent states is usermode
> > execution, it is not a full-system idle state. This is because the
> > purpose of the full-system idle state is not RCU, but rather determining
> > when accurate timekeeping can safely be disabled. Whenever accurate
> > timekeeping is required in a CONFIG_NO_HZ_FULL kernel, at least one
> > CPU must keep the scheduling-clock tick going. If even one CPU is
> > executing in user mode, accurate timekeeping is requires, particularly for
> > architectures where gettimeofday() and friends do not enter the kernel.
> > Only when all CPUs are really and truly idle can accurate timekeeping be
> > disabled, allowing all CPUs to turn off the scheduling clock interrupt,
> > thus greatly improving energy efficiency.
> >
> > This naturally raises the question "Why is this code in RCU rather than in
> > timekeeping?", and the answer is that RCU has the data and infrastructure
> > to efficiently make this determination.
>
> but but but but... why doesn't the regular nohz code qualify? I'd think
> that too would be tracking pretty much the same things, no?
The regular nohz code is identifying which CPUs are idle, but is doing
so on a CPU-by-CPU basis. Before turning off system-wide timekeeping,
we need to know that -all- of the CPUs are idle. The regular nohz code
does not do this.
Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists