Message-ID: <20090507195142.GH6693@linux.vnet.ibm.com>
Date: Thu, 7 May 2009 12:51:42 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Christoph Lameter <cl@...ux.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Alok Kataria <akataria@...are.com>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
the arch/x86 maintainers <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
"alan@...rguk.ukuu.org.uk" <alan@...rguk.ukuu.org.uk>,
anton@...ba.org
Subject: Re: [PATCH] x86: Reduce the default HZ value
On Thu, May 07, 2009 at 01:51:58PM -0400, Christoph Lameter wrote:
> On Thu, 7 May 2009, Paul E. McKenney wrote:
>
> > On Thu, May 07, 2009 at 01:20:29PM -0400, Christoph Lameter wrote:
> > > On Thu, 7 May 2009, Peter Zijlstra wrote:
> > >
> > > > Another user is RCU, the grace period is tick driven, growing these
> > > > ticks by a factor 50 or so might require some tinkering with forced
> > > > grace periods when we notice our batch queues getting too long.
> > >
> > > One could also schedule RCU via hrtimers with a large fuzz period?
> >
> > You could, but then you would still have a periodic interrupt introducing
> > jitter into your HPC workload. The approach I suggested allows RCU to be
> > happy with no periodic interrupts on any CPU that has only one runnable
> > task that is a CPU-bound user-level task (in addition to the idle task,
> > of course).
>
> Sounds good.
>
> An HPC workload typically has minimal kernel interaction. RCU would
> only need to run once and then the system would be quiet.
Peter Z's post leads me to believe that there might be dragons in
this approach that I am blissfully unaware of. However, here is what
would have to happen from an RCU perspective, in case it helps:
o This new mode needs to imply CONFIG_NO_HZ.
o When a given CPU is transitioning into tickless mode, invoke
rcu_enter_nohz(). This already happens for dynticks-idle; the
new case would be dynticks-CPU-bound-usermode-task.
Note that CONFIG_NO_HZ kernels already invoke rcu_enter_nohz()
from tick_nohz_stop_sched_tick(), and many of the things in
tick_nohz_stop_sched_tick() would need to be done in this case
as well.
o When a given CPU is transitioning out of tickless mode, invoke
rcu_exit_nohz(). Again, this already happens for dynticks-idle.
Note that CONFIG_NO_HZ kernels already invoke rcu_exit_nohz()
from tick_nohz_restart_sched_tick(), which does other stuff that
would be required in your case as well.
o When a given CPU in tickless mode transitions into the kernel
via a system call or trap, invoke rcu_irq_enter(). Note that
rcu_irq_enter() is already invoked on irq entry if CONFIG_NO_HZ.
NMIs are also already handled via rcu_nmi_enter().
o When a given CPU in tickless mode transitions out of the kernel
from a system call or trap, invoke rcu_irq_exit(). Note that
rcu_irq_exit() is already invoked on irq exit if CONFIG_NO_HZ.
NMIs are also already handled via rcu_nmi_exit().
Then RCU would know that any CPU running a CPU-bound user-mode task
need not be consulted when working out when a grace period ends, since
user-mode code cannot contain kernel-mode RCU read-side critical sections.
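The bookkeeping implied by the four hooks above can be sketched as a toy user-space model. This is only an illustration, not the kernel's implementation: struct toy_cpu, the toy_*() functions, and NR_CPUS are hypothetical names that merely mirror rcu_enter_nohz(), rcu_exit_nohz(), rcu_irq_enter(), and rcu_irq_exit(). The model assumes that a CPU can be skipped by grace-period detection exactly when it is in tickless mode and is not currently inside the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for the toy model */

/* Toy per-CPU state; hypothetical, not the kernel's data structure. */
struct toy_cpu {
	bool tickless;	/* CPU is in tickless (idle or user-bound) mode */
	int nesting;	/* kernel entries (syscall/trap/irq) not yet exited */
};

static struct toy_cpu cpus[NR_CPUS];

/* Mirrors rcu_enter_nohz(): the CPU is transitioning into tickless mode. */
static void toy_enter_nohz(int cpu)
{
	cpus[cpu].tickless = true;
}

/* Mirrors rcu_exit_nohz(): the CPU is transitioning out of tickless mode. */
static void toy_exit_nohz(int cpu)
{
	cpus[cpu].tickless = false;
}

/* Mirrors rcu_irq_enter(): entry into the kernel via syscall, trap, or irq. */
static void toy_irq_enter(int cpu)
{
	cpus[cpu].nesting++;
}

/* Mirrors rcu_irq_exit(): return from the kernel. */
static void toy_irq_exit(int cpu)
{
	assert(cpus[cpu].nesting > 0);
	cpus[cpu].nesting--;
}

/* A tickless CPU outside the kernel cannot be in a read-side critical section. */
static bool toy_cpu_is_quiescent(int cpu)
{
	return cpus[cpu].tickless && cpus[cpu].nesting == 0;
}

/* Number of CPUs that would still have to be consulted for a grace period. */
static int toy_cpus_to_consult(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (!toy_cpu_is_quiescent(cpu))
			n++;
	return n;
}
```

In this model, a CPU that has called toy_enter_nohz() drops out of the count until it calls toy_irq_enter() for a system call, and rejoins the skipped set once the matching toy_irq_exit() runs.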
Thanx, Paul