Message-ID: <20130920164201.GB30381@localhost.localdomain>
Date: Fri, 20 Sep 2013 11:42:03 -0500
From: Frederic Weisbecker <fweisbec@...il.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Christoph Lameter <cl@...ux.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Gilad Ben-Yossef <gilad@...yossef.com>,
Tejun Heo <tj@...nel.org>, John Stultz <johnstul@...ibm.com>,
Mike Frysinger <vapier@...too.org>,
Minchan Kim <minchan.kim@...il.com>,
Hakan Akkan <hakanakkan@...il.com>,
Max Krasnyansky <maxk@...lcomm.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Linux-MM <linux-mm@...ck.org>
Subject: Re: RFC vmstat: On demand vmstat threads
On Fri, Sep 20, 2013 at 12:41:02PM +0200, Thomas Gleixner wrote:
> On Thu, 19 Sep 2013, Christoph Lameter wrote:
> > On Thu, 19 Sep 2013, Thomas Gleixner wrote:
> >
> > > The vmstat accounting is not the only thing which we want to delegate
> > > to dedicated core(s) for the full NOHZ mode.
> > >
> > > So instead of playing broken games with explicitly not exposed core
> > > code variables, we should implement a core code facility which is
> > > aware of the NOHZ details and provides a sane way to delegate stuff to
> > > a certain subset of CPUs.
> >
> > I would be happy to use such a facility. Otherwise I would just be adding
> > yet another kernel option or boot parameter I guess.
>
> Uuurgh, no.
>
> The whole delegation stuff is necessary not just for vmstat. We have
> the same issue for scheduler stats and other parts of the kernel, so
> we are better off in having a core facility to schedule such functions
> in consistency with the current full NOHZ state.
Agreed.
So we have a choice. Either callers in the kernel do this through functions that
enforce the affinity of some asynchronous tasks, like "schedule_on_timekeeper()" or
"schedule_on_housekeepers()", built on workqueues for example.
Or we add an interface to define the affinity of such things from userspace, at the
risk of exposing kernel details like workqueue or timer internal callback names.
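
To make the first option a bit more concrete, here is a rough userspace model of
what such a helper might do. Everything in it is made up for illustration:
cpu_mask_t, struct hk_work, pick_housekeeper() and the mask layout are
hypothetical names, not existing kernel interfaces. In the kernel the last step
would presumably end up in something like schedule_work_on() on the chosen CPU.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8

/* Hypothetical mask type: bit n set means CPU n runs in full NOHZ mode. */
typedef uint32_t cpu_mask_t;

/* Hypothetical work item; target_cpu records where the work was delegated. */
struct hk_work {
	void (*fn)(struct hk_work *work);
	int target_cpu;
};

/*
 * Return the lowest-numbered housekeeping CPU, i.e. the first CPU that is
 * not in nohz_mask, or -1 if every CPU runs in full NOHZ mode.
 */
int pick_housekeeper(cpu_mask_t nohz_mask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(nohz_mask & (1u << cpu)))
			return cpu;
	}
	return -1;
}

/*
 * Delegate work to a housekeeping CPU. This model only records the target;
 * a kernel version would queue the work on that CPU instead.
 * Returns the chosen CPU, or -1 if no housekeeping CPU exists.
 */
int schedule_on_housekeeper(cpu_mask_t nohz_mask, struct hk_work *work)
{
	int cpu;

	assert(work);

	cpu = pick_housekeeper(nohz_mask);
	if (cpu < 0)
		return -1;

	work->target_cpu = cpu;
	return cpu;
}
```

The point of keeping the CPU selection in one helper is that callers like vmstat
never need to look at the NOHZ state themselves.
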
Oh, and maybe this must stay flexible enough to handle dispatched housekeeping in the future.
Like on big NUMA machines that want to dispatch some part of the housekeeping to each
NUMA node, close to the CPUs running there. Although I don't have any details in mind for that.
I've also been thinking of some flag for deferrable timers to make them user deferrable as well.
But I expect too much overhead to maintain that on kernel/user boundaries. And eventually
the issues we have go beyond just user/kernel ring conditions.
Just random thoughts.