Message-ID: <CAFTL4hz2RjpWFJRMXu46ESKHXzQn1EN0bhjNs_svYgV5fFX2rQ@mail.gmail.com>
Date: Wed, 9 Nov 2016 11:07:56 +0000
From: Frederic Weisbecker <fweisbec@...il.com>
To: Chris Metcalf <cmetcalf@...lanox.com>
Cc: Gilad Ben Yossef <giladb@...lanox.com>,
Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Rik van Riel <riel@...hat.com>, Tejun Heo <tj@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Christoph Lameter <cl@...ux.com>,
Viresh Kumar <viresh.kumar@...aro.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
Andy Lutomirski <luto@...capital.net>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Francis Giraldeau <francis.giraldeau@...il.com>,
Andi Kleen <andi@...stfloor.org>,
Arnd Bergmann <arnd@...db.de>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: task isolation discussion at Linux Plumbers
2016-11-05 4:04 GMT+00:00 Chris Metcalf <cmetcalf@...lanox.com>:
> A bunch of people got together this week at the Linux Plumbers
> Conference to discuss nohz_full, task isolation, and related stuff.
> (Thanks to Thomas for getting everyone gathered at one place and time!)
>
> Here are the notes I took; I welcome any corrections and follow-up.
>
Thanks for that report, Chris!
> == rcu_nocbs ==
>
> We started out by discussing this option. It is automatically enabled
> by nohz_full, but we spent a little while side-tracking on the
> implementation of one kthread per rcu flavor per core. The suggestion
> was made (by Peter or Andy; I forget) that each kthread could handle
> all flavors per core by using a dedicated worklist. It certainly
> seems like removing potentially dozens or hundreds of kthreads from
> larger systems will be a win if this works out.
>
> Paul said he would look into this possibility.
Sounds good.
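
To make the shape of that idea concrete, here is a rough userspace-style
C sketch (all of the names and types below are made up for illustration,
this is not actual RCU code): one worker per CPU drains a single worklist
whose entries carry a flavor tag, instead of one kthread per flavor per
CPU.

#include <stddef.h>

enum rcu_flavor { RCU_FLAVOR_SCHED, RCU_FLAVOR_BH, RCU_FLAVOR_PREEMPT };

struct rcu_work_item {
	struct rcu_work_item *next;
	enum rcu_flavor flavor;			/* which flavor queued this */
	void (*func)(struct rcu_work_item *);	/* callback to invoke */
};

struct rcu_cpu_worklist {
	struct rcu_work_item *head;	/* one list per CPU, all flavors */
};

/* One pass of the hypothetical merged per-CPU kthread: drain the single
 * worklist and invoke each callback, regardless of flavor. */
static void rcu_cpu_worker_once(struct rcu_cpu_worklist *wl)
{
	struct rcu_work_item *item = wl->head;

	wl->head = NULL;
	while (item) {
		struct rcu_work_item *next = item->next;

		item->func(item);	/* flavor is now just data, not a thread */
		item = next;
	}
}

The point being that the flavor becomes a field on the work item rather
than a reason to have another thread, which is where the kthread savings
would come from.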
>
>
> == Remote statistics ==
>
> We discussed the possibility of remote statistics gathering, i.e. load
> average etc. The idea would be that we could have housekeeping
> core(s) periodically iterate over the nohz cores to load their rq
> remotely and do the update_curr() work etc. Presumably it should be possible
> for a single housekeeping core to handle doing this for all the
> nohz_full cores, as we only need to do it quite infrequently.
>
> Thomas suggested that this might be the last remaining thing that
> needed to be done to allow disabling the current behavior of falling
> back to a 1 Hz clock in nohz_full.
>
> I believe Thomas said he had a patch to do this already.
>
There are also some other details around update_curr() to take care of,
but that's certainly a big piece of it.
I had hoped we could find a solution that doesn't involve remote
accounting, but at least it could be a first step.
I have let that idea rot for too long; I need to get my hands into
it for good.
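
For illustration only, the remote-accounting idea could look roughly like
this userspace-style sketch (NR_CPUS, the struct and the helper names are
all made up here; the real work would live in the scheduler and loadavg
code): a single housekeeping CPU walks the isolated CPUs at a low rate
and folds their runtime into the global stats for them.

#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 64

struct cpu_stats {
	unsigned long long exec_runtime_ns;	/* runtime accrued on that CPU */
	bool nohz_full;				/* isolated: no local tick */
};

static struct cpu_stats per_cpu_stats[NR_CPUS];

/* Placeholder for the accounting that the residual 1 Hz tick does today
 * (update_curr(), load average folding, ...). */
static void fold_remote_runtime(int cpu, unsigned long long ns)
{
	printf("cpu %d: folding %llu ns of remote runtime\n", cpu, ns);
}

/* Run on the housekeeping CPU at a low rate (say once a second or less):
 * walk the isolated CPUs and do their accounting for them. */
static void housekeeping_fold_remote_stats(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct cpu_stats *s = &per_cpu_stats[cpu];

		if (!s->nohz_full)
			continue;	/* that CPU still runs its own tick */

		fold_remote_runtime(cpu, s->exec_runtime_ns);
	}
}

If a single housekeeping CPU can cover all the nohz_full CPUs this way,
the 1 Hz fallback tick on the isolated CPUs has nothing left to do.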
> == Disabling the dyn tick ==
>
> One issue that the current task isolation patch series encounters is
> when we request disabling the dyntick, but it doesn't happen. At the
> moment we just wait until the tick is properly disabled, by
> busy-waiting in the kernel (calling schedule etc as needed). No one
> is particularly fond of this scheme. The consensus seems to be to try
> harder to figure out what is going on, fix whatever problems exist,
> then consider it a regression going forward if something causes the
> dyntick to become difficult to disable again in the future. I will
> take a look at this and try to gather more data on if and when this is
> happening in 4.9.
>
We could enhance dynticks tracing, for example by expanding the tick
stop failure codes so they report more detail about what is blocking
the tick from being stopped.
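
Something along these lines, purely as an illustration of what "expanded
failure codes" could mean (the enum values and the helper below are made
up for the example; in the kernel this would feed the existing dynticks
tracing rather than stdio):

#include <stdio.h>

/* Illustrative failure codes only; the point is that the tick-stop path
 * would record *which* dependency blocked it, not just that it failed. */
enum tick_stop_reason {
	TICK_STOP_OK = 0,
	TICK_STOP_BLOCKED_POSIX_TIMER,
	TICK_STOP_BLOCKED_PERF_EVENTS,
	TICK_STOP_BLOCKED_SCHED,	/* more than one runnable task */
	TICK_STOP_BLOCKED_CLOCK_UNSTABLE,
	TICK_STOP_BLOCKED_UNKNOWN,
};

/* Hypothetical helper the tick-stop path could call when it gives up. */
static void report_tick_stop_failure(int cpu, enum tick_stop_reason why)
{
	fprintf(stderr, "cpu %d: tick not stopped, reason %d\n", cpu, why);
}

With that, the task-isolation busy-wait (or its replacement) could at
least tell the user what it is actually waiting for.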
> == Missing oneshot_stopped callbacks ==
>
> I raised the issue that various clock_event_device sources don't
> always support oneshot_stopped, which can cause an additional
> final interrupt to occur after the timer infrastructure believes the
> interrupt has been stopped. I have patches to fix this for tile and
> arm64 in my patch series; Thomas volunteered to look at adding
> equivalent support for x86.
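
Just to illustrate what the missing hook means in practice, here is a
cut-down sketch; the struct and the names are stand-ins for the example,
not the real clock_event_device. Without a oneshot-stopped handler the
hardware comparator keeps its last programmed value and can raise one
more interrupt after the core has decided no further event is wanted, so
the per-driver fix amounts to disabling the timer in that hook.

#include <stdio.h>

/* Stand-in for the relevant bits of a clock event device. */
struct fake_clock_event_device {
	int (*set_next_event)(unsigned long delta,
			      struct fake_clock_event_device *evt);
	int (*set_state_oneshot_stopped)(struct fake_clock_event_device *evt);
};

/* The fix: actually mask/disable the timer when the core says "no more
 * events", so no stale interrupt sneaks in afterwards. */
static int fake_timer_oneshot_stopped(struct fake_clock_event_device *evt)
{
	printf("fake timer: disabling compare interrupt\n");
	return 0;
}

static struct fake_clock_event_device fake_timer = {
	.set_state_oneshot_stopped = fake_timer_oneshot_stopped,
	/* .set_next_event etc. omitted for brevity */
};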
>
>
> Many thanks to all those who participated in the discussion.
> Frederic, we wished you had been there!
I wish I had too!