linux-kernel - Re: [QUERY]: Is using CPU hotplug right for isolating CPUs?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Date:	Thu, 13 Feb 2014 15:20:47 +0100
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Viresh Kumar <viresh.kumar@...aro.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Lists linaro-kernel <linaro-kernel@...ts.linaro.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Linaro Networking <linaro-networking@...aro.org>,
	Kevin Hilman <khilman@...aro.org>
Subject: Re: [QUERY]: Is using CPU hotplug right for isolating CPUs?

On Tue, Feb 11, 2014 at 02:22:43PM +0530, Viresh Kumar wrote:
> On 28 January 2014 18:53, Frederic Weisbecker <fweisbec@...il.com> wrote:
> > No, when a single task is running on a full dynticks CPU, the tick is supposed to run
> > every seconds. I'm actually suprised it doesn't happen in your traces, did you tweak
> > something specific?
> 
> Why do we need this 1 second tick currently? And what will happen if I
> hotunplug that
> CPU and get it back? Would the timer for tick move away from CPU in
> question? I see
> that when I have changed this 1sec stuff to 300 seconds. But what
> would be impact
> of that? Will things still work normally?

So the problem resides in the gazillions accounting maintained in scheduler_tick() and
current->sched_class->task_tick().

The scheduler correctness depends on these to be updated regularly. If you deactivate
or increase the delay with very high values, the result is unpredictable. Just expect that
at least some scheduler feature will behave randomly, like load balancing for example or
simply local fairness issues.

So we have that 1 Hz max that makes sure that things are moving forward while keeping
a rate that should be still nice for HPC workloads. But we certainly want to find a
way to remove the need for any tick altogether for extreme real time workloads which
need guarantees rather than just optimizations.

I see two potential solutions for that:

1) Rework the scheduler accounting such that it is safe against full dynticks. That
was the initial plan but it's scary. The scheduler accountings is a huge maze. And I'm not
sure it's actually worth the complication.

2) Offload the accounting. For example we could imagine that the timekeeping could handle the
task_tick() calls on behalf of the full dynticks CPUs. At a small rate like 1 Hz.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/