Message-ID: <20140213142045.GC14383@localhost.localdomain>
Date:	Thu, 13 Feb 2014 15:20:47 +0100
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Viresh Kumar <viresh.kumar@...aro.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Lists linaro-kernel <linaro-kernel@...ts.linaro.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Linaro Networking <linaro-networking@...aro.org>,
	Kevin Hilman <khilman@...aro.org>
Subject: Re: [QUERY]: Is using CPU hotplug right for isolating CPUs?

On Tue, Feb 11, 2014 at 02:22:43PM +0530, Viresh Kumar wrote:
> On 28 January 2014 18:53, Frederic Weisbecker <fweisbec@...il.com> wrote:
> > No, when a single task is running on a full dynticks CPU, the tick is supposed to run
> > every second. I'm actually surprised it doesn't happen in your traces; did you tweak
> > something specific?
> 
> Why do we need this 1 second tick currently? And what will happen if I hot-unplug that
> CPU and bring it back? Would the tick timer move away from the CPU in question? I have
> seen that when I changed this 1 sec value to 300 seconds. But what would be the impact
> of that? Will things still work normally?

So the problem resides in the gazillion pieces of accounting maintained in scheduler_tick() and
current->sched_class->task_tick().

The scheduler's correctness depends on this accounting being updated regularly. If you deactivate
the tick, or stretch its period to very high values, the result is unpredictable. Expect at least
some scheduler features to misbehave, load balancing for example, or simply local fairness.
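
For context, a rough paraphrase of that tick path (simplified for illustration, not the exact
source, and it obviously only builds inside the kernel tree):

/*
 * Simplified paraphrase of the periodic tick path: refresh the per-CPU
 * runqueue state and give the current task's scheduling class its
 * periodic callback. This is the accounting a full dynticks CPU skips.
 */
void scheduler_tick(void)
{
	struct rq *rq = cpu_rq(smp_processor_id());
	struct task_struct *curr = rq->curr;

	raw_spin_lock(&rq->lock);
	update_rq_clock(rq);
	curr->sched_class->task_tick(rq, curr, 0);	/* e.g. task_tick_fair() */
	update_cpu_load_active(rq);			/* load tracking used by the balancer */
	raw_spin_unlock(&rq->lock);

	trigger_load_balance(rq);	/* kick periodic load balancing; args vary by version */
}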

So we have that 1 Hz maximum, which makes sure things keep moving forward while keeping a rate
that should still be fine for HPC workloads. But we certainly want to find a way to remove the
need for any tick altogether for extreme real-time workloads, which need guarantees rather than
just optimizations.
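
For reference, the 1 Hz cap comes from bounding how far the tick may be deferred on a full
dynticks CPU, roughly along these lines (paraphrased from memory; exact helper and field names
may differ between kernel versions):

u64 scheduler_tick_max_deferment(void)
{
	struct rq *rq = this_rq();
	unsigned long next, now = jiffies;

	/* Allow the tick to be deferred at most HZ jiffies (one second)
	 * past the last scheduler tick seen on this runqueue. */
	next = rq->last_sched_tick + HZ;

	if (time_before_eq(next, now))
		return 0;	/* already overdue, don't defer at all */

	return jiffies_to_usecs(next - now) * NSEC_PER_USEC;
}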

I see two potential solutions for that:

1) Rework the scheduler accounting such that it is safe against full dynticks. That was the
initial plan, but it's scary: the scheduler accounting is a huge maze, and I'm not sure it's
actually worth the complication.

2) Offload the accounting. For example, we could imagine that the timekeeping CPU handles the
task_tick() calls on behalf of the full dynticks CPUs, at a small rate like 1 Hz.
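
To illustrate 2), a purely hypothetical sketch: a 1 Hz delayed work pinned to the timekeeping
CPU walks the nohz_full CPUs and performs their residual accounting remotely. Nothing like this
exists today; sched_tick_remote() is a made-up helper, and tick_do_timer_cpu just stands in for
"the timekeeping CPU".

static struct delayed_work sched_tick_offload_work;

static void sched_tick_offload_fn(struct work_struct *work)
{
	int cpu;

	/* Do the residual scheduler accounting on behalf of each full
	 * dynticks CPU. The hypothetical sched_tick_remote() would take
	 * the remote runqueue lock and run the equivalent of task_tick()
	 * for that CPU's current task. */
	for_each_cpu(cpu, tick_nohz_full_mask)
		sched_tick_remote(cpu);

	/* Re-arm at roughly 1 Hz on the timekeeping CPU. */
	schedule_delayed_work_on(tick_do_timer_cpu,
				 &sched_tick_offload_work, HZ);
}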
