lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 7 Dec 2017 11:47:07 +0000
From:   James Hogan <james.hogan@...s.com>
To:     Daniel Lezcano <daniel.lezcano@...aro.org>
CC:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        <linux-pm@...r.kernel.org>,
        Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        Ralf Baechle <ralf@...ux-mips.org>,
        Paul Burton <paul.burton@...s.com>,
        <linux-kernel@...r.kernel.org>, <linux-mips@...ux-mips.org>
Subject: Re: [RFC PATCH] cpuidle/coupled: Handle broadcast enter failures

On Thu, Dec 07, 2017 at 12:17:25PM +0100, Daniel Lezcano wrote:
> On 05/12/2017 23:55, James Hogan wrote:
> > From: James Hogan <jhogan@...nel.org>
> > 
> > If the hrtimer based broadcast tick device is in use, the enabling of
> > broadcast ticks by cpuidle may fail when the next broadcast event is
> > brought forward to match the next event due on the local tick device,
> > This is because setting the next event may migrate the hrtimer based
> > broadcast tick to the current CPU, which then makes
> > broadcast_needs_cpu() fail.
> > 
> > This isn't normally a problem as cpuidle handles it by falling back to
> > the deepest idle state not needing broadcast ticks, however when coupled
> > cpuidle is used it can happen after the coupled CPUs have all agreed on
> > a particular idle state, resulting in only one of the CPUs falling back
> > to a shallower state, and an attempt to couple two completely different
> > idle states which may not be safe.
> > 
> > Therefore extend cpuidle_enter_state_coupled() to be able to handle the
> > enabling of broadcast ticks directly, so that a failure can be detected
> > at the higher level, and all coupled CPUs can be made to fall back to
> > the same idle state.
> > 
> > This takes place after the idle state has been initially agreed. Each
> > CPU will then attempt to enable broadcast ticks (if necessary), and upon
> > failure it will update the requested_state[] array before a second
> > coupled parallel barrier so that all coupled CPUs can recognise the
> > change.
> > 
> > Signed-off-by: James Hogan <jhogan@...nel.org>
> > Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>
> > Cc: Daniel Lezcano <daniel.lezcano@...aro.org>
> > Cc: Frederic Weisbecker <fweisbec@...il.com>
> > Cc: Thomas Gleixner <tglx@...utronix.de>
> > Cc: Ingo Molnar <mingo@...nel.org>
> > Cc: Preeti U Murthy <preeti@...ux.vnet.ibm.com>
> > Cc: Ralf Baechle <ralf@...ux-mips.org>
> > Cc: Paul Burton <paul.burton@...s.com>
> > Cc: linux-pm@...r.kernel.org
> > Cc: linux-kernel@...r.kernel.org
> > Cc: linux-mips@...ux-mips.org
> > ---
> > Is this an acceptable approach in principle?
> > 
> > Better/cleaner ideas to handle this are most welcome.
> > 
> > This doesn't directly address the problem that some of the time it won't
> > be possible to enter deeper idle states because of the hrtimer based
> > broadcast tick's affinity. The actual case I'm looking at is on MIPS
> > with cpuidle-cps, where the first core cannot (currently) go into a deep
> > idle state requiring broadcast ticks, so it'd be nice if the hrtimer
> > based broadcast tick device could just stay on core 0.
> > ---
> 
> Before commenting this patch, I would like to understand why the couple
> idle state is needed for the MIPS, what in the architecture forces the
> usage of the couple idle state?

Hardware multithreading.

Each physical core may have more than one hardware thread (VPE or VP in
MIPS lingo), each of which appears as a separate CPU to Linux. The lower
power states are all effective at the core level though:
- non-coherent wait - the hardware threads share physical caches, so
  coherency can only be turned off when all hardware threads are in a
  safe state, else they (1) wouldn't be coherent with the rest of the
  system and (2) could bring stuff into the cache which isn't kept
  coherent, requiring I presume a second cache flush.
- clock gated - must go non-coherent first, and applies to the whole
  core and all the physical resources shared by the VP(E)s.
- power gated - again must go non-coherent first, and applies to the
  whole core and all the physical resources shared by the VP(E)s.

> 
> The hrtimer broadcast mechanism is only needed if there isn't a backup
> timer outside of the idle state's power domain. That's the case on this
> platform?

I believe there is an external timer, and I believe we recommend
customers implement one for use as a clocksource anyway (in case of
frequency scaling), but on this particular platform the driver isn't
upstream yet.

If its something that shouldn't be supported in Linux, perhaps a simple
WARN_ON is the better approach (i.e. if the broadcast tick can't be
enabled in the current place and its a coupled idle state), though it
does at least seem to work now with this and a couple of other patches
(though I haven't been able to take any power measurements yet).

Cheers
James

Download attachment "signature.asc" of type "application/pgp-signature" (834 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ