lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bf6d3b1f-f703-4a25-833e-972a44a04114@sholland.org>
Date:   Mon, 21 Nov 2022 18:45:25 -0600
From:   Samuel Holland <samuel@...lland.org>
To:     Palmer Dabbelt <palmer@...osinc.com>,
        Conor Dooley <conor@...nel.org>
Cc:     anup@...infault.org, rafael@...nel.org, daniel.lezcano@...aro.org,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>, aou@...s.berkeley.edu,
        linux-pm@...r.kernel.org, linux-riscv@...ts.infradead.org,
        linux-kernel@...r.kernel.org, linux@...osinc.com
Subject: Re: [PATCH] cpuidle: riscv-sbi: Stop using non-retentive suspend

Hi Palmer,

On 11/21/22 14:56, Palmer Dabbelt wrote:
> From: Palmer Dabbelt <palmer@...osinc.com>
> 
> As per [1], whether or not the core can wake up from non-retentive
> suspend is a platform-specific detail.  We don't have any way to encode
> that, so just stop using them until we've sorted that out.

We do have *exactly* a way to encode that. Specifically, the existence
or non-existence of a non-retentive CPU suspend state in the DT.

If a hart has no way of resuming from non-retentive suspend (i.e. a
complete lack of interrupt delivery in non-retentive suspend), then it
makes zero sense to advertise such a suspend state in the DT. Therefore,
if the state exists in the DT, you can assume there is *some* interrupt
that can wake up the hart. And I would extend that to assume at least
one of those wakeup-capable interrupts is a timer interrupt, although
not necessarily the architectural timer interrupt.

> Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98#issuecomment-1288564687

This comment refers specifically to the architectural timer interrupt,
not interrupts more generally.

> Fixes: 6abf32f1d9c5 ("cpuidle: Add RISC-V SBI CPU idle driver")
> Signed-off-by: Palmer Dabbelt <palmer@...osinc.com>
> 
> ---
> 
> This should allow us to revert 232ccac1bd9b ("clocksource/drivers/riscv:
> Events are stopped during CPU suspend"), which fixes suspend on the D1
> but breaks timers everywhere.

I understand that reverting 232ccac1bd9b is the easiest way to fix the
issues you and others are seeing. I do not have any functioning RISC-V
hardware with SMP, so it is hard for me to help debug the root issue in
the Linux timer code. I do not know why turning on the C3STOP flag
breaks RCU stall detection or clock_nanosleep(), but I agree that
breakage should not happen.

So while I still think 232ccac1bd9b is the right change to make from a
"following the spec" standpoint, I am okay with reverting it for
pragmatic reasons. Since the D1 has another timer driver that is
currently used in preference to the RISC-V/SBI timer triver, reverting
232ccac1bd9b does not break non-retentive suspend for the D1.

But please do not make the change below. It is unnecessarily broad, and
will break something that works fine right now. If the DT advertises a
CPU suspend state that cannot be woken up from at all, then that is a
bug in the DT. Linux should not try to work around that.

So revert 232ccac1bd9b for now, and we can figure out what to do about
the DT property, but please do not merge this.

Regards,
Samuel

> ---
>  drivers/cpuidle/cpuidle-riscv-sbi.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
> index 05fe2902df9a..9d1063a54495 100644
> --- a/drivers/cpuidle/cpuidle-riscv-sbi.c
> +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
> @@ -214,6 +214,17 @@ static bool sbi_suspend_state_is_valid(u32 state)
>  	if (state > SBI_HSM_SUSPEND_NON_RET_DEFAULT &&
>  	    state < SBI_HSM_SUSPEND_NON_RET_PLATFORM)
>  		return false;
> +
> +	/*
> +	 * Whether or not RISC-V systems deliver interrupts to harts in a
> +	 * non-retentive suspend state is a platform-specific detail.  This can
> +	 * leave the hart unable to wake up, so just mark these states as
> +	 * unsupported until we have a mechanism to expose these
> +	 * platform-specific details to Linux.
> +	 */
> +	if (state & SBI_HSM_SUSP_NON_RET_BIT)
> +		return false;
> +
>  	return true;
>  }
>  

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ