Message-Id: <D6WN0T0DLMFJ.30GP099520IHT@bootlin.com>
Date: Wed, 08 Jan 2025 11:59:27 +0100
From: Théo Lebrun <theo.lebrun@...tlin.com>
To: Théo Lebrun <theo.lebrun@...tlin.com>, "Roger Quadros"
 <rogerq@...nel.org>, "Peter Chen" <peter.chen@...nel.org>, "Pawel Laszczak"
 <pawell@...ence.com>, "Greg Kroah-Hartman" <gregkh@...uxfoundation.org>,
 "Mathias Nyman" <mathias.nyman@...el.com>
Cc: Grégory Clement <gregory.clement@...tlin.com>, "Thomas
 Petazzoni" <thomas.petazzoni@...tlin.com>, <linux-usb@...r.kernel.org>,
 <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v6 4/5] xhci: introduce xhci->lost_power flag

On Wed Dec 18, 2024 at 6:49 PM CET, Théo Lebrun wrote:
> On Tue Dec 17, 2024 at 10:00 PM CET, Roger Quadros wrote:
> > On 13/12/2024 18:03, Théo Lebrun wrote:
> > > On Thu Dec 12, 2024 at 1:37 PM CET, Roger Quadros wrote:
> > >> On 10/12/2024 19:13, Théo Lebrun wrote:
> > >>> The XHCI_RESET_ON_RESUME quirk allows wrappers to signal that they
> > >>> expect a reset after resume. It is also used by some to enforce a XHCI
> > >>> reset on resume (see needs-reset-on-resume DT prop).
> > >>>
> > >>> Some wrappers are unsure beforehands if they will reset. Add a mechanism
> > >>> to signal *at resume* if power has been lost. Parent devices can set
> > >>> this flag, that defaults to false.
> > >>>
> > >>> The XHCI_RESET_ON_RESUME quirk still triggers a runtime_pm_get() on the
> > >>> controller. This is required as we do not know if a suspend will
> > >>> trigger a reset, so the best guess is to avoid runtime PM.
> > >>>
> > >>> Signed-off-by: Théo Lebrun <theo.lebrun@...tlin.com>
> > >>> ---
> > >>>  drivers/usb/host/xhci.c | 3 ++-
> > >>>  drivers/usb/host/xhci.h | 6 ++++++
> > >>>  2 files changed, 8 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> > >>> index 5ebde8cae4fc44cdb997b0f61314e309bda56c0d..ae2c8daa206a87da24d58a62b0a0485ebf68cdd6 100644
> > >>> --- a/drivers/usb/host/xhci.c
> > >>> +++ b/drivers/usb/host/xhci.c
> > >>> @@ -1017,7 +1017,8 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
> > >>>  
> > >>>  	spin_lock_irq(&xhci->lock);
> > >>>  
> > >>> -	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME || xhci->broken_suspend)
> > >>> +	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME ||
> > >>> +	    xhci->broken_suspend || xhci->lost_power)
> > >>>  		reinit_xhc = true;
> > >>>  
> > >>>  	if (!reinit_xhc) {
> > >>> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> > >>> index 4914f0a10cff42dbc1448dcf7908534d582c848e..32526df75925989d40cbe7d59a187c945f498a30 100644
> > >>> --- a/drivers/usb/host/xhci.h
> > >>> +++ b/drivers/usb/host/xhci.h
> > >>> @@ -1645,6 +1645,12 @@ struct xhci_hcd {
> > >>>  	unsigned		broken_suspend:1;
> > >>>  	/* Indicates that omitting hcd is supported if root hub has no ports */
> > >>>  	unsigned		allow_single_roothub:1;
> > >>> +	/*
> > >>> +	 * Signal from upper stacks that we lost power during system-wide
> > >>> +	 * suspend. Its default value is based on XHCI_RESET_ON_RESUME, meaning
> > >>> +	 * it is safe for wrappers to not modify lost_power at resume.
> > >>> +	 */
> > >>> +	unsigned                lost_power:1;
> > >>
> > >> I suppose this is private to XHCI driver and not legitimate to be accessed
> > >> by another driver after HCD is instantiated?
> > > 
> > > Yes it is private.
> > > 
> > >> Doesn't access to xhci_hcd need to be serialized via xhci->lock?
> > > 
> > > Good question. In theory maybe. In practice I don't see how
> > > cdns_host_resume(), called by cdns_resume(), could clash with anything
> > > else. I'll add that to be safe.
> > > 
> > >> Just curious, what happens if you don't include patch 4 and 5?
> > >> Is USB functionality still broken for you?
> > > 
> > > No it works fine. Patches 4+5 are only there to avoid the below warning.
> > > Logging "xHC error in resume" is a lie, so I want to avoid it.
> >
> > How is it a lie?
> > The XHCI controller did lose its save/restore state during a PM operation.
> > As far as XHCI is concerned this is an error, no?
>
> The `xhci->quirks & XHCI_RESET_ON_RESUME` is exactly the same thing;
> both the quirk and the flag we add serve two purposes:
>
> 1. skipping this block
>
> 	if (!reinit_xhc) {
> 		retval = xhci_handshake(&xhci->op_regs->status,
> 					STS_CNR, 0, 10 * 1000 * 1000);
> 		// ...
> 		xhci_restore_registers(xhci);
> 		xhci_set_cmd_ring_deq(xhci);
> 		command = readl(&xhci->op_regs->command);
> 		command |= CMD_CRS;
> 		writel(command, &xhci->op_regs->command);
> 		if (xhci_handshake(&xhci->op_regs->status,
> 			      STS_RESTORE, 0, 100 * 1000)) {
> 			// ...
> 		}
> 	}
>
> 2. avoiding this warning:
>
> 	xhci_warn(xhci, "xHC error in resume, USBSTS 0x%x, Reinit\n", temp);
>
> I don't think the block skipped is important in resume duration (to be
> confirmed). But the xhci_warn() is not desired: we do not want to log
> warnings if we know it is expected.
>
> I'll think some more about it.

About this series, there were two discussions:

 - The desire to avoid putting the hardware init sequence of cdns3-ti
   inside runtime_resume(), as we don't do runtime PM.
   *That is fine and will be fixed for the next revision.*
   See [PATCH V6 2/5]: https://lore.kernel.org/lkml/8a1ed4be-c41c-46b6-ae25-33a6035b8c8d@kernel.org/

 - [PATCH V6 4/5] and [PATCH V6 5/5] are dedicated to avoiding a warning
   at XHCI resume on J7200:

      xhci_warn(xhci, "xHC error in resume, USBSTS 0x%x, Reinit\n", temp);

   https://lore.kernel.org/lkml/20241210-s2r-cdns-v6-4-28a17f9715a2@bootlin.com/
   https://lore.kernel.org/lkml/20241210-s2r-cdns-v6-5-28a17f9715a2@bootlin.com/

   Roger Quadros asked whether we should keep the warning instead, as
   there is indeed a reinit of the xHC block. I don't think we should:
   a warning at resume would imply the reinit was unexpected.

   As evidence, there is already a platform with a ->broken_suspend
   boolean that disables the warning even though there is a reinit. No
   warning is logged because the reinit was expected.

   So we currently have:
    - xhci->broken_suspend: set at probe & implies the resume sequence
      won't work.
    - xhci->quirks & XHCI_RESET_ON_RESUME: set at probe & implies the
      controller reset during suspend.

   IIUC xhci->broken_suspend is NOT equivalent to "the controller reset
   during suspend". Else we wouldn't have both the broken_suspend flag
   and the XHCI_RESET_ON_RESUME quirk.

   In our case we want exactly the same thing as the
   XHCI_RESET_ON_RESUME quirk but updated at resume depending on what
   the wrapper driver detects.

   We could either:
   1. Update xhci->quirks at resume from upper layers.
   2. Introduce an xhci->lost_power flag. It would be strictly equivalent
      to the XHCI_RESET_ON_RESUME quirk BUT updated at resume from
      upper layers.
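
   To make option (2) concrete, here is a minimal user-space sketch of
   the reinit decision. It is only a model, not kernel code: struct
   model_xhci, MODEL_RESET_ON_RESUME, model_needs_reinit() and
   model_wrapper_resume() are hypothetical names made up for
   illustration. Only the shape of the condition is taken from the
   xhci_resume() hunk quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the XHCI_RESET_ON_RESUME quirk bit. */
#define MODEL_RESET_ON_RESUME (1u << 0)

/* Hypothetical model; this is NOT the kernel's struct xhci_hcd. */
struct model_xhci {
	unsigned int quirks;            /* decided once, at probe */
	unsigned int broken_suspend:1;  /* decided once, at probe */
	unsigned int lost_power:1;      /* option (2): refreshed at each resume */
};

/*
 * Same shape as the condition in the patched xhci_resume():
 * any of the four inputs forces a full reinit of the controller.
 */
static bool model_needs_reinit(const struct model_xhci *x, bool hibernated)
{
	return hibernated || (x->quirks & MODEL_RESET_ON_RESUME) ||
	       x->broken_suspend || x->lost_power;
}

/*
 * Hypothetical wrapper resume hook: the wrapper records what it
 * detected before the xHCI resume path runs.
 */
static void model_wrapper_resume(struct model_xhci *x, bool power_was_lost)
{
	assert(x != NULL);
	x->lost_power = power_was_lost;
}
```

   In this model, a wrapper that never touches lost_power keeps the
   current behaviour, and one that sets it gets the same reinit path as
   the quirk, but decided per resume rather than at probe.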

   @Mathias Nyman: what are your thoughts on the matter? Option (2) is
   the one taken in this series. Is there another option I am missing?

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

