[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250605001838.yw633sgpn2fr65kc@synopsys.com>
Date: Thu, 5 Jun 2025 00:18:49 +0000
From: Thinh Nguyen <Thinh.Nguyen@...opsys.com>
To: Mathias Nyman <mathias.nyman@...ux.intel.com>
CC: Thinh Nguyen <Thinh.Nguyen@...opsys.com>,
"mathias.nyman@...el.com" <mathias.nyman@...el.com>,
Roy Luo <royluo@...gle.com>,
"quic_ugoswami@...cinc.com" <quic_ugoswami@...cinc.com>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"michal.pecio@...il.com" <michal.pecio@...il.com>,
"linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH v1 1/2] usb: xhci: Skip xhci_reset in xhci_resume if xhci
is being removed
On Wed, Jun 04, 2025, Mathias Nyman wrote:
> On 29.5.2025 4.17, Thinh Nguyen wrote:
> > On Mon, May 26, 2025, Mathias Nyman wrote:
> > > On 24.5.2025 2.06, Thinh Nguyen wrote:
> > > > Hi Mathias, Roy,
> > > >
> > > > On Thu, May 22, 2025, Roy Luo wrote:
> > > > > xhci_reset() currently returns -ENODEV if XHCI_STATE_REMOVING is
> > > > > set, without completing the xhci handshake, unless the reset completes
> > > > > exceptionally quickly. This behavior causes a regression on Synopsys
> > > > > DWC3 USB controllers with dual-role capabilities.
> > > > >
> > > > > Specifically, when a DWC3 controller exits host mode and removes xhci
> > > > > while a reset is still in progress, and then attempts to configure its
> > > > > hardware for device mode, the ongoing, incomplete reset leads to
> > > > > critical register access issues. All register reads return zero, not
> > > > > just within the xHCI register space (which might be expected during a
> > > > > reset), but across the entire DWC3 IP block.
> > > > >
> > > > > This patch addresses the issue by preventing xhci_reset() from being
> > > > > called in xhci_resume() and bailing out early in the reinit flow when
> > > > > XHCI_STATE_REMOVING is set.
> > > > >
> > > > > Cc: stable@...r.kernel.org
> > > > > Fixes: 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper")
> > > > > Suggested-by: Mathias Nyman <mathias.nyman@...el.com>
> > > > > Signed-off-by: Roy Luo <royluo@...gle.com>
> > > > > ---
> > > > > drivers/usb/host/xhci.c | 5 ++++-
> > > > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> > > > > index 90eb491267b5..244b12eafd95 100644
> > > > > --- a/drivers/usb/host/xhci.c
> > > > > +++ b/drivers/usb/host/xhci.c
> > > > > @@ -1084,7 +1084,10 @@ int xhci_resume(struct xhci_hcd *xhci, bool power_lost, bool is_auto_resume)
> > > > > xhci_dbg(xhci, "Stop HCD\n");
> > > > > xhci_halt(xhci);
> > > > > xhci_zero_64b_regs(xhci);
> > > > > - retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
> > > > > + if (xhci->xhc_state & XHCI_STATE_REMOVING)
> > > > > + retval = -ENODEV;
> > > > > + else
> > > > > + retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
> > > >
> > > > How can this prevent the xhc_state from changing while in reset? There's
> > > > no locking in xhci-plat.
> > >
> > > Patch 2/2, which is the revert of 6ccb83d6c497 prevents xhci_reset() from
> > > aborting due to xhc_state flags change.
> > >
> > > This patch makes sure xHC is not reset twice if xhci is resuming due to
> > > remove being called. (XHCI_STATE_REMOVING is set).
> >
> > Wouldn't it still be possible for xhci to be removed in the middle of
> > reset on resume? The watchdog may still timeout afterward if there's an
> > issue with reset right?
> >
>
> Probably yes, but that problem is the same if we only revert 6ccb83d6c497.
>
> > > Why intentionally bring back the Qcom watchdog issue by only reverting
> > > 6ccb83d6c497 ?. Can't we solve both in one go?
> >
> > I feel that the fix is doesn't cover all the scenarios, that's why I
> > suggest the revert for now and wait until the fix is properly tested
> > before applying it to stable?
>
> Ok, we have different views on this.
>
> I think we should avoid causing as much known regression as possible even
> if the patch might not cover all scenarios.
>
> By reverting 6ccb83d6c497 we fix a SNPS DWC3 regression, but at the same
> time bring back the Qcom issue, so cause another regression.
>
> We can avoid the main part or the Qcom regression by adding this patch as
> issue is with (long) xhci reset during resume if xhci is being removed, and
> driver always resumes xhci during ->remove callback.
>
> If we discover the patch is not perfect then we fix it
>
Ok. Fair enough.
BR,
Thinh
Powered by blists - more mailing lists