lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 28 Jun 2021 11:25:48 +0900
From:   Jung Daehwan <dh10.jung@...sung.com>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     Mathias Nyman <mathias.nyman@...el.com>, linux-usb@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: usb: host: Reduce xhci_handshake timeout in xhci_reset

On Tue, Jun 22, 2021 at 09:56:20PM +0200, Greg Kroah-Hartman wrote:
> On Tue, Jun 22, 2021 at 08:24:56PM +0900, Daehwan Jung wrote:
> > It seems 10 secs timeout is too long in general case. A core would wait for
> > 10 secs without doing other task and it can be happended on every device.
> 
> Only if the handshake does not come back sooner, right?

Yes, right.

> What is causing your device to timeout here?

Host Controller doesn't respond handshake. I don't know why and I ask HW team
to debug it.

> > It's better to reduce timeout for general case and use new quirk if needed.
>
> What new quirk?

I mean someone can add new quirk if one still needs long timeout. I guess 1 sec
seems enough but there're many kinds of devices.

> And why 1 second, where did that number come from?

It was 250 msecs before changed to 10 secs. There's no required minimum time in
xhci specification. 1 second is estimated number and it works well on my device.

> >
> > Signed-off-by: Daehwan Jung <dh10.jung@...sung.com>
> > ---
> >  drivers/usb/host/xhci.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> > index 9248ce8..0a1b6be 100644
> > --- a/drivers/usb/host/xhci.c
> > +++ b/drivers/usb/host/xhci.c
> > @@ -196,7 +196,7 @@ int xhci_reset(struct xhci_hcd *xhci)
> >  		udelay(1000);
> >
> >  	ret = xhci_handshake(&xhci->op_regs->command,
> > -			CMD_RESET, 0, 10 * 1000 * 1000);
> > +			CMD_RESET, 0, 1 * 1000 * 1000);
> >  	if (ret)
> >  		return ret;
> >
> > @@ -210,7 +210,7 @@ int xhci_reset(struct xhci_hcd *xhci)
> >  	 * than status until the "Controller Not Ready" flag is cleared.
> >  	 */
> >  	ret = xhci_handshake(&xhci->op_regs->status,
> > -			STS_CNR, 0, 10 * 1000 * 1000);
> > +			STS_CNR, 0, 1 * 1000 * 1000);
>
> With this change, what "goes faster"?  What is currently causing
> problems with your host controller that this timeout value actually
> matters?  Why is it failing?

I guess the root cause of it is from host controller, which it is HW.
Our HW engineer has been debugging it, but I haven't get any clue till now.
However, I think 10 secs timeout is too long and it can cause system problem.
That's why I want to change timeout value. A CPU core would not do anything but
waiting xhci reset for 10 secs with disabling irq like below.

usb_remove_hcd -> xhci_stop -> xhci_reset -> xhci_handshake

static void xhci_stop(struct usb_hcd *hcd)
{
        u32 temp;
        struct xhci_hcd *xhci = hcd_to_xhci(hcd);

        mutex_lock(&xhci->mutex);

        /* Only halt host and free memory after both hcds are removed */
        if (!usb_hcd_is_primary_hcd(hcd)) {
                mutex_unlock(&xhci->mutex);
                return;
        }

        xhci_dbc_exit(xhci);

        spin_lock_irq(&xhci->lock);           -> disable IRQ
        xhci->xhc_state |= XHCI_STATE_HALTED;
        xhci->cmd_ring_state = CMD_RING_STATE_STOPPED;
        xhci_halt(xhci);
        xhci_reset(xhci);                     -> 10 seconds timeout!
        spin_unlock_irq(&xhci->lock);

Best Regards,
Jung Daehwan

> thanks,
> 
> greg k-h
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ