[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <dff9f24a-4aa2-45aa-920f-20876cf4ccbf@rowland.harvard.edu>
Date: Sun, 17 Nov 2024 16:02:49 -0500
From: Alan Stern <stern@...land.harvard.edu>
To: Markus Rechberger <linuxusb.ml@...dtek.de>
Cc: linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Highly critical bug in XHCI Controller
On Sun, Nov 17, 2024 at 11:47:52PM +0800, Markus Rechberger wrote:
> On Sun, 2024-11-17 at 10:18 -0500, Alan Stern wrote:
> > On Sun, Nov 17, 2024 at 08:44:16PM +0800, Markus Rechberger wrote:
> > > Basically the issue comes from hub_port_connect.
> > >
> > > drivers/usb/core/hub.c
> > >
> > > hub_port_init returns -71 -EPROTO and jumps to loop
> > > https://github.com/torvalds/linux/blob/master/drivers/usb/core/hub.c#L5450
> > >
> > > I'd question if usb_ep0_reinit is really required in loop which is
> > > running following functions:
> This should only go down there in case an error happened earlier no?
Of course, since -EPROTO indicates there was an error.
> In case the hardware signed up correctly it should not even enter that
> code.
>
> My experience is just - reconnect the device in case an error happened
You can't reconnect some devices; they are permanently attached. There
are other reasons why reconnecting might not be practical.
> those
> workarounds did not work properly for the device I deal with (but yes
> that's
> why I'm asking - maybe someone else has different hardware with
> different
> experience).
I doubt we will hear from these people, because they will not realize
that anything was wrong.
In any case, it is generally recognized that the type of errors leading
to -EPROTO (bad CRC or no response, for instance) can be temporary or
intermittent. Retrying is an appropriate strategy.
> > > As a second issue I found in usb_reset_and_verify device
> > > https://github.com/torvalds/linux/blob/master/drivers/usb/core/hub.c#L6131
> > >
> > > ret = hub_port_init(parent_hub, udev, port1, i,
> > > &descriptor);
> > > if (ret >= 0 || ret == -ENOTCONN || ret == -ENODEV) {
> > > break;
> > > }
> > >
> > > hub_port_init can also return -71 / -EPROTO, the cases should be
> > > very
> > > rare when usb_reset_and_verify_device is triggered and that
> > > happens.
> >
> > If that happens, the loop which this code sits inside will simply
> > perform another iteration. That's what it's supposed to do, not an
> > issue at all.
> >
>
> It doesn't cause any issue yes but it's not correct either.
Why not? What's wrong with it?
> -EPROTO will be returned if I disconnect or short the device here. So
> initially someone
> thought he should check for -ENODEV/-ENOTCONN (which should also
> indicate that the device is gone), so -EPROTO should also be checked in
> that case.
> Otherwise just remove all those error checks.
-EPROTO does not necessarily indicate the device is gone; it indicates
there was a problem communicating with the device. The problem might be
caused by a disconnection, or it might be caused by temporary
electromagnetic interference (for example), or by something else.
Alan Stern
Powered by blists - more mailing lists