Message-ID: <34d7d7c7b5914674b55a6dc21ced1190@realtek.com>
Date: Wed, 18 Oct 2023 11:40:50 +0000
From: Hayes Wang <hayeswang@...ltek.com>
To: Doug Anderson <dianders@...omium.org>
CC: Jakub Kicinski <kuba@...nel.org>,
"David S . Miller"
<davem@...emloft.net>,
Alan Stern <stern@...land.harvard.edu>,
Simon Horman
<horms@...nel.org>, Edward Hill <ecgh@...omium.org>,
Laura Nao
<laura.nao@...labora.com>,
"linux-usb@...r.kernel.org"
<linux-usb@...r.kernel.org>,
Grant Grundler <grundler@...omium.org>,
Bjørn Mork <bjorn@...k.no>,
Eric Dumazet
<edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH v3 5/5] r8152: Block future register access if register access fails
Doug Anderson <dianders@...omium.org>
> Sent: Tuesday, October 17, 2023 10:17 PM
[...]
> > That is, the loop would be broken when the fail rate of the control transfer is high or low enough.
> > Otherwise, you would queue a usb reset again and again.
> > For example, if the fail rate of the control transfer is 10% ~ 60%,
> > I think you have high probability to keep the loop continually.
> > Would it never happen?
>
> Actually, even with a failure rate of 10% I don't think you'll end up
> with a fully continuous loop, right? All you need is to get 3 failures
> in a row in rtl8152_get_version() to get out of the loop. So with a
> 10% failure rate you'd unbind/bind 1000 times (on average) and then
> (finally) give up. With a 50% failure rate I think you'd only
> unbind/bind 8 times on average, right? Of course, I guess 1000 loops
> is pretty close to infinite.
>
> In any case, we haven't actually seen hardware that fails like this.
> We've seen failure rates that are much much lower and we can imagine
> failure rates that are 100% if we've got really broken hardware. Do
> you think cases where failure rates are middle-of-the-road are likely?
That is my question, too.
I don't know whether anything would cause that situation, either.
This is beyond my knowledge.
I am waiting for a professional answer, too.
Many things could cause a control transfer to fail.
I don't have all of the real-world cases to analyze.
Therefore, all I can do is assume different situations.
You could say my hypotheses are unreasonable.
However, I have to tell you what worries me.
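As a rough check of the numbers quoted above, here is a minimal Python sketch.
It assumes each control transfer fails independently with the same probability,
and that each bind attempt gives up only when 3 transfers in a row fail in
rtl8152_get_version(). The function name and the independence assumption are
illustrative only; they are not taken from the driver.

```python
def expected_bind_cycles(fail_rate: float, tries: int = 3) -> float:
    """Mean number of bind attempts until `tries` transfers fail in a row.

    Simplified model (an assumption, not the driver's logic): every bind
    attempt makes `tries` independent transfers, and the loop only stops
    when all of them fail.
    """
    if not 0.0 < fail_rate <= 1.0:
        raise ValueError("fail_rate must be in (0, 1]")
    give_up = fail_rate ** tries  # chance that one attempt sees all tries fail
    return 1.0 / give_up          # mean of a geometric distribution


if __name__ == "__main__":
    for rate in (0.1, 0.5, 0.6):
        cycles = expected_bind_cycles(rate)
        print(f"{rate:.0%} failure rate: about {cycles:.0f} cycles on average")
```

Under this model, a 10% failure rate gives about 1000 cycles and a 50% failure
rate gives 8, which matches the estimates in the quoted text.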
> I would also say that nothing we can do can perfectly handle faulty
> hardware. If we're imagining theoretical hardware, we could imagine
> theoretical hardware that de-enumerated itself and re-enumerated
> itself every half second because the firmware on the device crashed or
> some regulator kept dropping. This faulty hardware would also cause an
> infinite loop of de-enumeration and re-enumeration, right?
>
> Presumably if we get into either case, the user will realize that the
> hardware isn't working and will unplug it from the system. While the
Some of our devices are onboard; that is, they cannot be unplugged.
That is why I have to consider many situations.
> system is doing the loop of trying to enumerate the hardware, it will
> be taking up a bunch of extra CPU cycles but (I believe) it won't be
> fully locked up or anything. The machine will still function and be
> able to do non-Ethernet activities, right? I would say that the worst
> thing about this state would be that it would stress corner cases in
> the reset of the USB subsystem, possibly tickling bugs.
>
> So I guess I would summarize all the above as:
>
> If hardware is broken in just the right way then this patch could
> cause a nearly infinite unbinding/rebinding of the r8152 driver.
> However:
>
> 1. It doesn't seem terribly likely for hardware to be broken in just this way.
>
> 2. We haven't seen hardware broken in just this way.
>
> 3. Hardware broken in a slightly different way could cause infinite
> unbinding/rebinding even without this patch.
>
> 4. Infinite unbinding/rebinding of a USB adapter isn't great, but not
> the absolute worst thing.
It is fine if everyone agrees with these points.
Best Regards,
Hayes