netdev - RE: [PATCH v3 5/5] r8152: Block future register access if register access fails

Message-ID: <34d7d7c7b5914674b55a6dc21ced1190@realtek.com>
Date: Wed, 18 Oct 2023 11:40:50 +0000
From: Hayes Wang <hayeswang@...ltek.com>
To: Doug Anderson <dianders@...omium.org>
CC: Jakub Kicinski <kuba@...nel.org>,
        "David S . Miller"
	<davem@...emloft.net>,
        Alan Stern <stern@...land.harvard.edu>,
        Simon Horman
	<horms@...nel.org>, Edward Hill <ecgh@...omium.org>,
        Laura Nao
	<laura.nao@...labora.com>,
        "linux-usb@...r.kernel.org"
	<linux-usb@...r.kernel.org>,
        Grant Grundler <grundler@...omium.org>,
        Bjørn Mork <bjorn@...k.no>,
        Eric Dumazet
	<edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH v3 5/5] r8152: Block future register access if register access fails

Doug Anderson <dianders@...omium.org>
> Sent: Tuesday, October 17, 2023 10:17 PM
[...]
> > That is, the loop would be broken only when the failure rate of the control transfer is high enough or low enough.
> > Otherwise, you would queue a USB reset again and again.
> > For example, if the failure rate of the control transfer is 10% ~ 60%,
> > I think there is a high probability that the loop keeps going.
> > Could that never happen?
> 
> Actually, even with a failure rate of 10% I don't think you'll end up
> with a fully continuous loop, right? All you need is to get 3 failures
> in a row in rtl8152_get_version() to get out of the loop. So with a
> 10% failure rate you'd unbind/bind 1000 times (on average) and then
> (finally) give up. With a 50% failure rate I think you'd only
> unbind/bind 8 times on average, right? Of course, I guess 1000 loops
> is pretty close to infinite.
> 
> In any case, we haven't actually seen hardware that fails like this.
> We've seen failure rates that are much much lower and we can imagine
> failure rates that are 100% if we've got really broken hardware. Do
> you think cases where failure rates are middle-of-the-road are likely?
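
The averages quoted above (about 1000 unbind/bind cycles at a 10% failure rate, about 8 at 50%) can be checked with a short calculation. This is only a sketch under a simplifying assumption: every control transfer fails independently with the same probability, and each bind gives up exactly when all 3 attempts in rtl8152_get_version() fail. The function name expected_binds and its parameters are hypothetical and are not part of the driver.

```python
def expected_binds(failure_rate: float, consecutive_failures: int = 3) -> float:
    """Average number of unbind/bind cycles before the driver gives up.

    Assumption (not from the driver source): each control transfer fails
    independently with probability `failure_rate`, and one bind gives up
    only when `consecutive_failures` transfers in a row all fail.
    """
    if not 0.0 < failure_rate <= 1.0:
        raise ValueError("failure_rate must be in (0, 1]")
    # Probability that a single bind hits the give-up condition.
    give_up_probability = failure_rate ** consecutive_failures
    # The number of binds until the first give-up is geometric,
    # so its mean is 1 / give_up_probability.
    return 1.0 / give_up_probability


if __name__ == "__main__":
    for rate in (0.10, 0.50, 0.60, 1.00):
        print(f"failure rate {rate:.0%}: ~{round(expected_binds(rate))} binds on average")
```

Under the same assumption, a failure rate of 100% gives up on the first bind, while very low failure rates rarely trigger a reset in the first place, which is why only the middle range leads to long loops.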

That is my question, too.
I don't know whether anything would cause that situation, either.
This is beyond my knowledge.
I am waiting for an expert answer, too.

Many things could cause a control transfer to fail.
I don't have all of the real-world cases available to analyze.
Therefore, all I can do is assume different situations.
You could say my hypotheses are unreasonable.
However, I have to tell you what worries me.

> I would also say that nothing we can do can perfectly handle faulty
> hardware. If we're imagining theoretical hardware, we could imagine
> theoretical hardware that de-enumerated itself and re-enumerated
> itself every half second because the firmware on the device crashed or
> some regulator kept dropping. This faulty hardware would also cause an
> infinite loop of de-enumeration and re-enumeration, right?
> 
> Presumably if we get into either case, the user will realize that the
> hardware isn't working and will unplug it from the system. While the

Some of our devices are onboard. That is, they cannot be unplugged.
That is why I have to consider many different situations.

> system is doing the loop of trying to enumerate the hardware, it will
> be taking up a bunch of extra CPU cycles but (I believe) it won't be
> fully locked up or anything. The machine will still function and be
> able to do non-Ethernet activities, right? I would say that the worst
> thing about this state would be that it would stress corner cases in
> the reset of the USB subsystem, possibly tickling bugs.
> 
> So I guess I would summarize all the above as:
> 
> If hardware is broken in just the right way then this patch could
> cause a nearly infinite unbinding/rebinding of the r8152 driver.
> However:
> 
> 1. It doesn't seem terribly likely for hardware to be broken in just this way.
> 
> 2. We haven't seen hardware broken in just this way.
> 
> 3. Hardware broken in a slightly different way could cause infinite
> unbinding/rebinding even without this patch.
> 
> 4. Infinite unbinding/rebinding of a USB adapter isn't great, but not
> the absolute worst thing.

It is fine if everyone agrees with these points.

Best Regards,
Hayes