linux-kernel - Re: [PATCH] USB:bugfix a controller halt error

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <bfee90c1-a7ca-27e3-88f9-936f48cd2595@huawei.com>
Date:   Wed, 26 Jul 2023 14:58:18 +0800
From:   liulongfang <liulongfang@...wei.com>
To:     Alan Stern <stern@...land.harvard.edu>
CC:     <gregkh@...uxfoundation.org>, <linux-usb@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] USB:bugfix a controller halt error

On 2023/7/21 22:57, Alan Stern Wrote:
> On Fri, Jul 21, 2023 at 06:00:15PM +0800, liulongfang wrote:
>> On systems that use ECC memory. The ECC error of the memory will
>> cause the USB controller to halt. It causes the usb_control_msg()
>> operation to fail.
> 
> How often does this happen in real life?  (Besides, it seems to me that 
> if your system is getting a bunch of ECC memory errors then you've got 
> much worse problems than a simple USB failure!)
>

This problem is on ECC memory platform.
In the test scenario, the problem is 100% reproducible.

> And why do you worry about ECC memory failures in particular?  Can't 
> _any_ kind of failure cause the usb_control_msg() operation to fail?
> 
>> At this point, the returned buffer data is an abnormal value, and
>> continuing to use it will lead to incorrect results.
> 
> The driver already contains code to check for abnormal values.  The 
> check is not perfect, but it should prevent things from going too 
> badly wrong.
>

If it is ECC memory error. These parameter checks would also
actually be invalid.

>> Therefore, it is necessary to judge the return value and exit.
>>
>> Signed-off-by: liulongfang <liulongfang@...wei.com>
> 
> There is a flaw in your reasoning.
> 
> The operation carried out here is deliberately unsafe (for full-speed 
> devices).  It is made before we know the actual maxpacket size for ep0, 
> and as a result it might return an error code even when it works okay.  
> This shouldn't happen, but a lot of USB hardware is unreliable.
> 
> Therefore we must not ignore the result merely because r < 0.  If we do 
> that, the kernel might stop working with some devices.
> 
It may be that the handling solution for ECC errors is different from that
of the OS platform. On the test platform, after usb_control_msg() fails,
reading the memory data of buf will directly lead to kernel crash:

[ T14] Call trace:
[ T14] hub_port_init+0x280/0x9f0
[ T14] hub_port_connect+0x1d4/0xa40
[ T14] hub_port_connect_change+0xb8/0x2b0
[ T14] port_event+0x430/0x5d0
[ T14] hub_event+0x138/0x4a0
[ T14] process_one_work+0x1c8/0x39c
[ T14] worker_thread+0x150/0x3d0
[ T14] kthread+0xfc/0x130
[ T14] ret_from_fork+0x10/0x18
[ T14] Code: 528000c2 b9007fea 94002c9a b9407fea (39401f41)

thanks,
Longfang.
> Alan Stern
> 
>> ---
>>  drivers/usb/core/hub.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
>> index a739403a9e45..6a43198be263 100644
>> --- a/drivers/usb/core/hub.c
>> +++ b/drivers/usb/core/hub.c
>> @@ -4891,6 +4891,16 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
>>  					USB_DT_DEVICE << 8, 0,
>>  					buf, GET_DESCRIPTOR_BUFSIZE,
>>  					initial_descriptor_timeout);
>> +				/* On systems that use ECC memory, ECC errors can
>> +				 * cause the USB controller to halt.
>> +				 * It causes this operation to fail. At this time,
>> +				 * the buf data is an abnormal value and needs to be exited.
>> +				 */
>> +				if (r < 0) {
>> +					kfree(buf);
>> +					goto fail;
>> +				}
>> +
>>  				switch (buf->bMaxPacketSize0) {
>>  				case 8: case 16: case 32: case 64: case 255:
>>  					if (buf->bDescriptorType ==
>> -- 
>> 2.24.0
>>
> 
> .
>