[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56B876D5.9030805@oracle.com>
Date: Mon, 8 Feb 2016 12:07:01 +0100
From: Vegard Nossum <vegard.nossum@...cle.com>
To: Marek Szyprowski <m.szyprowski@...sung.com>,
Ruslan Bilovol <ruslan.bilovol@...il.com>
Cc: Maxime Ripard <maxime.ripard@...e-electrons.com>,
Peter Chen <peter.chen@...escale.com>,
Felipe Balbi <balbi@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
LKML <linux-kernel@...r.kernel.org>,
"linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>
Subject: Re: gadgetfs regression (NULL ptr deref) since v4.4-rc7
(fixed Felipe Balbi e-mail address)
On 02/08/2016 11:19 AM, Marek Szyprowski wrote:
> Hello,
>
> On 2016-02-08 10:46, Ruslan Bilovol wrote:
>> Hi Vegard,
>>
>> On Mon, Feb 8, 2016 at 1:27 AM, Vegard Nossum
>> <vegard.nossum@...cle.com> wrote:
>>> Hi,
>>>
>>> Using gadgetfs on latest mainline, I get the following NULL pointer
>>> dereference:
>>>
>>> BUG: unable to handle kernel NULL pointer dereference at
>>> (null)
>>> IP: [<ffffffff81138e89>] __list_del_entry+0x29/0xc0
>>> PGD 34f067 PUD 355067 PMD 0
>>> Oops: 0000 [#1] DEBUG_PAGEALLOC
>>> CPU: 0 PID: 35 Comm: afl-fuzz Not tainted 4.5.0-rc2 #1
>>> task: ffff8800002b1ec0 ti: ffff88000033c000 task.ti: ffff88000033c000
>>> RIP: 0010:[<ffffffff81138e89>] [<ffffffff81138e89>]
>>> __list_del_entry+0x29/0xc0
>>> RSP: 0018:ffff88000033fe30 EFLAGS: 00010207
>>> RAX: 0000000000000000 RBX: ffffffff8138c108 RCX: dead000000000200
>>> RDX: 0000000000000000 RSI: 0000000000000061 RDI: ffffffff8138c108
>>> RBP: ffff88000033fe30 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000246 R12: ffffffff8138b928
>>> R13: ffffffff8138c040 R14: ffff88000ec0da20 R15: ffff88000033ff58
>>> FS: 00007ffff7ff2740(0000) GS:ffffffff8135d000(0000)
>>> knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 0000000000000000 CR3: 0000000000335000 CR4: 00000000001406b0
>>> Stack:
>>> ffff88000033fe48 ffffffff81138f2d ffffffff8138bbd0 ffff88000033fe70
>>> ffffffff811c880b ffff8800002f3000 ffff88000ec109a0 ffff880000298aa0
>>> ffff88000033fe88 ffffffff811ce040 ffff88000034cdc0 ffff88000033feb8
>>> Call Trace:
>>> [<ffffffff81138f2d>] list_del+0xd/0x30
>>> [<ffffffff811c880b>] usb_gadget_unregister_driver+0xdb/0xf0
>>> [<ffffffff811ce040>] dev_release+0x20/0x60
>>> [<ffffffff810b464c>] __fput+0x4c/0x130
>>> [<ffffffff810b4769>] ____fput+0x9/0x10
>>> [<ffffffff81048577>] task_work_run+0x67/0xa0
>>> [<ffffffff81000dcf>] exit_to_usermode_loop+0x8f/0xa0
>>> [<ffffffff8100105d>] syscall_return_slowpath+0x3d/0x40
>>> [<ffffffff81278b89>] int_ret_from_sys_call+0x25/0x8f
>>> Code: ff ff 55 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b 57 08 48
>>> 89 e5
>>> 48 39 c8 74 29 48 b9 00 02 00 00 00 00 ad de 48 39 ca 74 3a <4c> 8b
>>> 02 4c 39
>>> c7 75 52 4c 8b 40 08 4c 39 c7 75 66 48 89 50 08
>>> RIP [<ffffffff81138e89>] __list_del_entry+0x29/0xc0
>>> RSP <ffff88000033fe30>
>>> CR2: 0000000000000000
>>> ---[ end trace e6cfe1de661dcffe ]---
>>>
>>> Reverting this commit fixes the problem for me:
>>>
>>> commit 855ed04a3758b205e84b269f92d26ab36ed8e2f7
>>> Author: Ruslan Bilovol <ruslan.bilovol@...il.com>
>>> Date: Mon Nov 23 09:56:38 2015 +0100
>>>
>>> usb: gadget: udc-core: independent registration of gadgets and
>>> gadget
>>> drivers
>>>
>>> Though I am still seeing some occasional lockups. Thanks,
>>>
>> Original version of this patch had checking of input parameters
>> (see change in usb_gadget_unregister_driver at
>> https://lkml.org/lkml/2015/6/22/559) but this approach was
>> rejected by Alan Stern many times so final version pushed by
>> Marek Szyprowski doesn't have it, and we have this NULL pointer
>> dereference.
>> But this means there is a bug in GadgetFS that was hidden
>> previously: it tries to deregister gadget driver that was not previously
>> registered. Previous implementation was returning -ENODEV
>> retval, current one just does NULL pointer dereference.
>>
>> At least need to fix GadgetFS that seems to by buggy in
>> gadget driver unregistering, and I still say we need to nave
>> input parameters checking in externally visible functions like
>> usb_gadget_unregister_driver().
>>
>> Meanwhile you can try following patch that returns checking of input
>> parameters back and see if it helps.
>>
>> Best regards,
>> Ruslan
>>
>> --------
>>
>> diff --git a/drivers/usb/gadget/udc/udc-core.c
>> b/drivers/usb/gadget/udc/udc-core.c
>> index fd73a3e..f1d375f 100644
>> --- a/drivers/usb/gadget/udc/udc-core.c
>> +++ b/drivers/usb/gadget/udc/udc-core.c
>> @@ -597,9 +597,16 @@ int usb_gadget_unregister_driver(struct
>> usb_gadget_driver *driver)
>> }
>>
>> if (ret) {
>> - list_del(&driver->pending);
>> - ret = 0;
>> + struct usb_gadget_driver *tmp;
>> +
>> + list_for_each_entry(tmp, &gadget_driver_pending_list,
>> pending)
>> + if (tmp == driver) {
>> + list_del(&driver->pending);
>> + ret = 0;
>> + break;
>> + }
>> }
>> +
>> mutex_unlock(&udc_lock);
>> return ret;
>> }
>
So the patch *seems* to fix the problem I was seeing. However...
> Sorry, but this is not the solution. If gadgetfs is broken, then it
> should be fixed. There is no point in adding workarounds in core just
> because there are drivers that can be broken.
Even if it's not a fix per se, why not modify it to throw a warning
instead of crashing? If I understand correctly, something like
+WARN_ON_ONCE(ret);
after the loop in Ruslan's patch should be enough. I added a BUG_ON(ret)
(instead of WARN_ON_ONCE) and I got this:
------------[ cut here ]------------
kernel BUG at drivers/usb/gadget/udc/udc-core.c:622!
invalid opcode: 0000 [#1] DEBUG_PAGEALLOC KASAN
CPU: 0 PID: 36 Comm: afl-fuzz Not tainted 4.5.0-rc2 #1
task: ffff8800003b6900 ti: ffff88000c840000 task.ti: ffff88000c840000
RIP: 0010:[<ffffffff8152395b>] [<ffffffff8152395b>]
usb_gadget_unregister_driver+0x18b/0x2d0
RSP: 0018:ffff88000c847de8 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffffffff81a117d8 RCX: 1ffff10001908fa6
RDX: 0000000000000001 RSI: 0000000000000061 RDI: ffff88000039c418
RBP: ffff88000c847e10 R08: 0000000000000246 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81a118e0
R13: ffffffff81a13f40 R14: dffffc0000000000 R15: ffff88000c83ca00
FS: 00007ffff7ff2740(0000) GS:ffffffff8193f000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffff78c53a0 CR3: 000000000c850000 CR4: 00000000001406b0
Stack:
ffffffff817f62a0 ffff880000277a40 ffff88000c83ca10 ffff88000c83ca20
ffff88000c83ca18 ffff88000c847e30 ffffffff81533bb4 ffff88000cc25758
ffff88000c83ca10 ffff88000c847e88 ffffffff811f0ccb 00007ffff780f0c0
Call Trace:
[<ffffffff81533bb4>] dev_release+0x44/0x110
[<ffffffff811f0ccb>] __fput+0x11b/0x490
[<ffffffff811f10a9>] ____fput+0x9/0x10
[<ffffffff810c8881>] task_work_run+0xf1/0x190
[<ffffffff811ea9ea>] ? filp_close+0x8a/0xe0
[<ffffffff81001c3c>] exit_to_usermode_loop+0xec/0x100
[<ffffffff81002531>] syscall_return_slowpath+0x91/0xc0
[<ffffffff817a4309>] int_ret_from_sys_call+0x25/0x8f
Code: f8 48 c1 e8 03 42 80 3c 20 00 0f 85 1b 01 00 00 48 8b 83 c8 00 00
00 48 3d a0 18 a1 81 48 8d 98 38 ff ff ff 75 be e8 45 b6 e6 ff <0f> 0b
e8 3e b6 e6 ff 48 89 df e8 26 d0 ff ff 48 8d 7b 08 48 b8
RIP [<ffffffff8152395b>] usb_gadget_unregister_driver+0x18b/0x2d0
RSP <ffff88000c847de8>
---[ end trace 875139a9f2b43c09 ]---
> I was still not able to reproduce this issue (tried with dwc2, gadgetfs
> and ptp gadget daemon). Vegard could you prepare a step-by-step
> instruction how to reproduce this problem?
It's a bit complicated to reproduce (and not 100% deterministic) since I
am running into this using a fuzzer.
But I am just using dummy_hcd in the following way:
- mount gadgetfs
- open dummy_hcd file
- read/write some data
- close dummy_hcd file
- unmount gadgetfs
I will write back if I find an easy way to reproduce it. In the meantime
I can test patches easily.
Vegard
Powered by blists - more mailing lists