[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAAeHK+yA_CxNSNCY+mmg1MhijaPD02x_VXEqjF4TOsfy3z=Z+Q@mail.gmail.com>
Date: Mon, 12 Jun 2017 18:11:05 +0200
From: Andrey Konovalov <andreyknvl@...gle.com>
To: Alan Stern <stern@...land.harvard.edu>
Cc: Felipe Balbi <balbi@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Peter Chen <peter.chen@....com>,
Krzysztof Opasiak <k.opasiak@...sung.com>,
Colin Ian King <colin.king@...onical.com>,
Felix Hädicke <felixhaedicke@....de>,
Roger Quadros <rogerq@...com>,
USB list <linux-usb@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Dmitry Vyukov <dvyukov@...gle.com>,
Kostya Serebryany <kcc@...gle.com>,
syzkaller <syzkaller@...glegroups.com>
Subject: Re: usb/gadget: potential deadlock in gadgetfs_suspend
On Fri, Jun 9, 2017 at 9:14 PM, Alan Stern <stern@...land.harvard.edu> wrote:
> On Fri, 9 Jun 2017, Andrey Konovalov wrote:
>
>> On Fri, Jun 9, 2017 at 2:41 PM, Andrey Konovalov <andreyknvl@...gle.com> wrote:
>> > Hi,
>> >
>> > I'm getting some hangs while fuzzing the kernel with syzkaller.
>> >
>> > Possibly it happens during the execution of the following syzkaller program:
>> >
>> > mmap(&(0x7f0000000000/0xb90000)=nil, (0xb90000), 0x3, 0x32,
>> > 0xffffffffffffffff, 0x0)
>> > r0 = open$usb(&(0x7f0000001000)="2f6465762f6761646765742f64756d6d795f75646300",
>> > 0xc002, 0x0)
>> > r1 = open$usb(&(0x7f0000002000)="2f6465762f6761646765742f64756d6d795f75646300",
>> > 0x1, 0x102)
>> > write$usb(r1, &(0x7f0000003000)={0x0, {0x9, 0x2, 0x1b, 0x0, 0x5, 0x0,
>> > 0x80, 0x8, 0x9, 0x4, 0x10000000, 0xfffffefffffffff9, 0x1, 0xff, 0x0,
>
> I don't understand these large constants. They're supposed to be __u8
> values. Do they get truncated to the least significant byte?
This program doesn't lead to crashes, it was a bug in syzkaller. The
format keeps changing, I'll explain it if I send another program.
>
>> > 0x8, 0x80, [{0x9, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}]}, {0x12, 0x1,
>> > 0x0, 0x0, 0x4b5, 0x7c, 0x0, 0x3, 0x4, 0x0, 0x8, 0xd686, 0x0, 0x1}},
>> > 0x31)
>> >
>> > I haven't managed to get the exact same stack trace (or any at all
>> > actually) while trying to reproduce the bug with this program, but the
>> > kernel definitely hangs.
It seems that I get the stall reports quite rarely and I can't
reproduce them. However I get the "bad spinlock magic" crashes quite
often and able to reproduce them. So I can apply debug patches.
Reproducing sometimes requires quite some time (~10 minutes), so this
seems to be some kind of race. I noticed, that often I unmount
gadgetfs before the USB device has enough time to properly initialize,
so this might be a race of unmount vs device initialization or
something similar.
>
> Can you get a usbmon trace for the test?
I wasn't able to reproduce the crash with usbmon tracing enabled.
Maybe usbmon tracing affects timings somehow and the race is quite
less likely to be triggered. I'll keep trying though.
> And can you enable debugging for the usbcore module?
>
> echo "module usbcore =p" >/sys/kernel/debug/dynamic_debug/control
>
I was able to reproduce the crash with debugging for usbcore enabled:
https://gist.github.com/xairy/52bfb302d824eb55d39776eb156812cc
>> > On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+) with
>> > Alan's patch applied.
>> >
>> > gadgetfs: bound to dummy_udc driver
>> > gadgetfs: suspended from state 2
>> > INFO: rcu_sched detected stalls on CPUs/tasks:
>> > 1-...: (0 ticks this GP) idle=966/140000000000000/0 softirq=37706/37706 fqs=5250
>> > (detected by 2, t=21002 jiffies, g=26575, c=26574, q=183)
>> > Sending NMI from CPU 2 to CPUs 1:
>> > NMI backtrace for cpu 1
>> > CPU: 1 PID: 1394 Comm: kworker/1:2 Not tainted 4.12.0-rc4+ #24
>> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> > Workqueue: usb_hub_wq hub_event
>> > task: ffff88003ebfb640 task.stack: ffffc900024fc000
>> > RIP: 0010:rep_nop arch/x86/include/asm/processor.h:619 [inline]
>> > RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:624 [inline]
>> > RIP: 0010:virt_spin_lock arch/x86/include/asm/qspinlock.h:63 [inline]
>> > RIP: 0010:queued_spin_lock_slowpath+0x20/0x1a0 kernel/locking/qspinlock.c:421
>> > RSP: 0018:ffffc900024ff9f0 EFLAGS: 00000002
>> > RAX: 000000007d72e420 RBX: ffff88007d72e768 RCX: 0000000000010000
>> > RDX: 0000000000000001 RSI: 000000007d72e420 RDI: ffff88007d72e768
>> > RBP: ffffc900024ff9f0 R08: 0000000000000006 R09: 0000000000000020
>> > R10: ffffc900024ffa50 R11: 0000000000d52301 R12: ffff88007da9c298
>> > R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff82cfc5a0
>> > FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
>> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > CR2: 00007f728a487000 CR3: 00000000371e6000 CR4: 00000000000006e0
>> > Call Trace:
>> > __raw_spin_lock include/asm-generic/qspinlock.h:103 [inline]
>> > _raw_spin_lock+0x1c/0x20 kernel/locking/spinlock.c:151
>> > spin_lock include/linux/spinlock.h:299 [inline]
>> > gadgetfs_suspend+0x32/0x90 drivers/usb/gadget/legacy/inode.c:1684
>
> Looks like it's waiting for the spinlock in gadgetfs_suspend. Nothing
> else should be holding that lock.
>
> Were any other tasks stalled?
It seems that I get these stalls quite rarely and I can't reproduce them.
>
>> > set_link_state+0x39c/0x440 drivers/usb/gadget/udc/dummy_hcd.c:455
>> > dummy_hub_control+0x3e7/0x650 drivers/usb/gadget/udc/dummy_hcd.c:2074
>> > rh_call_control drivers/usb/core/hcd.c:689 [inline]
>> > rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
>> > usb_hcd_submit_urb+0x327/0xcf0 drivers/usb/core/hcd.c:1650
>> > usb_submit_urb+0x355/0x6f0 drivers/usb/core/urb.c:542
>> > usb_start_wait_urb+0x5f/0x110 drivers/usb/core/message.c:56
>> > usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
>> > usb_control_msg+0xd9/0x120 drivers/usb/core/message.c:151
>> > usb_clear_port_feature+0x46/0x60 drivers/usb/core/hub.c:412
>> > hub_port_disable+0x65/0x1d0 drivers/usb/core/hub.c:4177
>> > hub_port_init+0x10a/0xee0 drivers/usb/core/hub.c:4648
>> > hub_port_connect drivers/usb/core/hub.c:4826 [inline]
>> > hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
>> > port_event drivers/usb/core/hub.c:5105 [inline]
>> > hub_event+0xa0b/0x16e0 drivers/usb/core/hub.c:5189
>> > process_one_work+0x1fb/0x4c0 kernel/workqueue.c:2097
>> > process_scheduled_works kernel/workqueue.c:2157 [inline]
>> > worker_thread+0x2ab/0x4c0 kernel/workqueue.c:2233
>> > kthread+0x140/0x160 kernel/kthread.c:231
>> > ret_from_fork+0x25/0x30 arch/x86/entry/entry_64.S:424
>> > Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 ba
>> > 01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3 f3 90 <eb>
>> > ec 81 fe 00 01 00 00 0f 84 92 00 00 00 41 b8 01 01 00 00 b9
>
> Alan Stern
>
Powered by blists - more mailing lists