[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <d1e1a702-753f-ec83-ed71-7e3649b20757@suse.cz>
Date: Fri, 30 Mar 2018 15:17:16 +0200
From: Jiri Slaby <jslaby@...e.cz>
To: hayeswang@...ltek.com
Cc: USB list <linux-usb@...r.kernel.org>,
ML netdev <netdev@...r.kernel.org>,
"Rafael J. Wysocki" <rjw@...ysocki.net>,
Linux PM list <linux-pm@...r.kernel.org>
Subject: r8152 livelocks during pm_runtime_suspend
Hi,
I have seen r8152 from my docking station to kill my box several times
in the last few days. The notebook is new, so I don't know if this is a
regression.
I have the NIC connected all the time. And when I return to the notebook
after a while, the networking is dead. Looking at the stack traces, it
is clear, that r8152 was attempted to be autosuspended and waits in
napi_disable for NAPI_STATE_SCHED bit to be cleared:
[22001.018437] kworker/2:0 D 0 16267 2 0x80000000
[22001.018441] Workqueue: pm pm_runtime_work
[22001.018443] Call Trace:
[22001.018453] schedule+0x2f/0x90
[22001.018455] schedule_timeout+0x1ce/0x540
[22001.018474] msleep+0x29/0x30
[22001.018477] napi_disable+0x25/0x60
[22001.018483] rtl8152_suspend+0x20a/0x2d0 [r8152]
[22001.018493] usb_suspend_both+0x8d/0x200 [usbcore]
[22001.018510] usb_runtime_suspend+0x2a/0x70 [usbcore]
[22001.018514] __rpm_callback+0xbc/0x1f0
[22001.018519] rpm_callback+0x4f/0x70
[22001.018526] rpm_suspend+0x11d/0x6d0
[22001.018532] pm_runtime_work+0x73/0xb0
[22001.018535] process_one_work+0x269/0x6c0
[22001.018541] worker_thread+0x2b/0x3d0
[22001.018547] kthread+0x113/0x130
[22001.018556] ret_from_fork+0x24/0x50
The assembly:
> ffffffff81716730 <napi_disable>:
> ffffffff81716730: e8 eb b7 2e 00 callq ffffffff81a01f20 <__fentry__>
> ffffffff81716735: 55 push %rbp
> ffffffff81716736: 48 89 fd mov %rdi,%rbp
> ffffffff81716739: 53 push %rbx
> ffffffff8171673a: 48 8d 5f 10 lea 0x10(%rdi),%rbx
> ffffffff8171673e: f0 80 4f 10 04 lock orb $0x4,0x10(%rdi)
> ffffffff81716743: f0 0f ba 6f 10 00 lock btsl $0x0,0x10(%rdi)
> ffffffff81716749: 73 11 jae ffffffff8171675c <napi_disable+0x2c>
> ffffffff8171674b: bf 01 00 00 00 mov $0x1,%edi
> ffffffff81716750: e8 ab ac a0 ff callq ffffffff81121400 <msleep>
> ffffffff81716755: f0 0f ba 2b 00 lock btsl $0x0,(%rbx)
> ffffffff8171675a: 72 ef jb ffffffff8171674b <napi_disable+0x1b>
There are other tasks in D state, of course, like these, waiting for the
device to become pm-up:
[22001.018749] kworker/3:1 D 0 16798 2 0x80000000
[22001.018753] Workqueue: events rtl_work_func_t [r8152]
[22001.018755] Call Trace:
[22001.018767] schedule+0x2f/0x90
[22001.018769] rpm_resume+0xf9/0x860
[22001.018777] rpm_resume+0x592/0x860
[22001.018783] __pm_runtime_resume+0x3a/0x50
[22001.018789] usb_autopm_get_interface+0x1d/0x50 [usbcore]
[22001.018793] rtl_work_func_t+0x3e/0x405 [r8152]
[22001.018801] process_one_work+0x269/0x6c0
[22001.018807] worker_thread+0x2b/0x3d0
[22001.018813] kthread+0x113/0x130
[22001.018822] ret_from_fork+0x24/0x50
[22001.019713] tcpdump D 0 17119 4265 0x00000004
[22001.019716] Call Trace:
[22001.019728] schedule+0x2f/0x90
[22001.019730] rpm_resume+0xf9/0x860
[22001.019738] rpm_resume+0x592/0x860
[22001.019744] __pm_runtime_resume+0x3a/0x50
[22001.019750] usb_autopm_get_interface+0x1d/0x50 [usbcore]
[22001.019754] rtl8152_ioctl+0x30/0x140 [r8152]
[22001.019758] dev_ifsioc+0x115/0x3f0
[22001.019763] dev_ioctl+0x14b/0x680
[22001.019775] sock_do_ioctl+0x41/0x50
[22001.019778] sock_ioctl+0x1c2/0x2f0
[22001.019781] do_vfs_ioctl+0x91/0x680
[22001.019789] SyS_ioctl+0x74/0x80
[22001.019794] do_syscall_64+0x76/0x1c0
...
> Showing all locks held in the system:
> 1 lock held by in:imklog/1371:
> #0: (&f->f_pos_lock){+.+.}, at: [<00000000a0b38807>] __fdget_pos+0x3f/0x50
> 1 lock held by Qt bearer threa/3003:
> #0: (rtnl_mutex){+.+.}, at: [<0000000021e0bca0>] __netlink_dump_start+0x4c/0x1b0
> 1 lock held by Qt bearer threa/2825:
> #0: (rtnl_mutex){+.+.}, at: [<0000000021e0bca0>] __netlink_dump_start+0x4c/0x1b0
> 1 lock held by DNS Res~ver #40/17041:
> #0: (rtnl_mutex){+.+.}, at: [<0000000021e0bca0>] __netlink_dump_start+0x4c/0x1b0
> 1 lock held by Qt bearer threa/3110:
> #0: (rtnl_mutex){+.+.}, at: [<0000000021e0bca0>] __netlink_dump_start+0x4c/0x1b0
> 1 lock held by DNS Res~ver #16/17044:
> #0: (rtnl_mutex){+.+.}, at: [<0000000021e0bca0>] __netlink_dump_start+0x4c/0x1b0
> 2 locks held by bash/4561:
> #0: (&tty->ldisc_sem){++++}, at: [<00000000e3d76e61>] tty_ldisc_ref_wait+0x24/0x50
> #1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000091462d05>] n_tty_read+0xc3/0x850
> 3 locks held by kworker/2:0/16267:
> #0: ((wq_completion)"pm"){+.+.}, at: [<00000000b9dc0832>] process_one_work+0x1e3/0x6c0
> #1: ((work_completion)(&dev->power.work)){+.+.}, at: [<00000000b9dc0832>] process_one_work+0x1e3/0x6c0
> #2: (&tp->control){+.+.}, at: [<00000000ca575b90>] rtl8152_suspend+0x2b/0x2d0 [r8152]
> 2 locks held by kworker/3:1/16798:
> #0: ((wq_completion)"events"){+.+.}, at: [<00000000b9dc0832>] process_one_work+0x1e3/0x6c0
> #1: ((work_completion)(&(&tp->schedule)->work)){+.+.}, at: [<00000000b9dc0832>] process_one_work+0x1e3/0x6c0
> 1 lock held by tcpdump/17119:
> #0: (rtnl_mutex){+.+.}, at: [<0000000023a6461d>] dev_ioctl+0x13d/0x680
> 2 locks held by less/17187:
> #0: (&tty->ldisc_sem){++++}, at: [<00000000e3d76e61>] tty_ldisc_ref_wait+0x24/0x50
> #1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000091462d05>] n_tty_read+0xc3/0x850
For now, I disabled pm-runtime on the device by:
echo on > /sys/bus/usb/devices/4-1.2/power/control
Any ideas what's wrong? napi_disable from runtime suspend? Double
napi_disable on the path? Some missing pm_runtime_get_sync somewhere?
Full dmesg after sysrq-t:
https://www.fi.muni.cz/~xslaby/sklad/panics/r8152.txt
thanks,
--
js
suse labs
Powered by blists - more mailing lists