lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 04 Sep 2012 12:36:35 +0200
From:	Bjørn Mork <bjorn@...k.no>
To:	Ming Lei <ming.lei@...onical.com>
Cc:	netdev@...r.kernel.org, linux-usb@...r.kernel.org,
	Oliver Neukum <oneukum@...e.de>
Subject: Re: [PATCH net] net: usbnet: fix softirq storm on suspend

Ming Lei <ming.lei@...onical.com> writes:
> On Tue, Sep 4, 2012 at 3:49 PM, Bjørn Mork <bjorn@...k.no> wrote:
>>
>> That's easy:
>>
>> - Take any usbnet device supporting remote wakeup (and of course with a
>>   minidriver supporting it as well),
>> - enable autosuspend,
>> - ip link set dev ethX/usbX/wwanX up
>>
>> And watch ksoftirqd/X use 100% of one of your CPU cores.
>>
>> I'd very much like to hear the details if you are unable to reproduce
>> the bug using this simple test.
>
> I have no such kind of devices, and can't reproduce the problem with
> a cdc-ether gadget built by one omap3 beagleboard.

I believe any Ericsson or Gobi modem would do, and most likely other USB
networking devices too.

I haven't explored the gadget.  Doesn't it support remote wakeup?  Well,
it doesn't really have to just for testing this.  You just have to fake
the remote wakeup support, either in the gadget or in the device
driver.  It doesn't matter whether it works or not.  The point is making
the driver suspend the USB device while the network device is running.

>>> Once usbnet_terminate_urbs and netif_device_attach() are
>>> completed, who will schedule tasklet to trigger rx_alloc_submit?
>>
>> That's a good question.  It sure would be nice if that never happened
>> without waking the device first.  But there are just too many call sites
>> for me to follow:
>
> To be honest, your patch is correct, and I am just curious about
> the reason why tasklet is scheduled after the device is suspended.
>
>>
>> bjorn@...i:/usr/local/src/git/linux$ grep tasklet_schedule drivers/net/usb/usbnet.c
>>                 tasklet_schedule(&dev->bh);
>>  * but tasklet_schedule() doesn't.  hope the failure is rare.
>>                         tasklet_schedule (&dev->bh);
>>         tasklet_schedule(&dev->bh);
>>                 tasklet_schedule(&dev->bh);
>>         tasklet_schedule (&dev->bh);
>>                         tasklet_schedule (&dev->bh);
>>                                 tasklet_schedule (&dev->bh);
>>         tasklet_schedule (&dev->bh);
>>                                 tasklet_schedule (&dev->bh);
>>                         tasklet_schedule (&dev->bh);
>
> Could you add some debug info inside the caller to see which one
> cause the tasklet to be scheduled?


Goode idea.  I replaced the tasklet_schedule() calls with

#define DBG_TASKLET(bh) { \
 pr_err("%s scheduling tasklet %p\n", __func__, bh); \
 tasklet_schedule(bh); \
}

and added a useless WARN_ON around the test:

		if (temp < RX_QLEN(dev)) {
			if (WARN_ON_ONCE(rx_alloc_submit(dev, GFP_ATOMIC) == -ENOLINK))
				return;


and got:

Sep  4 12:22:48 nemi kernel: [50270.964818] qmi_wwan 2-4:1.8: wdm1_resume
Sep  4 12:22:48 nemi kernel: [50270.964832] qmi_wwan 2-4:1.19: wdm2_resume
Sep  4 12:22:48 nemi kernel: [50270.964845] usbnet_open scheduling tasklet ffff88014a8529d0
Sep  4 12:22:48 nemi kernel: [50270.964857] qmi_wwan 2-4:1.8: qmi_wwan_manage_power() pmcount=0, on=1
Sep  4 12:22:48 nemi kernel: [50270.964921] qmi_wwan 2-4:1.8: wwan1: rxqlen 0 --> 10
Sep  4 12:22:48 nemi kernel: [50270.964926] usbnet_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:48 nemi kernel: [50270.964962] qmi_wwan 2-4:1.8: wwan1: rxqlen 10 --> 20
Sep  4 12:22:48 nemi kernel: [50270.964966] usbnet_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:48 nemi kernel: [50270.965016] qmi_wwan 2-4:1.8: wwan1: rxqlen 20 --> 30
Sep  4 12:22:48 nemi kernel: [50270.965020] usbnet_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:48 nemi kernel: [50270.965057] qmi_wwan 2-4:1.8: wwan1: rxqlen 30 --> 40
Sep  4 12:22:48 nemi kernel: [50270.965061] usbnet_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:48 nemi kernel: [50270.965097] qmi_wwan 2-4:1.8: wwan1: rxqlen 40 --> 50
Sep  4 12:22:48 nemi kernel: [50270.965101] usbnet_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:48 nemi kernel: [50270.965136] qmi_wwan 2-4:1.8: wwan1: rxqlen 50 --> 60
Sep  4 12:22:48 nemi kernel: [50270.972270] defer_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:48 nemi kernel: [50271.040290] defer_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:48 nemi kernel: [50271.240286] defer_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:49 nemi kernel: [50272.040536] defer_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:51 nemi kernel: [50274.040121] qmi_wwan 2-4:1.19: wdm2_suspend
Sep  4 12:22:51 nemi kernel: [50274.040234] qmi_wwan 2-4:1.8: wdm1_suspend
Sep  4 12:22:51 nemi kernel: [50274.040263] defer_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:51 nemi kernel: [50274.040599] ------------[ cut here ]------------
Sep  4 12:22:51 nemi kernel: [50274.040612] WARNING: at /usr/local/src/git/linux/drivers/net/usb/usbnet.c:1271 tasklet_action+0x76/0xc4()
Sep  4 12:22:51 nemi kernel: [50274.040617] Hardware name: 2776LEG
Sep  4 12:22:51 nemi kernel: [50274.040620] Modules linked in: qmi_wwan(O) usbnet(O) nfnetlink_log nfnetlink option nls_utf8 isofs usb_storage uas usbmon cdc_wdm(O) mii rfcomm bnep xt_multiport iptable_filter ip_tables cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_powersave binfmt_misc xt_hl ip6table_filter ip6_tables x_tables fuse nfsd nfs_acl nfs lockd fscache sunrpc 8021q garp stp llc tun ext2 loop snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm arc4 iwldvm mac80211 snd_page_alloc thinkpad_acpi nvram snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq coretemp kvm_intel mei btusb qcserial kvm usb_wwan usbserial bluetooth crc16 snd_timer snd_seq_device i915 microcode iwlwifi psmouse serio_raw snd evdev lpc_ich drm_kms_helper i2c_i801 mfd_core cfg80211 drm i2c_algo_bit i2c_core acpi_cpufreq wmi soundcore rfkill battery ac mperf processor button video ext3 mbcache jbd sha256_generic ablk_helper cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod nbd s
Sep  4 12:22:51 nemi kernel: g sd_mod crc_t10dif sr_mod cdrom thermal thermal_sys ahci libahci libata uhci_hcd ehci_hcd scsi_mod e1000e usbcore usb_common [last unloaded: usbnet]
Sep  4 12:22:51 nemi kernel: [50274.040857] Pid: 7758, comm: kworker/0:0 Tainted: G        W  O 3.6.0-rc4+ #32
Sep  4 12:22:51 nemi kernel: [50274.040861] Call Trace:
Sep  4 12:22:51 nemi kernel: [50274.040865]  <IRQ>  [<ffffffff8103e0ed>] ? warn_slowpath_common+0x78/0x8c
Sep  4 12:22:51 nemi kernel: [50274.040883]  [<ffffffff810457b2>] ? tasklet_action+0x76/0xc4
Sep  4 12:22:51 nemi kernel: [50274.040890]  [<ffffffff810451e5>] ? __do_softirq+0xd9/0x1d1
Sep  4 12:22:51 nemi kernel: [50274.040898]  [<ffffffff8109f761>] ? handle_irq_event_percpu+0x172/0x191
Sep  4 12:22:51 nemi kernel: [50274.040908]  [<ffffffff81371bfc>] ? call_softirq+0x1c/0x30
Sep  4 12:22:51 nemi kernel: [50274.040918]  [<ffffffff8100fa4b>] ? do_softirq+0x3f/0x79
Sep  4 12:22:51 nemi kernel: [50274.040924]  [<ffffffff81044f9e>] ? irq_exit+0x44/0xb1
Sep  4 12:22:51 nemi kernel: [50274.040931]  [<ffffffff8100f326>] ? do_IRQ+0x94/0xaa
Sep  4 12:22:51 nemi kernel: [50274.040941]  [<ffffffff8136b5ea>] ? common_interrupt+0x6a/0x6a
Sep  4 12:22:51 nemi kernel: [50274.040944]  <EOI>  [<ffffffff8126fa99>] ? dev_get_drvdata+0x15/0x19
Sep  4 12:22:51 nemi kernel: [50274.040969]  [<ffffffffa04ee4ca>] ? usb_wwan_suspend+0x70/0xd7 [usb_wwan]
Sep  4 12:22:51 nemi kernel: [50274.040983]  [<ffffffffa02aa5d0>] ? usb_serial_suspend+0x35/0x74 [usbserial]
Sep  4 12:22:51 nemi kernel: [50274.041010]  [<ffffffffa00d313b>] ? usb_suspend_both+0x8e/0x19c [usbcore]
Sep  4 12:22:51 nemi kernel: [50274.041035]  [<ffffffffa00d3271>] ? usb_runtime_suspend+0x28/0x4c [usbcore]
Sep  4 12:22:51 nemi kernel: [50274.041043]  [<ffffffff81277943>] ? __rpm_callback+0x33/0x57
Sep  4 12:22:51 nemi kernel: [50274.041050]  [<ffffffff81277ee1>] ? rpm_suspend+0x30e/0x53c
Sep  4 12:22:51 nemi kernel: [50274.041059]  [<ffffffff81278e1a>] ? pm_runtime_set_autosuspend_delay+0x4b/0x4b
Sep  4 12:22:51 nemi kernel: [50274.041066]  [<ffffffff81278e8e>] ? pm_runtime_work+0x74/0x94
Sep  4 12:22:51 nemi kernel: [50274.041075]  [<ffffffff81055d3d>] ? process_one_work+0x1ff/0x311
Sep  4 12:22:51 nemi kernel: [50274.041083]  [<ffffffff8105604a>] ? worker_thread+0x1fb/0x2fb
Sep  4 12:22:51 nemi kernel: [50274.041091]  [<ffffffff81055e4f>] ? process_one_work+0x311/0x311
Sep  4 12:22:51 nemi kernel: [50274.041099]  [<ffffffff81055e4f>] ? process_one_work+0x311/0x311
Sep  4 12:22:51 nemi kernel: [50274.041106]  [<ffffffff81059861>] ? kthread+0x81/0x89
Sep  4 12:22:51 nemi kernel: [50274.041115]  [<ffffffff81371b04>] ? kernel_thread_helper+0x4/0x10
Sep  4 12:22:51 nemi kernel: [50274.041123]  [<ffffffff810597e0>] ? kthread_freezable_should_stop+0x53/0x53
Sep  4 12:22:51 nemi kernel: [50274.041130]  [<ffffffff81371b00>] ? gs_change+0x13/0x13
Sep  4 12:22:51 nemi kernel: [50274.041135] ---[ end trace 8293423100f8e272 ]---
Sep  4 12:22:53 nemi kernel: [50276.048112] qmi_wwan 2-4:1.8: wwan1: Delaying transmission for resumption
Sep  4 12:22:53 nemi kernel: [50276.097673] qmi_wwan 2-4:1.8: wdm1_resume
Sep  4 12:22:53 nemi kernel: [50276.097743] usbnet_resume scheduling tasklet ffff88014a8529d0
Sep  4 12:22:53 nemi kernel: [50276.097754] qmi_wwan 2-4:1.19: wdm2_resume
Sep  4 12:22:53 nemi kernel: [50276.097789] defer_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:53 nemi kernel: [50276.097847] qmi_wwan 2-4:1.8: wwan1: rxqlen 10 --> 20
Sep  4 12:22:53 nemi kernel: [50276.097851] usbnet_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:53 nemi kernel: [50276.097890] qmi_wwan 2-4:1.8: wwan1: rxqlen 20 --> 30
Sep  4 12:22:53 nemi kernel: [50276.097895] usbnet_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:53 nemi kernel: [50276.097930] qmi_wwan 2-4:1.8: wwan1: rxqlen 30 --> 40
Sep  4 12:22:53 nemi kernel: [50276.097935] usbnet_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:53 nemi kernel: [50276.097970] qmi_wwan 2-4:1.8: wwan1: rxqlen 40 --> 50
Sep  4 12:22:53 nemi kernel: [50276.097974] usbnet_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:53 nemi kernel: [50276.098010] qmi_wwan 2-4:1.8: wwan1: rxqlen 50 --> 60
Sep  4 12:22:56 nemi kernel: [50278.824340] qmi_wwan 2-4:1.19: wdm2_suspend
Sep  4 12:22:56 nemi kernel: [50278.824441] defer_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:56 nemi kernel: [50278.824708] qmi_wwan 2-4:1.8: wdm1_suspend
Sep  4 12:22:56 nemi kernel: [50278.824833] defer_bh scheduling tasklet ffff88014a8529d0
Sep  4 12:22:56 nemi kernel: [50278.824916] ------------[ cut here ]------------
Sep  4 12:22:56 nemi kernel: [50278.824929] WARNING: at /usr/local/src/git/linux/drivers/net/usb/usbnet.c:1271 tasklet_action+0x76/0xc4()
Sep  4 12:22:56 nemi kernel: [50278.824933] Hardware name: 2776LEG
Sep  4 12:22:56 nemi kernel: [50278.824936] Modules linked in: qmi_wwan(O) usbnet(O) nfnetlink_log nfnetlink option nls_utf8 isofs usb_storage uas usbmon cdc_wdm(O) mii rfcomm bnep xt_multiport iptable_filter ip_tables cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_powersave binfmt_misc xt_hl ip6table_filter ip6_tables x_tables fuse nfsd nfs_acl nfs lockd fscache sunrpc 8021q garp stp llc tun ext2 loop snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm arc4 iwldvm mac80211 snd_page_alloc thinkpad_acpi nvram snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq coretemp kvm_intel mei btusb qcserial kvm usb_wwan usbserial bluetooth crc16 snd_timer snd_seq_device i915 microcode iwlwifi psmouse serio_raw snd evdev lpc_ich drm_kms_helper i2c_i801 mfd_core cfg80211 drm i2c_algo_bit i2c_core acpi_cpufreq wmi soundcore rfkill battery ac mperf processor button video ext3 mbcache jbd sha256_generic ablk_helper cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod nbd s
Sep  4 12:22:56 nemi kernel: g sd_mod crc_t10dif sr_mod cdrom thermal thermal_sys ahci libahci libata uhci_hcd ehci_hcd scsi_mod e1000e usbcore usb_common [last unloaded: usbnet]
Sep  4 12:22:56 nemi kernel: [50278.825189] Pid: 0, comm: swapper/0 Tainted: G        W  O 3.6.0-rc4+ #32
Sep  4 12:22:56 nemi kernel: [50278.825193] Call Trace:
Sep  4 12:22:56 nemi kernel: [50278.825197]  <IRQ>  [<ffffffff8103e0ed>] ? warn_slowpath_common+0x78/0x8c
Sep  4 12:22:56 nemi kernel: [50278.825215]  [<ffffffff810457b2>] ? tasklet_action+0x76/0xc4
Sep  4 12:22:56 nemi kernel: [50278.825222]  [<ffffffff810451e5>] ? __do_softirq+0xd9/0x1d1
Sep  4 12:22:56 nemi kernel: [50278.825231]  [<ffffffff8109f761>] ? handle_irq_event_percpu+0x172/0x191
Sep  4 12:22:56 nemi kernel: [50278.825241]  [<ffffffff81371bfc>] ? call_softirq+0x1c/0x30
Sep  4 12:22:56 nemi kernel: [50278.825250]  [<ffffffff8100fa4b>] ? do_softirq+0x3f/0x79
Sep  4 12:22:56 nemi kernel: [50278.825257]  [<ffffffff81044f9e>] ? irq_exit+0x44/0xb1
Sep  4 12:22:56 nemi kernel: [50278.825264]  [<ffffffff8100f326>] ? do_IRQ+0x94/0xaa
Sep  4 12:22:56 nemi kernel: [50278.825274]  [<ffffffff8136b5ea>] ? common_interrupt+0x6a/0x6a
Sep  4 12:22:56 nemi kernel: [50278.825277]  <EOI>  [<ffffffffa0179150>] ? acpi_idle_enter_bm+0x276/0x2ad [processor]
Sep  4 12:22:56 nemi kernel: [50278.825314]  [<ffffffffa017914c>] ? acpi_idle_enter_bm+0x272/0x2ad [processor]
Sep  4 12:22:56 nemi kernel: [50278.825325]  [<ffffffff8129046f>] ? cpuidle_enter_state+0xa/0x30
Sep  4 12:22:56 nemi kernel: [50278.825332]  [<ffffffff81290bba>] ? cpuidle_idle_call+0xd4/0x14e
Sep  4 12:22:56 nemi kernel: [50278.825340]  [<ffffffff81014c19>] ? cpu_idle+0xae/0xf7
Sep  4 12:22:56 nemi kernel: [50278.825349]  [<ffffffff816a9ccb>] ? start_kernel+0x3b8/0x3c3
Sep  4 12:22:56 nemi kernel: [50278.825357]  [<ffffffff816a974c>] ? kernel_init+0x1c9/0x1c9
Sep  4 12:22:56 nemi kernel: [50278.825365]  [<ffffffff816a93e1>] ? x86_64_start_kernel+0x102/0x10f
Sep  4 12:22:56 nemi kernel: [50278.825371] ---[ end trace 8293423100f8e273 ]---
Sep  4 12:22:56 nemi kernel: [50279.502537] qmi_wwan 2-4:1.8: qmi_wwan_manage_power() pmcount=1, on=0
Sep  4 12:22:57 nemi kernel: [50279.549214] qmi_wwan 2-4:1.8: wdm1_resume
Sep  4 12:22:57 nemi kernel: [50279.549229] qmi_wwan 2-4:1.19: wdm2_resume
Sep  4 12:22:59 nemi kernel: [50281.824317] qmi_wwan 2-4:1.19: wdm2_suspend
Sep  4 12:22:59 nemi kernel: [50281.824335] qmi_wwan 2-4:1.8: wdm1_suspend



So the code trigging this seems to be 


static enum skb_state defer_bh(struct usbnet *dev, struct sk_buff *skb,
		struct sk_buff_head *list, enum skb_state state)
{
	unsigned long		flags;
	enum skb_state 		old_state;
	struct skb_data *entry = (struct skb_data *) skb->cb;

	spin_lock_irqsave(&list->lock, flags);
	old_state = entry->state;
	entry->state = state;
	__skb_unlink(skb, list);
	spin_unlock(&list->lock);
	spin_lock(&dev->done.lock);
	__skb_queue_tail(&dev->done, skb);
	if (dev->done.qlen == 1)
		tasklet_schedule(&dev->bh);
	spin_unlock_irqrestore(&dev->done.lock, flags);
	return old_state;
}


Hmm, I should probably dump stack here as well.. Anyway, it's a start.
I certainly have no clue if this coode could or should be changed.  I
just don't understand all the finer details.  So I don't think I am
going to try refining this any further.  The old strategy worked before
3.5 and it works after my fix.


Bjørn
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ