Date:	Fri, 26 Jun 2009 10:13:59 -0700
From:	Dhananjay Phadke <dhananjay.phadke@...gic.com>
To:	Andrew Morton <akpm@...ux-foundation.org>,
	"bugme-daemon@...zilla.kernel.org" <bugme-daemon@...zilla.kernel.org>
CC:	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"bugzilla-daemon@...zilla.kernel.org" 
	<bugzilla-daemon@...zilla.kernel.org>,
	Amit Salecha <amit.salecha@...gic.com>,
	David Miller <davem@...emloft.net>
Subject: Re: [Bugme-new] [Bug 13617] New: GRO:__napi_complete from net_rx_action
 crash

Mea culpa. The driver can likely wait longer for rx to drain
so that we don't race with napi disable.

I do have a question for Dave, though. If the napi code is
forcing napi completion anyway, shouldn't it also flush
the GRO flows? This code predates GRO.
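
To make the question concrete, here is a rough user-space model of the two
behaviours. It is only a sketch, not kernel code: every model_* name and
struct below is made up for illustration, and the BUG_ON in __napi_complete
is modelled as a counter instead of a crash. model_forced_complete mirrors
the net_rx_action excerpt quoted below; model_forced_complete_flush is the
variant being asked about, which flushes the held GRO packets first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space model only: these are NOT the kernel's structures. */
struct model_skb {
    struct model_skb *next;
};

struct model_napi {
    struct model_skb *gro_list;  /* packets still held for GRO merging */
    bool scheduled;              /* stands in for NAPI_STATE_SCHED */
    bool disable_pending;        /* stands in for napi_disable_pending() */
    int delivered;               /* packets handed up the stack */
    int bug_hits;                /* times the modelled BUG_ON would fire */
};

/* Hold one more packet on the GRO list; false on allocation failure. */
static bool model_gro_hold(struct model_napi *n)
{
    struct model_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return false;
    skb->next = n->gro_list;
    n->gro_list = skb;
    return true;
}

/* Deliver everything held for GRO, leaving the list empty. */
static void model_gro_flush(struct model_napi *n)
{
    while (n->gro_list) {
        struct model_skb *skb = n->gro_list;

        n->gro_list = skb->next;
        free(skb);
        n->delivered++;
    }
    assert(n->gro_list == NULL);
}

/* Models __napi_complete(): expects the GRO list to be empty already. */
static void model_napi_complete_noflush(struct model_napi *n)
{
    if (n->gro_list)
        n->bug_hits++;  /* the real code has BUG_ON(n->gro_list) here */
    n->scheduled = false;
}

/* Forced completion as in the quoted net_rx_action excerpt. */
static void model_forced_complete(struct model_napi *n)
{
    if (n->disable_pending)
        model_napi_complete_noflush(n);
}

/* Variant asked about above: flush GRO before forcing completion. */
static void model_forced_complete_flush(struct model_napi *n)
{
    if (n->disable_pending) {
        model_gro_flush(n);
        model_napi_complete_noflush(n);
    }
}
```

With one packet held and disable pending, model_forced_complete bumps
bug_hits, while model_forced_complete_flush delivers the packet and
completes cleanly. Whether the core should do this, or the driver should
simply never get here with flows held, is the open question.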

-Dhananjay

Andrew Morton wrote:
> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> 
> netdev core crashed.  The netxen driver may be implicated.
> 
> 
> Why did amit@...xen.com create this bug report?  Isn't Dhananjay
> sitting in the next cube?  Perhaps you believe that the driver is OK
> and that the bug lies in the netdev core?
> 
> 
> 
> On Thu, 25 Jun 2009 06:55:14 GMT
> bugzilla-daemon@...zilla.kernel.org wrote:
> 
>> http://bugzilla.kernel.org/show_bug.cgi?id=13617
>>
>>            Summary: GRO:__napi_complete from net_rx_action crash
>>            Product: Drivers
>>            Version: 2.5
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: Network
>>         AssignedTo: drivers_network@...nel-bugs.osdl.org
>>         ReportedBy: amit@...xen.com
>>         Regression: No
>>
>>
>> In net_rx_action, there is a check: if napi_disable_pending is true,
>> __napi_complete is called.
>> __napi_complete contains BUG_ON(n->gro_list),
>> which was hit in the bug dump below.
>> Why is __napi_complete called from net_rx_action instead of napi_complete?
>> napi_complete flushes the gro list.
>>
>> Code excerpt from net_rx_action:
>> http://lxr.linux.no/linux+v2.6.30/net/core/dev.c#L2736
>>
>>    if (unlikely(work == weight)) {
>>            if (unlikely(napi_disable_pending(n)))
>>                    __napi_complete(n);
>>            else
>>                    list_move_tail(&n->poll_list, list);
>>    }
>>
>> ------------[ cut here ]------------
>> kernel BUG at net/core/dev.c:2672!
>> invalid opcode: 0000 [#1] SMP 
>> last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
>> CPU 2 
>> Modules linked in: netxen_nic nfs lockd nfs_acl auth_rpcgss ipv6 deflate
>> zlib_deflate ctr twofish twofish_common serpent blowfish des_generic cbc
>> aes_x86_64 aes_generic xcbc sha256_generic md5 crypto_null af_key autofs4
>> sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror
>> dm_region_hash dm_log dm_multipath dm_mod video output sbs sbshc pci_slot
>> battery acpi_memhotplug ac parport ipmi_devintf ide_cd_mod rtc_cmos bnx2 cdrom
>> serio_raw ipmi_si rtc_core button ipmi_msghandler iTCO_wdt rtc_lib shpchp hpilo
>> hpwdt i5000_edac pcspkr edac_core ata_piix libata sd_mod scsi_mod cciss ext3
>> jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
>> Pid: 0, comm: swapper Tainted: G        W  2.6.30 #1 ProLiant DL380 G5
>> RIP: 0010:[<ffffffff8043b128>]  [<ffffffff8043b128>] __napi_complete+0x15/0x25
>> RSP: 0018:ffff880028139eb0  EFLAGS: 00010086
>> RAX: ffff88023d4056b8 RBX: ffff88023d4056a8 RCX: 0000000002202318
>> RDX: 00000000001b0000 RSI: ffff880028139d98 RDI: ffff88023d4056a8
>> RBP: 0000000000000080 R08: 0000000002200000 R09: 000006de15931680
>> R10: ffffc20011a32318 R11: 0000000000000005 R12: 0000000000000000
>> R13: ffff8800281440e0 R14: 0000000000000080 R15: 000000000000012c
>> FS:  0000000000000000(0000) GS:ffff880028136000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>> CR2: 00000000008cb530 CR3: 000000023d9ab000 CR4: 00000000000006e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process swapper (pid: 0, threadinfo ffff88023ed28000, task ffff88023ed27570)
>> Stack:
>>  ffff88023d4056a8 ffffffff8043ec9f 0000000000000001 000000010004f429
>>  ffff88023d4056b8 0000000000000046 0000000000000001 0000000000000100
>>  ffffffff8069a098 0000000000000018 000000000000000a ffffffff8023eba6
>> Call Trace:
>>  <IRQ> <0> [<ffffffff8043ec9f>] ? net_rx_action+0xf0/0x162
>>  [<ffffffff8023eba6>] ? __do_softirq+0xa3/0x163
>>  [<ffffffff8020ca7c>] ? call_softirq+0x1c/0x28
>>  [<ffffffff8020dc1a>] ? do_softirq+0x2c/0x68
>>  [<ffffffff8023eac6>] ? irq_exit+0x3f/0x7c
>>  [<ffffffff8020d46b>] ? do_IRQ+0xa9/0xbf
>>  [<ffffffff8020c353>] ? ret_from_intr+0x0/0xa
>>  <EOI> <0> [<ffffffff80220e41>] ? hpet_legacy_next_event+0x0/0x7
>>  [<ffffffff80386e2c>] ? acpi_hw_register_read+0x52/0xe5
>>  [<ffffffff80394b2a>] ? acpi_idle_enter_simple+0x120/0x14e
>>  [<ffffffff80394b20>] ? acpi_idle_enter_simple+0x116/0x14e
>>  [<ffffffff8039486b>] ? acpi_idle_enter_bm+0xd5/0x274
>>  [<ffffffff8041c020>] ? cpuidle_idle_call+0x7f/0xbb
>>  [<ffffffff8020aaa5>] ? cpu_idle+0x4a/0x6d
>> Code: 48 8d 43 70 48 39 c2 0f 18 0e 75 df 31 c9 41 58 5b 5d 48 89 c8 c3
>> 53 f6 47 10 01 48 89 fb 75 04 0f 0b eb fe 48 83 7f 50 00 74 04 <0f> 0b
>> eb fe e8 1f 10 f1 ff f0 80 63 10 fe 5b c3 53 48 89 fb e8
>> RIP  [<ffffffff8043b128>] __napi_complete+0x15/0x25
>>  RSP <ffff880028139eb0>
>> ---[ end trace 9c6b22b26aefd1b1 ]---
>> Jun 23 23:41:27 dut4146 last message repeated 6 times
>> Jun 23 23:41:32 dut4146 kernel: BUG: scheduling while atomic:
>> swapper/0/0x10000100
>> Jun 23 23:41:32 dut4146 kernel: Modules linked in: netxen_nic nfs lockd
>> nfs_acl auth_rpcgss ipv6 deflate zlib_deflate ctr twofish twofish_common
>> serpent blowfish des_generic cbc aes_x86_64 aes_generic xcbc sha256_generic
>> md5 crypto_null af_key autofs4 sunrpc iscsi_tcp libiscsi_tcp libiscsi
>> scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_multipath dm_mod
>> video output sbs sbshc pci_slot battery acpi_memhotplug ac parport
>> ipmi_devintf ide_cd_mod rtc_cmos bnx2 cdrom serio_raw ipmi_si rtc_core
>> button ipmi_msghandler iTCO_wdt rtc_lib shpchp
>> Kernel panic - not syncing: Fatal exception in interrupt
>> Pid: 0, comm: swapper Tainted: G      D W  2.6.30 #1
>> Call Trace:
>>  <IRQ>  [<ffffffff8023a3b5>] ? panic+0x86/0x134
>>  [<ffffffff8020e348>] ? show_registers+0x211/0x21d
>>  [<ffffffff8024f5ea>] ? up+0xe/0x36
>>  [<ffffffff8023a9db>] ? release_console_sem+0x174/0x18e
>>  [<ffffffff804bdd54>] ? oops_end+0xa0/0xad
>>  [<ffffffff8020cf2c>] ? do_invalid_op+0x85/0x8f
>>  [<ffffffff8043b128>] ? __napi_complete+0x15/0x25
>>  [<ffffffffa03ebfe2>] ? netxen_nic_hw_write_wx_2M+0x24/0xa8 [netxen_nic]
>>  [<ffffffffa03ef866>] ? netxen_process_rcv_ring+0x4eb/0x501 [netxen_nic]
>>  [<ffffffff8020c715>] ? invalid_op+0x15/0x20
>>  [<ffffffff8043b128>] ? __napi_complete+0x15/0x25
>>  [<ffffffff8043ec9f>] ? net_rx_action+0xf0/0x162
>>  [<ffffffff8023eba6>] ? __do_softirq+0xa3/0x163
>>  [<ffffffff8020ca7c>] ? call_softirq+0x1c/0x28
>>  [<ffffffff8020dc1a>] ? do_softirq+0x2c/0x68
>>  [<ffffffff8023eac6>] ? irq_exit+0x3f/0x7c
>>  [<ffffffff8020d46b>] ? do_IRQ+0xa9/0xbf
>>  [<ffffffff8020c353>] ? ret_from_intr+0x0/0xa
>>  <EOI>  [<ffffffff80220e41>] ? hpet_legacy_next_event+0x0/0x7
>>  [<ffffffff80386e2c>] ? acpi_hw_register_read+0x52/0xe5
>>  [<ffffffff80394b2a>] ? acpi_idle_enter_simple+0x120/0x14e
>>  [<ffffffff80394b20>] ? acpi_idle_enter_simple+0x116/0x14e
>>  [<ffffffff8039486b>] ? acpi_idle_enter_bm+0xd5/0x274
>>  [<ffffffff8041c020>] ? cpuidle_idle_call+0x7f/0xbb
>>  [<ffffffff8020aaa5>] ? cpu_idle+0x4a/0x6d
>>
> 