Date:	Thu, 06 Jan 2011 21:32:14 +0000
From:	Iain Paton <selsinork@...il.com>
To:	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: 2.6.37 vlans on bnx2 not functional, panic with tcpdump

Hi,

VLANs don't appear to be functional on my HP DL380 G6 with the onboard bnx2 adapter, using a vanilla 2.6.37 kernel. No tagged VLAN traffic is arriving at the VLAN interface.

To reproduce, boot a vanilla 2.6.37 kernel built with the attached config and run:

ip link add link eth0 name v406 type vlan id 406
ip link set up dev eth0
ip link set up dev v406
ip addr add 10.251.0.3/16 dev v406
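
For repeated testing, the setup steps above could be wrapped in a small script. This is only a sketch: the variable names, the `run` helper and the `DRY_RUN` switch are hypothetical conveniences, not part of the original report, and by default the script only prints the commands rather than executing them.

```shell
#!/bin/sh
# Sketch of the VLAN setup from this report, parameterised for reuse.
# PARENT, VLAN_ID, VLAN_IF and ADDR default to the values used above;
# DRY_RUN is a hypothetical switch added for this example.
PARENT=${PARENT:-eth0}
VLAN_ID=${VLAN_ID:-406}
VLAN_IF=${VLAN_IF:-v406}
ADDR=${ADDR:-10.251.0.3/16}
DRY_RUN=${DRY_RUN:-1}

run() {
    # Always print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

setup_vlan() {
    run ip link add link "$PARENT" name "$VLAN_IF" type vlan id "$VLAN_ID"
    run ip link set up dev "$PARENT"
    run ip link set up dev "$VLAN_IF"
    run ip addr add "$ADDR" dev "$VLAN_IF"
}

setup_vlan
```

Running it with `DRY_RUN=0` as root would actually apply the configuration; with the default it just lists the four commands.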

From another machine on the same VLAN, run a ping to 10.251.0.3; the ping returns destination host unreachable.

tcpdump -n -e -i v406 shows no traffic.

If I then run

tcpdump -n -e -i eth0

while the ping is still running, I get:

[  112.190114] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[  112.198912] IP: [<ffffffff813f5a34>] __skb_recv_datagram+0x124/0x2a0
[  112.214203] PGD 31fa05067 PUD 31fb51067 PMD 0
[  112.220207] Oops: 0002 [#1] SMP
[  112.228949] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:04.6/class
[  112.248201] CPU 0
[  112.251342] Modules linked in: 8021q garp stp llc
[  112.269199]
[  112.269692] Pid: 1370, comm: rpc.statd Not tainted 2.6.37-64 #1 /ProLiant DL380 G6
[  112.275164] RIP: 0010:[<ffffffff813f5a34>]  [<ffffffff813f5a34>] __skb_recv_datagram+0x124/0x2a0
[  112.293143] RSP: 0018:ffff88031fbd5a88  EFLAGS: 00010046
[  112.300238] RAX: 0000000000000246 RBX: 0000000000000000 RCX: ffff88019f91a8c0
[  112.319307] RDX: ffff88019f94b500 RSI: ffff88031fbd5b44 RDI: ffff88019f91a8d4
[  112.329271] RBP: ffff88031fbd5b28 R08: 0000000000000000 R09: 0000000000001000
[  112.339123] R10: 0000000000000000 R11: 0000000000000246 R12: ffff88031fbd5ac8
[  112.360207] R13: ffff88019f91a8c0 R14: ffff88031fbd5ae0 R15: ffff88019f91a8d4
[  112.363278] FS:  00007ff90cd77700(0000) GS:ffff8800d7200000(0000) knlGS:0000000000000000
[  112.366216] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  112.368311] CR2: 0000000000000008 CR3: 000000031fb1a000 CR4: 00000000000006f0
[  112.375050] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  112.379300] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  112.381913] Process rpc.statd (pid: 1370, threadinfo ffff88031fbd4000, task ffff88031fe00000)
[  112.387746] Stack:
[  112.388474]  ffff88031fbd5ae8 ffffffff8141eb50 7fffffffffffffff ffff88031fe00000
[  112.404676]  ffff88031fbd5bc4 ffff88031fbd5b44 000000001fbd5b88 ffff88031fe00000
[  112.411531]  ffff88019f91a800 ffff88019f8be600 000000000000055a ffffffff81b8d780
[  112.415367] Call Trace:
[  112.416223]  [<ffffffff8141eb50>] ? netlink_dump+0x1a0/0x200
[  112.418289]  [<ffffffff8141fa4d>] ? netlink_dump_start+0x18d/0x1b0
[  112.420439]  [<ffffffff813f5bcf>] skb_recv_datagram+0x1f/0x30
[  112.422686]  [<ffffffff8141eeec>] netlink_recvmsg+0x7c/0x440
[  112.424846]  [<ffffffff813f1ea2>] ? __kfree_skb+0x42/0xa0
[  112.444125]  [<ffffffff813e9cf8>] sock_recvmsg+0xf8/0x130
[  112.449889]  [<ffffffff814600bf>] ? inet_sendmsg+0x5f/0xb0
[  112.451876]  [<ffffffff813e9e7e>] ? sock_sendmsg+0xee/0x130
[  112.454213]  [<ffffffff810b6869>] ? __do_fault+0x3b9/0x4a0
[  112.456961]  [<ffffffff810a7b48>] ? lru_cache_add_lru+0x28/0x50
[  112.481554]  [<ffffffff813e8769>] ? might_fault+0x9/0x10
[  112.483393]  [<ffffffff813e9964>] ? move_addr_to_user+0x84/0xa0
[  112.485667]  [<ffffffff813ea04d>] __sys_recvmsg+0x13d/0x2b0
[  112.492094]  [<ffffffff8141ff2e>] ? netlink_table_ungrab+0x2e/0x30
[  112.512132]  [<ffffffff8141ffb9>] ? netlink_insert+0x89/0x160
[  112.514165]  [<ffffffff813eae40>] ? move_addr_to_kernel+0x50/0x60
[  112.531071]  [<ffffffff813eb6b4>] ? sys_sendto+0x104/0x140
[  112.541470]  [<ffffffff813e9964>] ? move_addr_to_user+0x84/0xa0
[  112.549085]  [<ffffffff813eb4c2>] ? sys_getsockname+0xa2/0xc0
[  112.569417]  [<ffffffff813ebe14>] sys_recvmsg+0x44/0x90
[  112.571711]  [<ffffffff81002552>] system_call_fastpath+0x16/0x1b
[  112.573894] Code: 00 00 00 e9 4f ff ff ff 0f 1f 80 00 00 00 00 ff 8b d0 00 00 00 48 8b 1a 48 8b 4a 08 48 c7 02 00 00 00 00 48 c7 42 08 00 00 00 00 <48> 89 4b 08 48 89 19 e9 7c ff ff ff 31 c0 87 87 64 01 00 00 f7
[  112.592492] RIP  [<ffffffff813f5a34>] __skb_recv_datagram+0x124/0x2a0
[  112.603649]  RSP <ffff88031fbd5a88>
[  112.607123] CR2: 0000000000000008
[  112.609064] ---[ end trace f6cbe3b43db03698 ]---
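
As a reading aid for the `Oops: 0002` line in the trace above, here is a minimal sketch that decodes the low three bits of an x86 page-fault error code (present, write, user). The helper name `decode_oops_code` is hypothetical, and the higher bits are deliberately ignored. It applies only to page-fault oopses, not to general protection faults.

```shell
#!/bin/sh
# Decode the low three bits of an x86 page-fault error code, given as
# the hex value printed after "Oops:". decode_oops_code is a
# hypothetical helper name used only for this illustration.
decode_oops_code() {
    code=$((0x$1))
    # Bit 0: 0 = page not present, 1 = protection violation.
    if [ $((code & 1)) -ne 0 ]; then
        present="protection violation"
    else
        present="page not present"
    fi
    # Bit 1: 0 = read access, 1 = write access.
    if [ $((code & 2)) -ne 0 ]; then
        access="write"
    else
        access="read"
    fi
    # Bit 2: 0 = fault raised in kernel mode, 1 = in user mode.
    if [ $((code & 4)) -ne 0 ]; then
        mode="user mode"
    else
        mode="kernel mode"
    fi
    echo "$present, $access, $mode"
}

decode_oops_code 0002
```

For this trace that reads as a kernel-mode write to an unmapped page, which fits the reported NULL pointer dereference at address 0000000000000008.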

The stack dump isn't always the same. Sometimes I'll see

[  236.078335] general protection fault: 0000 [#1] SMP

and the dump shows scsi/blk, xfs, cpu_idle, etc., so I don't know how relevant this particular dump is.

What's consistent is that running tcpdump against eth0 while there's tagged traffic arriving on eth0 will kill the kernel.

If I don't run tcpdump, the machine will stay up, but it's not much use if it can't use the network.

On 2.6.36 I needed the patch from http://patchwork.ozlabs.org/patch/69516/ to prevent a similar-looking immediate crash on boot. I don't have logs from that to compare with, and I know the VLAN code has changed quite a bit since then.

The same issue has been reproduced on two physically different servers, so it is hopefully not hardware related. The full boot log from this latest attempt is attached.

Iain


Download attachment "2.6.37-64-config.gz" of type "application/gzip" (20981 bytes)

View attachment "2.6.37-vlan.txt" of type "text/plain" (42728 bytes)
