Message-ID: <de4b28603ef7bcaea7aeca73dab80079@visp.net.lb>
Date: Fri, 17 Jul 2015 21:16:14 +0300
From: Denys Fedoryshchenko <nuclearcat@...learcat.com>
To: Dan Williams <dcbw@...hat.com>
Cc: Netdev <netdev@...r.kernel.org>, ebiederm@...ssion.com,
davem@...emloft.net, simon@...nz.org.uk, develop@...stov.de
Subject: Re: 4.1.0, kernel panic, pppoe_release
My knowledge of the kernel is probably not sufficient, but I will try a few
approaches.
One of them is to add the following to pppoe_unbind_sock_work:
pppox_unbind_sock(sk);
+/* Signal the death of the socket. */
+sk->sk_state = PPPOX_DEAD;
First I will wait to confirm that the workqueue patch is really what causes
the kernel panic (that needs a 24-hour testing cycle); then I will try this fix.
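
To make the idea concrete, here is a small userspace model of it. It is not
kernel code: every model_* name is made up for illustration, and the flag
values are only assumed to mirror the kernel's PPPOX_* states. It also assumes
the crash happens because the release path still sees a "connected" state
after the device pointer has already been cleared, which has not been
confirmed yet.

```c
#include <stddef.h>

/* Hypothetical flags, assumed to mirror the kernel's PPPOX_* state bits. */
enum {
	MODEL_CONNECTED = 1,
	MODEL_BOUND     = 2,
	MODEL_ZOMBIE    = 8,
	MODEL_DEAD      = 16,
};

/* Hypothetical stand-in for a network device with a reference count. */
struct model_dev {
	int refcnt;
};

/* Hypothetical stand-in for the PPPoE socket: a state plus a device pointer. */
struct model_sock {
	int state;
	struct model_dev *dev;
};

/*
 * Stand-in for the deferred PADT work. It is assumed to leave the socket
 * without a device. When mark_dead is non-zero it also applies the proposed
 * change and marks the socket dead.
 */
static void model_unbind_work(struct model_sock *sk, int mark_dead)
{
	sk->dev = NULL;
	if (mark_dead)
		sk->state = MODEL_DEAD;
}

/*
 * Stand-in for the release path. Returns 0 on success and -1 where the real
 * code would have dereferenced a NULL device pointer.
 */
static int model_release(struct model_sock *sk)
{
	if (sk->state & (MODEL_CONNECTED | MODEL_BOUND | MODEL_ZOMBIE)) {
		if (sk->dev == NULL)
			return -1;	/* the kernel would oops here */
		sk->dev->refcnt--;
		sk->dev = NULL;
	}
	return 0;
}
```

In this model, calling model_unbind_work(sk, 0) and then model_release(sk) on a
connected socket hits the NULL case, while model_unbind_work(sk, 1) makes
model_release() skip the device branch. Whether the real race looks like this
is exactly what the test run should show.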
On 2015-07-17 18:36, Dan Williams wrote:
> On Fri, 2015-07-17 at 12:24 +0300, Denys Fedoryshchenko wrote:
>> As I suspected, this kernel panic is caused by recent changes to pppoe.
>> The problem appears with accel-pppd (server) on loaded servers (2k
>> users and more).
>> It is most probably related to the change "pppoe: Use workqueue to die
>> properly when a PADT is received".
>> I will try to revert this and related patches.
>
> While I didn't write the patch, I'm the one that started the process
> that got it submitted... Could you review the patch quickly too to see
> if you can spot anything amiss with it, so that it could get fixed up?
> The original patch does fix a real problem so ideally we don't have to
> revert the whole thing upstream.
>
> Dan
>
>> On 2015-07-14 13:57, Denys Fedoryshchenko wrote:
>> > Here is the panic message from netconsole. Please let me know if any
>> > additional information is required.
>> >
>> > Jul 14 13:49:16 10.0.252.10 [76078.867822] BUG: unable to handle kernel
>> > Jul 14 13:49:16 10.0.252.10 NULL pointer dereference
>> > Jul 14 13:49:16 10.0.252.10 at 00000000000003f0
>> > Jul 14 13:49:16 10.0.252.10 [76078.868280] IP:
>> > Jul 14 13:49:16 10.0.252.10 [<ffffffffa011e12a>]
>> > pppoe_release+0x56/0x142 [pppoe]
>> > Jul 14 13:49:16 10.0.252.10 [76078.868541] PGD 336e4a067
>> > Jul 14 13:49:16 10.0.252.10 PUD 333f17067
>> > Jul 14 13:49:16 10.0.252.10 PMD 0
>> > Jul 14 13:49:16 10.0.252.10
>> > Jul 14 13:49:16 10.0.252.10 [76078.868918] Oops: 0000 [#1]
>> > Jul 14 13:49:16 10.0.252.10 SMP
>> > Jul 14 13:49:16 10.0.252.10
>> > Jul 14 13:49:16 10.0.252.10 [76078.869226] Modules linked in:
>> > Jul 14 13:49:16 10.0.252.10 netconsole
>> > Jul 14 13:49:16 10.0.252.10 configfs
>> > Jul 14 13:49:16 10.0.252.10 coretemp
>> > Jul 14 13:49:16 10.0.252.10 sch_fq
>> > Jul 14 13:49:16 10.0.252.10 cls_fw
>> > Jul 14 13:49:16 10.0.252.10 act_police
>> > Jul 14 13:49:16 10.0.252.10 cls_u32
>> > Jul 14 13:49:16 10.0.252.10 sch_ingress
>> > Jul 14 13:49:16 10.0.252.10 sch_sfq
>> > Jul 14 13:49:16 10.0.252.10 sch_htb
>> > Jul 14 13:49:16 10.0.252.10 pppoe
>> > Jul 14 13:49:16 10.0.252.10 pppox
>> > Jul 14 13:49:16 10.0.252.10 ppp_generic
>> > Jul 14 13:49:16 10.0.252.10 slhc
>> > Jul 14 13:49:16 10.0.252.10 nf_nat_pptp
>> > Jul 14 13:49:16 10.0.252.10 nf_nat_proto_gre
>> > Jul 14 13:49:16 10.0.252.10 nf_conntrack_pptp
>> > Jul 14 13:49:16 10.0.252.10 nf_conntrack_proto_gre
>> > Jul 14 13:49:16 10.0.252.10 tun
>> > Jul 14 13:49:16 10.0.252.10 xt_REDIRECT
>> > Jul 14 13:49:16 10.0.252.10 nf_nat_redirect
>> > Jul 14 13:49:16 10.0.252.10 xt_set
>> > Jul 14 13:49:16 10.0.252.10 xt_TCPMSS
>> > Jul 14 13:49:16 10.0.252.10 ipt_REJECT
>> > Jul 14 13:49:16 10.0.252.10 nf_reject_ipv4
>> > Jul 14 13:49:16 10.0.252.10 ts_bm
>> > Jul 14 13:49:16 10.0.252.10 xt_string
>> > Jul 14 13:49:16 10.0.252.10 xt_connmark
>> > Jul 14 13:49:16 10.0.252.10 xt_DSCP
>> > Jul 14 13:49:16 10.0.252.10 xt_mark
>> > Jul 14 13:49:16 10.0.252.10 xt_tcpudp
>> > Jul 14 13:49:16 10.0.252.10 iptable_mangle
>> > Jul 14 13:49:16 10.0.252.10 iptable_filter
>> > Jul 14 13:49:16 10.0.252.10 iptable_nat
>> > Jul 14 13:49:16 10.0.252.10 nf_conntrack_ipv4
>> > Jul 14 13:49:16 10.0.252.10 nf_defrag_ipv4
>> > Jul 14 13:49:16 10.0.252.10 nf_nat_ipv4
>> > Jul 14 13:49:16 10.0.252.10 nf_nat
>> > Jul 14 13:49:16 10.0.252.10 nf_conntrack
>> > Jul 14 13:49:16 10.0.252.10 ip_tables
>> > Jul 14 13:49:16 10.0.252.10 x_tables
>> > Jul 14 13:49:16 10.0.252.10 ip_set_hash_ip
>> > Jul 14 13:49:16 10.0.252.10 ip_set
>> > Jul 14 13:49:16 10.0.252.10 nfnetlink
>> > Jul 14 13:49:16 10.0.252.10 8021q
>> > Jul 14 13:49:16 10.0.252.10 garp
>> > Jul 14 13:49:16 10.0.252.10 mrp
>> > Jul 14 13:49:16 10.0.252.10 stp
>> > Jul 14 13:49:16 10.0.252.10 llc
>> > Jul 14 13:49:16 10.0.252.10 [last unloaded: netconsole]
>> > Jul 14 13:49:16 10.0.252.10
>> > Jul 14 13:49:16 10.0.252.10 [76078.873195] CPU: 3 PID: 2940 Comm:
>> > accel-pppd Not tainted 4.1.0-build-0074 #7
>> > Jul 14 13:49:16 10.0.252.10 [76078.873396] Hardware name: HP ProLiant
>> > DL320e Gen8 v2, BIOS P80 04/02/2015
>> > Jul 14 13:49:16 10.0.252.10 [76078.873598] task: ffff8800b1886ba0 ti:
>> > ffff8800b09f4000 task.ti: ffff8800b09f4000
>> > Jul 14 13:49:16 10.0.252.10 [76078.873929] RIP:
>> > 0010:[<ffffffffa011e12a>]
>> > Jul 14 13:49:16 10.0.252.10 [<ffffffffa011e12a>]
>> > pppoe_release+0x56/0x142 [pppoe]
>> > Jul 14 13:49:16 10.0.252.10 [76078.874317] RSP: 0018:ffff8800b09f7e28
>> > EFLAGS: 00010202
>> > Jul 14 13:49:16 10.0.252.10 [76078.874512] RAX: 0000000000000000 RBX:
>> > ffff88032a214400 RCX: 0000000000000000
>> > Jul 14 13:49:16 10.0.252.10 [76078.874709] RDX: 000000000000000d RSI:
>> > 00000000fffffe01 RDI: ffffffff8180d6da
>> > Jul 14 13:49:16 10.0.252.10 [76078.874906] RBP: ffff8800b09f7e68 R08:
>> > 0000000000000000 R09: 0000000000000000
>> > Jul 14 13:49:16 10.0.252.10 [76078.875102] R10: ffff88031ef6a110 R11:
>> > 0000000000000293 R12: ffff88030f8d8fc0
>> > Jul 14 13:49:16 10.0.252.10 [76078.875299] R13: ffff88030f8d8ff0 R14:
>> > ffff88033115ee40 R15: ffff8803394e4920
>> > Jul 14 13:49:16 10.0.252.10 [76078.875499] FS: 00007f79b602c700(0000)
>> > GS:ffff880347460000(0000) knlGS:0000000000000000
>> > Jul 14 13:49:16 10.0.252.10 [76078.875837] CS: 0010 DS: 0000 ES: 0000
>> > CR0: 0000000080050033
>> > Jul 14 13:49:16 10.0.252.10 [76078.876036] CR2: 00000000000003f0 CR3:
>> > 0000000335425000 CR4: 00000000001407e0
>> > Jul 14 13:49:16 10.0.252.10 [76078.876239] Stack:
>> > Jul 14 13:49:16 10.0.252.10 [76078.876434] ffff88033ac45c80
>> > Jul 14 13:49:16 10.0.252.10 0000000000000000
>> > Jul 14 13:49:16 10.0.252.10 0000000100000000
>> > Jul 14 13:49:16 10.0.252.10 ffff88030f8d8fc0
>> > Jul 14 13:49:16 10.0.252.10
>> > Jul 14 13:49:16 10.0.252.10 [76078.877001] ffffffffa0120260
>> > Jul 14 13:49:16 10.0.252.10 ffff88030f8d8ff0
>> > Jul 14 13:49:16 10.0.252.10 ffff88033115ee40
>> > Jul 14 13:49:16 10.0.252.10 ffff8803394e4920
>> > Jul 14 13:49:16 10.0.252.10
>> > Jul 14 13:49:16 10.0.252.10 [76078.877564] ffff8800b09f7e88
>> > Jul 14 13:49:16 10.0.252.10 ffffffff81809e2e
>> > Jul 14 13:49:16 10.0.252.10 ffff88031ef6a100
>> > Jul 14 13:49:16 10.0.252.10 0000000000000008
>> > Jul 14 13:49:16 10.0.252.10
>> > Jul 14 13:49:16 10.0.252.10 [76078.878128] Call Trace:
>> > Jul 14 13:49:16 10.0.252.10 [76078.878327] [<ffffffff81809e2e>]
>> > sock_release+0x1a/0x78
>> > Jul 14 13:49:16 10.0.252.10 [76078.878528] [<ffffffff81809e99>]
>> > sock_close+0xd/0x11
>> > Jul 14 13:49:16 10.0.252.10 [76078.878728] [<ffffffff81150395>]
>> > __fput+0xdf/0x193
>> > Jul 14 13:49:16 10.0.252.10 [76078.878926] [<ffffffff81150477>]
>> > ____fput+0x9/0xb
>> > Jul 14 13:49:16 10.0.252.10 [76078.879124] [<ffffffff810cfa95>]
>> > task_work_run+0x85/0x9c
>> > Jul 14 13:49:16 10.0.252.10 [76078.879326] [<ffffffff81002979>]
>> > do_notify_resume+0x40/0x4e
>> > Jul 14 13:49:16 10.0.252.10 [76078.879527] [<ffffffff818a4a0a>]
>> > int_signal+0x12/0x17
>> > Jul 14 13:49:16 10.0.252.10 [76078.879726] Code:
>> > Jul 14 13:49:16 10.0.252.10 48
>> > Jul 14 13:49:16 10.0.252.10 8b
>> > Jul 14 13:49:16 10.0.252.10 83
>> > Jul 14 13:49:16 10.0.252.10 e0
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 a8
>> > Jul 14 13:49:16 10.0.252.10 01
>> > Jul 14 13:49:16 10.0.252.10 74
>> > Jul 14 13:49:16 10.0.252.10 12
>> > Jul 14 13:49:16 10.0.252.10 48
>> > Jul 14 13:49:16 10.0.252.10 89
>> > Jul 14 13:49:16 10.0.252.10 df
>> > Jul 14 13:49:16 10.0.252.10 e8
>> > Jul 14 13:49:16 10.0.252.10 87
>> > Jul 14 13:49:16 10.0.252.10 f9
>> > Jul 14 13:49:16 10.0.252.10 6e
>> > Jul 14 13:49:16 10.0.252.10 e1
>> > Jul 14 13:49:16 10.0.252.10 b8
>> > Jul 14 13:49:16 10.0.252.10 f7
>> > Jul 14 13:49:16 10.0.252.10 ff
>> > Jul 14 13:49:16 10.0.252.10 ff
>> > Jul 14 13:49:16 10.0.252.10 ff
>> > Jul 14 13:49:16 10.0.252.10 e9
>> > Jul 14 13:49:16 10.0.252.10 eb
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 8a
>> > Jul 14 13:49:16 10.0.252.10 43
>> > Jul 14 13:49:16 10.0.252.10 12
>> > Jul 14 13:49:16 10.0.252.10 a8
>> > Jul 14 13:49:16 10.0.252.10 0b
>> > Jul 14 13:49:16 10.0.252.10 74
>> > Jul 14 13:49:16 10.0.252.10 1c
>> > Jul 14 13:49:16 10.0.252.10 48
>> > Jul 14 13:49:16 10.0.252.10 8b
>> > Jul 14 13:49:16 10.0.252.10 83
>> > Jul 14 13:49:16 10.0.252.10 b0
>> > Jul 14 13:49:16 10.0.252.10 02
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10
>> > Jul 14 13:49:16 10.0.252.10 8b
>> > Jul 14 13:49:16 10.0.252.10 80
>> > Jul 14 13:49:16 10.0.252.10 f0
>> > Jul 14 13:49:16 10.0.252.10 03
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 65
>> > Jul 14 13:49:16 10.0.252.10 ff
>> > Jul 14 13:49:16 10.0.252.10 08
>> > Jul 14 13:49:16 10.0.252.10 48
>> > Jul 14 13:49:16 10.0.252.10 c7
>> > Jul 14 13:49:16 10.0.252.10 83
>> > Jul 14 13:49:16 10.0.252.10 b0
>> > Jul 14 13:49:16 10.0.252.10 02
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10 00
>> > Jul 14 13:49:16 10.0.252.10
>> > Jul 14 13:49:16 10.0.252.10 [76078.883913] RIP
>> > Jul 14 13:49:16 10.0.252.10 [<ffffffffa011e12a>]
>> > pppoe_release+0x56/0x142 [pppoe]
>> > Jul 14 13:49:16 10.0.252.10 [76078.884171] RSP <ffff8800b09f7e28>
>> > Jul 14 13:49:16 10.0.252.10 [76078.884368] CR2: 00000000000003f0
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.867822] BUG: unable to
>> > handle kernel NULL pointer dereference at 00000000000003f0
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.868280] IP:
>> > [<ffffffffa011e12a>] pppoe_release+0x56/0x142 [pppoe]
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.868541] PGD 336e4a067 PUD
>> > 333f17067 PMD 0
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.868918] Oops: 0000 [#1] SMP
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.869226] Modules linked in:
>> > netconsole configfs coretemp sch_fq cls_fw act_police cls_u32
>> > sch_ingress sch_sfq sch_htb pppoe pppox ppp_generic slhc nf_nat_pptp
>> > nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre tun
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.873195] CPU: 3 PID: 2940
>> > Comm: accel-pppd Not tainted 4.1.0-build-0074 #7
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.873396] Hardware name: HP
>> > ProLiant DL320e Gen8 v2, BIOS P80 04/02/2015
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.873598] task:
>> > ffff8800b1886ba0 ti: ffff8800b09f4000 task.ti: ffff8800b09f4000
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.873929] RIP:
>> > 0010:[<ffffffffa011e12a>] [<ffffffffa011e12a>]
>> > pppoe_release+0x56/0x142 [pppoe]
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.874317] RSP:
>> > 0018:ffff8800b09f7e28 EFLAGS: 00010202
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.874512] RAX:
>> > 0000000000000000 RBX: ffff88032a214400 RCX: 0000000000000000
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.874709] RDX:
>> > 000000000000000d RSI: 00000000fffffe01 RDI: ffffffff8180d6da
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.874906] RBP:
>> > ffff8800b09f7e68 R08: 0000000000000000 R09: 0000000000000000
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.875102] R10:
>> > ffff88031ef6a110 R11: 0000000000000293 R12: ffff88030f8d8fc0
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.875299] R13:
>> > ffff88030f8d8ff0 R14: ffff88033115ee40 R15: ffff8803394e4920
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.875499] FS:
>> > 00007f79b602c700(0000) GS:ffff880347460000(0000)
>> > knlGS:0000000000000000
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.875837] CS: 0010 DS: 0000
>> > ES: 0000 CR0: 0000000080050033
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.876036] CR2:
>> > 00000000000003f0 CR3: 0000000335425000 CR4: 00000000001407e0
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.876239] Stack:
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.876434] ffff88033ac45c80
>> > 0000000000000000 0000000100000000 ffff88030f8d8fc0
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.877001] ffffffffa0120260
>> > ffff88030f8d8ff0 ffff88033115ee40 ffff8803394e4920
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.877564] ffff8800b09f7e88
>> > ffffffff81809e2e ffff88031ef6a100 0000000000000008
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.878128] Call Trace:
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.878327]
>> > [<ffffffff81809e2e>] sock_release+0x1a/0x78
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.878528]
>> > [<ffffffff81809e99>] sock_close+0xd/0x11
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.878728]
>> > [<ffffffff81150395>] __fput+0xdf/0x193
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.878926]
>> > [<ffffffff81150477>] ____fput+0x9/0xb
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.879124]
>> > [<ffffffff810cfa95>] task_work_run+0x85/0x9c
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.879326]
>> > [<ffffffff81002979>] do_notify_resume+0x40/0x4e
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.879527]
>> > [<ffffffff818a4a0a>] int_signal+0x12/0x17
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.879726] Code: 48 8b 83 e0
>> > 00 00 00 a8 01 74 12 48 89 df e8 87 f9 6e e1 b8 f7 ff ff ff e9 eb 00
>> > 00 00 8a 43 12 a8 0b 74 1c 48 8b 83 b0 02 00 00 <48> 8b 80 f0 03 00 00
>> > 65 ff 08 48 c7 83 b0 02 00 00 00 00 00 00
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.883913] RIP
>> > [<ffffffffa011e12a>] pppoe_release+0x56/0x142 [pppoe]
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.884171] RSP
>> > <ffff8800b09f7e28>
>> > Jul 14 10:49:16 10.0.252.10 kernel: [76078.884368] CR2:
>> > 00000000000003f0
>> > Jul 14 13:49:16 10.0.252.10 [76078.884972] ---[ end trace
>> > 7fa41f8b4758f1fa ]---
>> > Jul 14 10:49:16 10.0.252.10 accel-pppd: pppoe: discard PADR packet
>> > (incorrect AC-Cookie)
>> > Jul 14 10:49:17 10.0.252.10 kernel: [76078.884972] ---[ end trace
>> > 7fa41f8b4758f1fa ]---
>> > Jul 14 13:49:17 10.0.252.10 [76078.936849] Kernel panic - not syncing:
>> > Fatal exception
>> > Jul 14 13:49:17 10.0.252.10 [76078.937054] Kernel Offset: disabled