Message-ID: <189be8e7-7126-06bf-67bf-53d56ea0723c@seti.kr.ua>
Date: Tue, 5 Feb 2019 22:21:47 +0200
From: Andrew <nitr0@...i.kr.ua>
To: Netdev <netdev@...r.kernel.org>
Subject: Re: Kernel panic in eth_header
On 05.02.2019 21:34, Florian Fainelli wrote:
> On 2/5/19 8:57 AM, Eric Dumazet wrote:
>>
>> On 02/05/2019 08:29 AM, Andrew wrote:
>>> Hi all.
>>>
>>> After upgrading a PPPoE BRAS to kernel 4.9.153, I got a kernel panic after 3 days of uptime.
>>>
>>> Unfortunately the kernel was compiled without debug info. I rebuilt it with debug info enabled (the function addresses are the same; I compared the vmlinux symbol maps), and it places the panic at net/ethernet/eth.c:88.
>>>
>>> The kernel panic trace is below. The igb driver is the vendor one, version 5.3.5.4. What extra info is needed?
>>>
>>> [263565.106441] BUG: unable to handle kernel paging request at ffff88015a4d2dd4
>>> [263565.113527] IP: [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>>> [263565.119030] PGD 1e8f067 [263565.121474] PUD 0
>>> [263565.123580]
>>> [263565.125166] Oops: 0002 [#1] SMP
>>> [263565.128398] Modules linked in: xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter xt_length xt_TCPMSS xt_tcpudp xt_mark xt_dscp iptable_mangle ip_tables x_tables nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat nf_conntrack sch_sfq sch_htb cls_u32 sch_ingress sch_prio sch_tbf cls_flow cls_fw act_police ifb 8021q mrp garp stp llc softdog pppoe pppox ppp_generic slhc i2c_nforce2 i2c_core igb(O) parport_pc dca parport thermal asus_atk0110 fan ptp k10temp hwmon pps_core nv_tco
>>> [263565.176083] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 4.9.153-x86_64 #1
>>> [263565.183996] Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010
>>> [263565.195289] task: ffff88007d0f5200 task.stack: ffffc9000006c000
>>> [263565.201295] RIP: 0010:[<ffffffff8158e48b>] [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>>> [263565.209225] RSP: 0018:ffff88007fa83c58 EFLAGS: 00010286
>>> [263565.214622] RAX: ffff88015a4d2dc8 RBX: 0000000000000008 RCX: ffff8800682434a0
>>> [263565.221843] RDX: ffff88015a4d2dc8 RSI: ffff88015a4d2dc8 RDI: ffff880077aab000
>>> [263565.229062] RBP: ffff88007b663d90 R08: ffff88007b663d90 R09: 0000000000000574
>>> [263565.236281] R10: ffff88007d1fa000 R11: 0000000000000000 R12: ffff8800682434a0
>>> [263565.243501] R13: ffff88007d1fa000 R14: 0000000000000574 R15: 0000000000000008
>>> [263565.250719] FS: 0000000000000000(0000) GS:ffff88007fa80000(0000) knlGS:0000000000000000
>>> [263565.258894] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [263565.264725] CR2: ffff88015a4d2dd4 CR3: 000000007ad73000 CR4: 00000000000006f0
>>> [263565.271944] Stack:
>>> [263565.274041] ffff880077aab000 ffff880068243400 ffff88007a745000 ffff8800682434a0
>>> [263565.281582] 0000000000000002 ffffffff81571d09 ffff880068243400 ffff88007fa83d00
>>> [263565.289121] ffff88007a745000 ffff880077aab000 ffff88007a712000 ffffffff815a8c61
>>> [263565.296661] Call Trace:
>>> [263565.299193] <IRQ> [263565.301205] [<ffffffff81571d09>] ? neigh_connected_output+0xa9/0x100
>>> [263565.307740] [<ffffffff815a8c61>] ? ip_finish_output2+0x221/0x400
>>> [263565.313920] [<ffffffff8159e144>] ? nf_iterate+0x54/0x60
>>> [263565.319319] [<ffffffff815ab2fa>] ? ip_output+0x6a/0xf0
>>> [263565.324631] [<ffffffff8159e102>] ? nf_iterate+0x12/0x60
>>> [263565.330030] [<ffffffff815aa6e0>] ? ip_fragment.constprop.5+0x80/0x80
>>> [263565.336556] [<ffffffff815a73b6>] ? ip_forward+0x396/0x480
>>> [263565.342128] [<ffffffff815a6fb0>] ? ip_check_defrag+0x1e0/0x1e0
>>> [263565.348134] [<ffffffff815a5a2e>] ? ip_rcv+0x2ae/0x370
>>> [263565.353361] [<ffffffffa0107c02>] ? pppoe_rcv_core+0xd2/0x160 [pppoe]
>>> [263565.359888] [<ffffffff815a5170>] ? ip_local_deliver_finish+0x1d0/0x1d0
>>> [263565.366586] [<ffffffff81562a57>] ? __netif_receive_skb_core+0x527/0xa80
>>> [263565.373373] [<ffffffff81567632>] ? process_backlog+0x92/0x130
>>> [263565.379291] [<ffffffff8156745d>] ? net_rx_action+0x24d/0x390
>>> [263565.385124] [<ffffffff81628374>] ? __do_softirq+0xf4/0x2a0
>>> [263565.390784] [<ffffffff8107136c>] ? irq_exit+0xbc/0xd0
>>> [263565.396008] [<ffffffff81626cd6>] ? call_function_single_interrupt+0x96/0xa0
>>> [263565.403141] <EOI> [263565.405153] [<ffffffff81623eb0>] ? __sched_text_end+0x2/0x2
>>> [263565.410907] [<ffffffff81624182>] ? native_safe_halt+0x2/0x10
>>> [263565.416741] [<ffffffff81623ec8>] ? default_idle+0x18/0xd0
>>> [263565.422314] [<ffffffff810a7a46>] ? cpu_startup_entry+0x126/0x220
>>> [263565.428492] [<ffffffff8104c261>] ? start_secondary+0x161/0x180
>>> [263565.434496] Code: 0e 00 00 00 53 89 d3 49 89 cc 4c 89 c5 45 89 ce e8 bb 8a fc ff 66 83 fb 01 48 89 c6 74 44 66 83 fb 04 74 3e 66 c1 c3 08 48 85 ed <66> 89 58 0c 74 40 8b 45 00 4d 85 e4 89 46 06 0f b7 45 04 66 89
>>> [263565.454534] RIP [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>>> [263565.460124] RSP <ffff88007fa83c58>
>>> [263565.463696] CR2: ffff88015a4d2dd4
>>> [263565.467104] ---[ end trace a1bcaf3618724adf ]---
>>> [263565.471807] Kernel panic - not syncing: Fatal exception in interrupt
>>> [263565.478245] Kernel Offset: disabled
>>> [263565.481818] Rebooting in 5 seconds..
>>>
>>
>> This is a well-known issue; a fix should come shortly in the stable branches.
> Is Peter or yourself doing the backport? David would only take care of
> the two most recent stable kernels.
>
> Sorry about missing that change as part of the fragment stack backport to
> 4.9...
I think the backport will be trivial; at least the patch applies cleanly
to 4.9 (only the offsets differ).
I'll test it.
Btw, are there any test conditions to quickly check whether the patch
helps? The crash reproduces at unpredictable intervals (tens of hours
under quite heavy load).
>> diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
>> index f8bbd693c19c247e41839c2d0b5318ca51b23ee8..d95b32af4a0e3f552405c9e61cc372729834160c 100644
>> --- a/net/ipv4/ip_fragment.c
>> +++ b/net/ipv4/ip_fragment.c
>> @@ -425,6 +425,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
>>  	 * fragment.
>>  	 */
>>  
>> +	err = -EINVAL;
>>  	/* Find out where to put this fragment. */
>>  	prev_tail = qp->q.fragments_tail;
>>  	if (!prev_tail)
>> @@ -501,7 +502,6 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
>>  
>>  discard_qp:
>>  	inet_frag_kill(&qp->q);
>> -	err = -EINVAL;
>>  	__IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS);
>>  err:
>>  	kfree_skb(skb);
>>
>>
>>
>
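A quick sanity check of the oops quoted above, as a rough sketch rather
than a definitive analysis. The register values and the error code are
copied from the trace. The helper name decode_page_fault is made up for
illustration, and the bit meanings assume the standard x86 page-fault
error code layout (bit 0 = page present, bit 1 = write, bit 2 = user
mode). The struct ethhdr layout assumed here is two 6-byte MAC addresses
followed by the 2-byte h_proto field.

```python
# Values copied from the oops trace in this thread.
CR2 = 0xffff88015a4d2dd4    # faulting address
RAX = 0xffff88015a4d2dc8    # base register of the faulting store
ERROR_CODE = 0x0002         # from the "Oops: 0002" line

# Assumed struct ethhdr layout: h_dest[6], h_source[6], then h_proto.
ETH_ALEN = 6
H_PROTO_OFFSET = 2 * ETH_ALEN


def decode_page_fault(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),  # 0 means the page was not mapped
        "write": bool(code & 0x2),         # 1 means the access was a write
        "user_mode": bool(code & 0x4),     # 0 means the fault hit in kernel mode
    }


offset = CR2 - RAX
fault = decode_page_fault(ERROR_CODE)

print("fault offset from RAX:", offset)
print("matches h_proto offset:", offset == H_PROTO_OFFSET)
print("decoded error code:", fault)
```

If the assumptions hold, the offset and the decoded bits are consistent
with eth_header() storing the protocol field through a header pointer
that points at unmapped memory. This does not show why the pointer is
bad, only where the write landed.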