Message-ID: <90051606-1883-7dc7-fe4f-3bb135e816ae@solarflare.com>
Date: Wed, 30 Jan 2019 17:33:14 +0000
From: Edward Cree <ecree@...arflare.com>
To: Ivan Babrou <ivan@...udflare.com>, <netdev@...r.kernel.org>
CC: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Ignat Korchagin <ignat@...udflare.com>,
Shawn Bohrer <sbohrer@...udflare.com>,
Jakub Sitnicki <jakub@...udflare.com>
Subject: Re: Crashes in skb clone/allocation in 4.19.18
On 30/01/19 16:51, Ivan Babrou wrote:
> Hey,
>
> We've upgraded some machines from 4.19.13 to 4.19.18 and some of them
> crashed with the following:
>
> [ 2313.192006] general protection fault: 0000 [#1] SMP PTI
> [ 2313.205924] CPU: 32 PID: 65437 Comm: nginx-fl Tainted: G
> O 4.19.18-cloudflare-2019.1.8 #2019.1.8
> [ 2313.224973] Hardware name: Quanta Computer Inc. QuantaPlex
> T41S-2U/S2S-MB, BIOS S2S_3B10.03 06/21/2018
> [ 2313.243400] RIP: 0010:kmem_cache_alloc_node+0x178/0x1f0
> [ 2313.257768] Code: 89 fa 4c 89 f6 e8 68 40 a1 00 4c 8b 55 00 58 4d
> 85 d2 75 d6 e9 6f ff ff ff 41 8b 59 20 48 8d 4a 01 4c 89 f8 49 8b 39
> 4c 01 fb <48> 33 1b 49 33 99 38 01 00 00 65 48 0f c7 0f 0f 94 c0 84 c0
> 0f 84
> [ 2313.295550] RSP: 0000:ffff94457f903b48 EFLAGS: 00010202
> [ 2313.310352] RAX: 08b82daf1f57da0e RBX: 08b82daf1f57da0e RCX: 00000000005ff72d
> [ 2313.327189] RDX: 00000000005ff72c RSI: 0000000000480220 RDI: 0000000000026e40
> [ 2313.344029] RBP: ffff94457f04d680 R08: ffff94457f926e40 R09: ffff94457f04d680
> [ 2313.360912] R10: 000004ce652a0026 R11: 0000000000000000 R12: 0000000000480220
> [ 2313.377857] R13: 00000000ffffffff R14: ffffffffb1ab3ab7 R15: 08b82daf1f57da0e
> [ 2313.394820] FS: 00007fdea755c780(0000) GS:ffff94457f900000(0000)
> knlGS:0000000000000000
> [ 2313.412887] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2313.428581] CR2: 000055acc3cf517b CR3: 000000201b1ea003 CR4: 00000000003606e0
> [ 2313.445753] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 2313.462843] perf: interrupt took too long (8028 > 7291), lowering
> kernel.perf_event_max_sample_rate to 24000
> [ 2313.462867] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 2313.500216] Call Trace:
> [ 2313.512833] <IRQ>
> [ 2313.524748] __alloc_skb+0x57/0x1d0
> [ 2313.537934] __tcp_send_ack.part.48+0x2f/0x100
> [ 2313.551845] tcp_rcv_established+0x550/0x640
> [ 2313.565394] tcp_v4_do_rcv+0x12a/0x1e0
> [ 2313.578322] tcp_v4_rcv+0xadc/0xbd0
> [ 2313.590993] ip_local_deliver_finish+0x5d/0x1d0
> [ 2313.604727] ip_local_deliver+0x6b/0xe0
> [ 2313.617782] ? ip_sublist_rcv+0x200/0x200
> [ 2313.630415] perf: interrupt took too long (10040 > 10035), lowering
> kernel.perf_event_max_sample_rate to 19000
> [ 2313.630948] ip_rcv+0x52/0xd0
> [ 2313.662850] ? ip_rcv_core.isra.22+0x2b0/0x2b0
> [ 2313.662857] __netif_receive_skb_one_core+0x52/0x70
> [ 2313.690860] netif_receive_skb_internal+0x34/0xe0
> [ 2313.690883] efx_rx_deliver+0x11a/0x180 [sfc]
> [ 2313.717780] ? __efx_rx_packet+0x1ef/0x730 [sfc]
> [ 2313.717786] ? __queue_work+0x103/0x3e0
> [ 2313.743118] ? efx_poll+0x35e/0x460 [sfc]
> [ 2313.743125] ? net_rx_action+0x138/0x360
> [ 2313.767356] ? __do_softirq+0xd8/0x2d2
> [ 2313.767362] ? irq_exit+0xb4/0xc0
> [ 2313.790680] ? do_IRQ+0x85/0xd0
> [ 2313.790688] ? common_interrupt+0xf/0xf
> [ 2313.790694] </IRQ>

Something odd is going on. As far as I can tell from this call trace
(which has some weirdness in it; any chance you could reproduce with
frame pointers or a lower build optimisation level?) you're in the
normal sfc receive path (under efx_process_channel(), although that's
one of the functions that hasn't made it into the stack trace), which
means you should have a channel->rx_list, and thus efx_rx_deliver()
should be putting the packet on that list rather than calling
netif_receive_skb().
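
For reference, the branch being described can be sketched roughly as below.
This is a userspace mock, not the driver source: the struct layouts are
cut-down stand-ins, and deliver() and mock_netif_receive_skb() are
hypothetical names invented for the illustration. It only shows that with
a non-NULL rx_list the skb should be queued, not passed up directly.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel types; only the fields used here. */
struct list_head { struct list_head *next, *prev; };
struct sk_buff { struct list_head list; };
struct efx_channel { struct list_head *rx_list; };

/* Counts how often the (mocked) direct delivery path was taken. */
static int direct_deliveries;

static void init_list(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Same semantics as the kernel helper: insert 'entry' before 'head'. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Hypothetical mock of the direct path; just records the call. */
static void mock_netif_receive_skb(struct sk_buff *skb)
{
	(void)skb;
	direct_deliveries++;
}

/* Sketch of the expected decision in the delivery function. */
static void deliver(struct efx_channel *channel, struct sk_buff *skb)
{
	assert(channel != NULL && skb != NULL);
	if (channel->rx_list != NULL)
		/* List present: queue the skb, it gets passed up later. */
		list_add_tail(&skb->list, channel->rx_list);
	else
		/* No list: pass it up immediately. */
		mock_netif_receive_skb(skb);
}
```

In the trace above, netif_receive_skb_internal() appears directly under
efx_rx_deliver(), which corresponds to the else branch of this sketch.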

I don't know how, or if, that could be related to the crash you're
getting, but it might be worth looking into.

(It can't be the whole story, as your other crash is on a mlx5e and
AFAIK they don't use list-RX yet. Though, confusingly, an entry for
ip_sublist_rcv still makes it into both stack traces.)

Maybe it's secondary damage from a wild pointer or other mm problem
letting memory get scribbled on.
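
One hedged observation in support of that: in both traces the faulting
instruction bytes (<48> 33 1b) decode as xor (%rbx),%rbx, and RBX holds
08b82daf1f57da0e in the first trace and f0382d8aebf1ae68 in the second.
Assuming 48-bit virtual addresses (4-level paging), neither value is a
canonical address, which would explain a general protection fault rather
than a page fault. A minimal check, with is_canonical_48() being a
hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumes 48-bit virtual addresses: an address is canonical only if
 * bits 63..47 are all zero or all one.
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 most significant bits */

	return top == 0 || top == 0x1ffff;
}
```

By this check both RBX values are non-canonical, while an ordinary kernel
pointer from the same dump, such as RBP ffff94457f04d680, is canonical.
That fits a corrupted freelist pointer, but does not say what corrupted it.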
-Ed
> [ 2313.823837] Modules linked in: tun xt_connlimit nf_conncount xt_bpf
> xt_hashlimit cls_flow cls_u32 sch_htb sch_fq md_mod dm_crypt
> algif_skcipher af_alg dm_mod dax ip6table_nat nf_nat_ipv6
> ip6table_mangle ip6table_security ip6table_raw xt_nat iptable_nat
> nf_nat_ipv4 nf_nat xt_TPROXY nf_tproxy_ipv6 nf_tproxy_ipv4 xt_connmark
> iptable_mangle xt_owner xt_CT xt_socket nf_socket_ipv4 nf_socket_ipv6
> iptable_raw ip6table_filter ip6_tables nfnetlink_log xt_NFLOG
> xt_tcpudp xt_comment xt_conntrack nf_conntrack nf_defrag_ipv6
> nf_defrag_ipv4 xt_mark xt_multiport xt_set iptable_filter bpfilter
> ip_set_hash_netport ip_set_hash_net ip_set_hash_ip ip_set nfnetlink
> 8021q garp mrp stp llc sb_edac x86_pkg_temp_thermal kvm_intel kvm
> irqbypass crc32_pclmul crc32c_intel pcbc aesni_intel aes_x86_64
> ipmi_ssif crypto_simd cryptd
> [ 2313.952153] sfc(O) glue_helper igb i2c_algo_bit ipmi_si mdio dca
> ipmi_devintf ipmi_msghandler efivarfs ip_tables x_tables
> [ 2313.952238] ---[ end trace 477d8e3081c605f6 ]---
>
> Some nodes also crashed in skb_clone, rather than __alloc_skb:
>
> [ 3810.686137] general protection fault: 0000 [#1] SMP PTI
> [ 3810.694579] CPU: 64 PID: 69338 Comm: nginx-fl Not tainted
> 4.19.18-cloudflare-2019.1.8 #2019.1.8
> [ 3810.706589] Hardware name: Quanta Cloud Technology Inc. QuantaPlex
> T42S-2U(LBG-4) ^S5SZ090028/T42S-2U MB (Lewisburg-4), BIOS 3A11.Q10
> 06/29/2018
> [ 3810.726475] RIP: 0010:kmem_cache_alloc+0x89/0x1c0
> [ 3810.734701] Code: 82 72 49 83 78 10 00 4d 8b 30 0f 84 0e 01 00 00
> 4d 85 f6 0f 84 05 01 00 00 41 8b 5f 20 48 8d 4a 01 4c 89 f0 49 8b 3f
> 4c 01 f3 <48> 33 1b 49 33 9f 38 01 00 00 65 48 0f c7 0f 0f 94 c0 84 c0
> 74 b2
> [ 3810.761088] RSP: 0000:ffff99723fe03730 EFLAGS: 00010282
> [ 3810.770132] RAX: f0382d8aebf1ae68 RBX: f0382d8aebf1ae68 RCX: 0000000001cb61cf
> [ 3810.781105] RDX: 0000000001cb61ce RSI: 0000000000480020 RDI: 0000000000027550
> [ 3810.792012] RBP: ffff99723f19d500 R08: ffff99723fe27550 R09: 00000000000005dc
> [ 3810.802820] R10: ffff9992227c0000 R11: 0000000000004000 R12: 0000000000480020
> [ 3810.813589] R13: ffffffff8dcb5f7d R14: f0382d8aebf1ae68 R15: ffff99723f19d500
> [ 3810.824382] FS: 00007f2a8863c780(0000) GS:ffff99723fe00000(0000)
> knlGS:0000000000000000
> [ 3810.836189] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3810.845662] CR2: 000055820762eecd CR3: 00000019eb850003 CR4: 00000000007606e0
> [ 3810.856567] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 3810.867600] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 3810.878554] PKRU: 55555554
> [ 3810.884787] Call Trace:
> [ 3810.890601] <IRQ>
> [ 3810.896116] skb_clone+0x4d/0xb0
> [ 3810.902712] dev_queue_xmit_nit+0xd9/0x260
> [ 3810.910181] dev_hard_start_xmit+0x69/0x1f0
> [ 3810.917784] __dev_queue_xmit+0x6f7/0x8a0
> [ 3810.925172] ? eth_header+0x26/0xc0
> [ 3810.932053] ip_finish_output2+0x193/0x400
> [ 3810.939670] ? ip_finish_output+0x139/0x270
> [ 3810.947241] ip_output+0x6c/0xe0
> [ 3810.953844] ? ip_append_data.part.51+0xc0/0xc0
> [ 3810.961802] __tcp_transmit_skb+0x511/0xaa0
> [ 3810.969420] __tcp_retransmit_skb+0x19c/0x7c0
> [ 3810.977209] ? tcp_current_mss+0x57/0xa0
> [ 3810.984493] tcp_retransmit_skb+0x12/0x80
> [ 3810.991894] tcp_xmit_retransmit_queue.part.50+0x147/0x240
> [ 3811.000754] tcp_ack+0x9c4/0x11b0
> [ 3811.007416] tcp_rcv_established+0x190/0x640
> [ 3811.015065] ? tcp_v4_inbound_md5_hash+0x69/0x160
> [ 3811.023106] tcp_v4_do_rcv+0x12a/0x1e0
> [ 3811.030190] tcp_v4_rcv+0xadc/0xbd0
> [ 3811.037009] ip_local_deliver_finish+0x5d/0x1d0
> [ 3811.044859] ip_local_deliver+0x6b/0xe0
> [ 3811.051999] ? ip_sublist_rcv+0x200/0x200
> [ 3811.059325] ip_rcv+0x52/0xd0
> [ 3811.065595] ? ip_rcv_core.isra.22+0x2b0/0x2b0
> [ 3811.073361] __netif_receive_skb_one_core+0x52/0x70
> [ 3811.081621] netif_receive_skb_internal+0x34/0xe0
> [ 3811.089652] napi_gro_receive+0xba/0xe0
> [ 3811.096969] mlx5e_handle_rx_cqe+0x1eb/0x530 [mlx5_core]
> [ 3811.105545] ? skb_release_head_state+0x5c/0xb0
> [ 3811.113447] mlx5e_poll_rx_cq+0xc8/0x910 [mlx5_core]
> [ 3811.121652] mlx5e_napi_poll+0xb1/0xc60 [mlx5_core]
> [ 3811.129574] net_rx_action+0x138/0x360
> [ 3811.136266] __do_softirq+0xd8/0x2d2
> [ 3811.142679] irq_exit+0xb4/0xc0
> [ 3811.148578] do_IRQ+0x85/0xd0
> [ 3811.154254] common_interrupt+0xf/0xf
> [ 3811.160585] </IRQ>
> [ 3811.165319] RIP: 0033:0x5581e1551ca0
> [ 3811.171546] Code: e8 10 41 ff 24 ee 81 7c ca 04 ff ff fe ff 0f 83
> 87 1c 00 00 8b 03 0f b6 cc 0f b6 e8 83 c3 04 c1 e8 10 41 ff 24 ee 48
> 8b 2c c2 <48> 89 2c ca 8b 03 0f b6 cc 0f b6 e8 83 c3 04 c1 e8 10 41 ff
> 24 ee
> [ 3811.195925] RSP: 002b:00007ffdd615ebc0 EFLAGS: 00000246 ORIG_RAX:
> ffffffffffffffde
> [ 3811.206319] RAX: 0000000000000000 RBX: 00000000406c9058 RCX: 000000000000000b
> [ 3811.216321] RDX: 000000004099cdc8 RSI: fffffffb40c07eb0 RDI: 000000004183d738
> [ 3811.226277] RBP: fffffff444c8c5c0 R08: 000000004099cdc8 R09: 00000000425ce3d8
> [ 3811.236340] R10: 0000000044c8c5c0 R11: 000000004139cbb0 R12: 0000000000000000
> [ 3811.246349] R13: 00005581ead6a9e0 R14: 000000004166afe8 R15: 00000000406c90f8
> [ 3811.256320] Modules linked in: tun xt_connlimit nf_conncount xt_bpf
> xt_hashlimit cls_flow cls_u32 sch_htb sch_fq md_mod dm_crypt
> algif_skcipher af_alg dm_mod dax ip6table_nat nf_nat_ipv6
> ip6table_mangle ip6table_security ip6table_raw ip6table_filter
> ip6_tables xt_nat iptable_nat nf_nat_ipv4 nf_nat xt_TPROXY
> nf_tproxy_ipv6 nf_tproxy_ipv4 xt_connmark iptable_mangle xt_owner
> xt_CT xt_socket nf_socket_ipv4 nf_socket_ipv6 iptable_raw
> nfnetlink_log xt_NFLOG xt_tcpudp xt_comment xt_conntrack nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 xt_mark xt_multiport xt_set
> iptable_filter bpfilter ip_set_hash_netport ip_set_hash_net
> ip_set_hash_ip ip_set nfnetlink 8021q garp mrp stp llc skx_edac
> x86_pkg_temp_thermal kvm_intel kvm irqbypass ipmi_ssif crc32_pclmul
> crc32c_intel pcbc aesni_intel aes_x86_64 crypto_simd mlx5_core
> [ 3811.351698] cryptd xhci_pci tpm_crb mlxfw glue_helper ioatdma
> devlink ipmi_si xhci_hcd dca ipmi_devintf ipmi_msghandler tpm_tis
> tpm_tis_core tpm efivarfs ip_tables x_tables
> [ 3811.375161] ---[ end trace 1a7795bb39a63cf7 ]---
>
> Is this known? Could it be related to this commit:
>
> * https://github.com/torvalds/linux/commit/598e57e029290be3e7f8f87ff908091a5a22ed2f
>
> Thanks!