Message-ID: <55E7907D.9000606@iogearbox.net>
Date:	Thu, 03 Sep 2015 02:12:45 +0200
From:	Daniel Borkmann <daniel@...earbox.net>
To:	Shaun Crampton <Shaun.Crampton@...aswitch.com>
CC:	Eric Dumazet <eric.dumazet@...il.com>,
	Michael Marineau <michael.marineau@...eos.com>,
	Chuck Ebbert <cebbert.lkml@...il.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Peter White <Peter.White@...aswitch.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: ip_rcv_finish() NULL pointer and possibly related Oopses

On 09/02/2015 06:39 PM, Shaun Crampton wrote:
>> Make sure you backported commit
>> 10e2eb878f3ca07ac2f05fa5ca5e6c4c9174a27a
>> ("udp: fix dst races with multicast early demux")
>
> I just tried the latest CoreOS alpha, which had that patch.  Sadly, I saw
> just as many reboots.  Here's a sample of the different types of Oopses I
> see (I've put the rest up in a gist:
> https://gist.github.com/fasaxc/d801ced5608f2657abd8):
>
> [ 4024.564479] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [ 4024.565452] IP: [<          (null)>]           (null)
> [ 4024.565452] PGD 2297067 PUD 2296067 PMD 0
> [ 4024.565452] Oops: 0010 [#1] SMP
> [ 4024.565452] Modules linked in: xt_mac xt_mark veth ip_set_hash_net
> nf_conntrack_ipv6 nf_defrag_ipv6 xt_comment xt_set ip_set_hash_ip ip_set
> nfnetlink ipip tunnel4 ip_tunnel ip6table_filter ip6_tables xt_conntrack
> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4
> nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter br_netfilter nf_nat
> nf_conntrack bridge stp llc overlay nls_ascii nls_cp437 vfat fat ext4
> crc16 mbcache jbd2 sd_mod crc32c_intel virtio_scsi scsi_mod aesni_intel
> virtio_net mousedev aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd
> microcode firmware_class virtio_pci virtio_ring psmouse virtio i2c_piix4
> i2c_core acpi_cpufreq button evdev sch_fq_codel ip_tables autofs4
> [ 4024.565452] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.6-coreos-r1 #2
> [ 4024.565452] Hardware name: Google Google, BIOS Google 01/01/2011
> [ 4024.565452] task: ffffffff81a154c0 ti: ffffffff81a00000 task.ti: ffffffff81a00000
> [ 4024.565452] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
> [ 4024.565452] RSP: 0018:ffff88021fc03c00  EFLAGS: 00010246
> [ 4024.565452] RAX: ffff880003375d00 RBX: ffff880003375d00 RCX: 0000000000000001
> [ 4024.565452] RDX: ffff88000306c000 RSI: 0000000000000000 RDI: ffff880003375d00
> [ 4024.565452] RBP: ffff88021fc03c28 R08: 0000000000005608 R09: 000000000000bb84
> [ 4024.565452] R10: 0000000000000003 R11: ffff880215a30dc0 R12: ffff880214bfb000
> [ 4024.565452] R13: ffff88000306c000 R14: ffff88000306c000 R15: 0000000000000008
> [ 4024.565452] FS:  0000000000000000(0000) GS:ffff88021fc00000(0000) knlGS:0000000000000000
> [ 4024.565452] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 4024.565452] CR2: 0000000000000000 CR3: 0000000001d92000 CR4: 00000000001406f0
> [ 4024.600761] Stack:
> [ 4024.601081]  ffffffff814ac9dc ffff880000000002 ffff88000306c000 ffff880003375d00
> [ 4024.601081]  ffff88008cbba84e ffff88021fc03c58 ffffffff81486628 ffff88021690a000
> [ 4024.601081]  ffff88008cbba84e ffff880003375d00 ffff88000306c000 ffff88021fc03cb8
> [ 4024.601081] Call Trace:
> [ 4024.601081]  <IRQ>
> [ 4024.601081]  [<ffffffff814ac9dc>] ? tcp_v4_early_demux+0x11c/0x160
> [ 4024.601081]  [<ffffffff81486628>] ip_rcv_finish+0xb8/0x360
> [ 4024.601081]  [<ffffffff81486f84>] ip_rcv+0x2a4/0x400
> [ 4024.601081]  [<ffffffff81486570>] ? inet_del_offload+0x40/0x40
> [ 4024.601081]  [<ffffffff81449053>] __netif_receive_skb_core+0x6c3/0x9a0
> [ 4024.601081]  [<ffffffff8143b507>] ? build_skb+0x17/0x90
> [ 4024.601081]  [<ffffffff81449348>] __netif_receive_skb+0x18/0x60
> [ 4024.601081]  [<ffffffff814493c3>] netif_receive_skb_internal+0x33/0xa0
> [ 4024.601081]  [<ffffffff8144944c>] netif_receive_skb_sk+0x1c/0x70
> [ 4024.601081]  [<ffffffffa008772b>] 0xffffffffa008772b
> [ 4024.601081]  [<ffffffff81096cb0>] ? check_preempt_curr+0x80/0xa0
> [ 4024.601081]  [<ffffffffa0087d81>] 0xffffffffa0087d81

Looking at this one, I am still puzzled where 0xffffffffa008772b and
0xffffffffa0087d81 come from ... some driver, bridge ...? Also the call
to inet_del_offload() seems a bit odd. Even in 4.1, there's only one (buggy)
instance that calls inet_del_offload(), which is ipv6_exthdrs_offload_init(),
but IPPROTO_ROUTING shouldn't have much of an effect on the v4 table as
far as I can see. Maybe that address is rather a false positive, hmm? Perhaps
some callback/infrastructure vanished underneath us, as ip/rip are both NULL
... maybe that is also why 0xffffffffa008772b / 0xffffffffa0087d81 don't
resolve?
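
One way to probe the "false positive" idea is to redo the symbol arithmetic
on the call-trace entries. What follows is only a minimal sketch in Python
under stated assumptions: the helper names (parse_trace, aliases, unresolved)
are made up for illustration, and MODULES_VADDR assumes an x86_64 kernel
without KASLR. An entry printed as sym+0xOFF/0xSIZE with OFF equal to SIZE
points one byte past the end of sym, which is normally where the next
function begins.

```python
import re

# Assumption: x86_64 without KASLR, where the module area conventionally
# starts at this address. Other configurations use a different value.
MODULES_VADDR = 0xFFFFFFFFA0000000

# Matches entries such as
#   [<ffffffff81486628>] ip_rcv_finish+0xb8/0x360
#   [<ffffffff81486570>] ? inet_del_offload+0x40/0x40
#   [<ffffffffa008772b>] 0xffffffffa008772b
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<guess>\?\s+)?"
    r"(?:(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"|0x[0-9a-f]{16})"
)


def parse_trace(lines):
    """Return one dict per call-trace entry found in `lines`."""
    entries = []
    for line in lines:
        m = TRACE_RE.search(line)
        if m is None:
            continue
        addr = int(m.group("addr"), 16)
        entry = {
            "addr": addr,
            "sym": m.group("sym"),                   # None if unresolved
            "guess": m.group("guess") is not None,   # '?' = unreliable entry
            "start": None,
            "past_end": False,
        }
        if entry["sym"] is not None:
            off = int(m.group("off"), 16)
            size = int(m.group("size"), 16)
            entry["start"] = addr - off              # start of the function
            entry["past_end"] = off >= size          # address lies outside sym
        entries.append(entry)
    return entries


def aliases(entries):
    """Map each past-the-end symbol to the traced function starting there."""
    starts = {e["start"]: e["sym"]
              for e in entries if e["sym"] and not e["past_end"]}
    return {e["sym"]: starts[e["addr"]]
            for e in entries if e["past_end"] and e["addr"] in starts}


def unresolved(entries):
    """Addresses that the kernel printed without a symbol name."""
    return [e["addr"] for e in entries if e["sym"] is None]


SAMPLE = [
    "[<ffffffff814ac9dc>] ? tcp_v4_early_demux+0x11c/0x160",
    "[<ffffffff81486628>] ip_rcv_finish+0xb8/0x360",
    "[<ffffffff81486f84>] ip_rcv+0x2a4/0x400",
    "[<ffffffff81486570>] ? inet_del_offload+0x40/0x40",
    "[<ffffffff81449053>] __netif_receive_skb_core+0x6c3/0x9a0",
    "[<ffffffff8144944c>] netif_receive_skb_sk+0x1c/0x70",
    "[<ffffffffa008772b>] 0xffffffffa008772b",
    "[<ffffffff81096cb0>] ? check_preempt_curr+0x80/0xa0",
    "[<ffffffffa0087d81>] 0xffffffffa0087d81",
]

if __name__ == "__main__":
    parsed = parse_trace(SAMPLE)
    print(aliases(parsed))
    for a in unresolved(parsed):
        print(hex(a), "module range" if a >= MODULES_VADDR else "core kernel")
```

On these entries the address shown as inet_del_offload+0x40/0x40 equals
0xffffffff81486628 - 0xb8, i.e. the start of ip_rcv_finish. That would be
consistent with a function pointer left on the stack rather than an actual
call into inet_del_offload(), though the trace alone does not prove it. The
two unresolved addresses sit at or above 0xffffffffa0000000, so under the
non-KASLR assumption they may well be text addresses of a loaded module.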

> [ 4024.601081]  [<ffffffff81449819>] net_rx_action+0x159/0x340
> [ 4024.601081]  [<ffffffff810715f4>] __do_softirq+0xf4/0x290
> [ 4024.601081]  [<ffffffff810719fd>] irq_exit+0xad/0xc0
> [ 4024.601081]  [<ffffffff815527fa>] do_IRQ+0x5a/0xf0
> [ 4024.601081]  [<ffffffff815506ae>] common_interrupt+0x6e/0x6e
> [ 4024.601081]  <EOI>
> [ 4024.601081]  [<ffffffff81059bd6>] ? native_safe_halt+0x6/0x10
> [ 4024.601081]  [<ffffffff8101f17e>] default_idle+0x1e/0xc0
> [ 4024.601081]  [<ffffffff8101fc5f>] arch_cpu_idle+0xf/0x20
> [ 4024.601081]  [<ffffffff810b0ab4>] cpu_startup_entry+0x314/0x3e0
> [ 4024.601081]  [<ffffffff8153bbec>] rest_init+0x7c/0x80
> [ 4024.601081]  [<ffffffff81b130e0>] start_kernel+0x483/0x490
> [ 4024.601081]  [<ffffffff81b12a4d>] ? set_init_arg+0x55/0x55
> [ 4024.601081]  [<ffffffff81b12120>] ? early_idt_handler_array+0x120/0x120
> [ 4024.601081]  [<ffffffff81b125ee>] x86_64_start_reservations+0x2a/0x2c
> [ 4024.601081]  [<ffffffff81b12728>] x86_64_start_kernel+0x138/0x147
> [ 4024.601081] Code:  Bad RIP value.
> [ 4024.601081] RIP  [<          (null)>]           (null)
> [ 4024.601081]  RSP <ffff88021fc03c00>
> [ 4024.601081] CR2: 0000000000000000
> [ 4024.601081] ---[ end trace cdabfe9d7380aaab ]---
> [ 4024.601081] Kernel panic - not syncing: Fatal exception in interrupt
> [ 4024.601081] Kernel Offset: disabled
> [ 4024.601081] Rebooting in 60 seconds..
> [ 4024.601081] ACPI MEMORY or I/O RESET_REG.
