netdev - Re: Fw: [Bug 86851] New: Reproducible panic on heavy UDP traffic


Message-ID: <544D8386.9030609@redhat.com>
Date:	Mon, 27 Oct 2014 00:28:06 +0100
From:	Nikolay Aleksandrov <nikolay@...hat.com>
To:	Florian Westphal <fw@...len.de>,
	Stephen Hemminger <stephen@...workplumber.org>
CC:	netdev@...r.kernel.org
Subject: Re: Fw: [Bug 86851] New: Reproducible panic on heavy UDP traffic

On 10/25/2014 11:44 PM, Florian Westphal wrote:
> Stephen Hemminger <stephen@...workplumber.org> wrote:
> 
> [ CC Nik ]
> 
>> Date: Fri, 24 Oct 2014 11:34:08 -0700
>> From: "bugzilla-daemon@...zilla.kernel.org" <bugzilla-daemon@...zilla.kernel.org>
>> To: "stephen@...workplumber.org" <stephen@...workplumber.org>
>> Subject: [Bug 86851] New: Reproducible panic on heavy UDP traffic
>>
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=86851
>>
>>             Bug ID: 86851
>>            Summary: Reproducible panic on heavy UDP traffic
>>            Product: Networking
>>            Version: 2.5
>>     Kernel Version: 3.18-rc1
>>           Hardware: x86-64
>>                 OS: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: IPV4
>>           Assignee: shemminger@...ux-foundation.org
>>           Reporter: chutzpah@...too.org
>>         Regression: No
>>
>> Created attachment 154861
>>   --> https://bugzilla.kernel.org/attachment.cgi?id=154861&action=edit
>> Panic message captured over serial console
> 
>  general protection fault: 0000 [#1] SMP
>  Modules linked in: nfs [..]
>  CPU: 7 PID: 257 Comm: kworker/7:1 Tainted: G        W      3.18.0-rc1-base-7+ #2
> 
> asked reporter to check if there is a warning before the oops.
> 
>  Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014
>  Workqueue: events inet_frag_worker
>  task: ffff882fd32e70e0 ti: ffff882fd0adc000 task.ti: ffff882fd0adc000
>  RIP: 0010:[<ffffffff81592ab4>]  [<ffffffff81592ab4>] inet_evict_bucket+0xf4/0x180
>  RSP: 0018:ffff882fd0adfd58  EFLAGS: 00010286
>  RAX: ffff8817c7230701 RBX: dead000000100100 RCX: 0000000180300013
> 
> Hello LIST_POISON!
> 
>  RDX: 0000000180300014 RSI: 0000000000000001 RDI: dead0000001000c0
>  RBP: 0000000000000002 R08: 0000000000000202 R09: ffff88303fc39ab0
>  R10: ffffffff81592ac0 R11: ffffea005f1c8c00 R12: ffffffff81aa2820
>  R13: ffff882fd0adfd70 R14: ffff8817c72307e0 R15: 0000000000000000
>  FS:  0000000000000000(0000) GS:ffff88303fc20000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  device rack0a left promiscuous mode
>  CR2: 00007f054c7ba034 CR3: 0000002fc4986000 CR4: 00000000001407e0
>  Stack:
>   ffffffff81aa3298 ffffffff81aa3290 ffff8817d0820a08 0000000000000000
>   0000000000000000 00000000000000a8 0000000000000008 ffff88303fc32780
>   ffffffff81aa6820 0000000000000059[ 2415.026338] device rack1a left promiscuous mode
> 
>   0000000000000000 ffffffff81592ba2
>  Call Trace:
>   [<ffffffff81592ba2>] ? inet_frag_worker+0x62/0x210
>   [<ffffffff8112c312>] ? process_one_work+0x132/0x360
> [..]
> crash is in hlist_for_each_entry_safe() at the end of inet_evict_bucket(), looks like
> we encounter an already-list_del'd element while iterating.
> 
> Will look at this tomorrow.
> 

Thanks for CCing me.
I'll dig into the code tomorrow, but my first thought when I saw this was:
could we have a race between ip_frag_queue() and inet_frag_evict()? More
precisely, between the ipq_kill() calls made from ip_frag_queue() and from
inet_frag_evict(). The frag could be found before we enter the evictor,
which then adds it to its expire list, but the ipq_kill() from
ip_frag_queue() can still do a list_del after we release the chain lock in
the evictor. Could we end up like this?
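
To make the suspected sequence concrete, here is a minimal userspace sketch
of it. It is not the kernel code: the hlist helpers are simplified
re-implementations, struct frag_queue and simulate_race() are hypothetical
names used only for illustration, and the two "threads" are replayed in a
fixed order instead of actually racing. The poison constant is the x86-64
value that shows up in RBX in the oops above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's hlist types (simplified). */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Poison values as seen on x86-64; RBX in the oops holds the first one. */
#define LIST_POISON1 ((struct hlist_node *)(uintptr_t)0xdead000000100100ULL)
#define LIST_POISON2 ((struct hlist_node **)(uintptr_t)0xdead000000200200ULL)

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink a node and poison its pointers, as the kernel's hlist_del() does. */
static void hlist_del(struct hlist_node *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = LIST_POISON1;
    n->pprev = LIST_POISON2;
}

/* Hypothetical stand-in for a fragment queue; only the list linkage matters. */
struct frag_queue { int id; struct hlist_node list; };

/*
 * Replays the suspected interleaving in a fixed order and returns the
 * pointer the evictor's "safe" walk would pick up next. The returned value
 * is only inspected, never dereferenced.
 */
struct hlist_node *simulate_race(void)
{
    struct frag_queue a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };
    struct hlist_head expired = { NULL };
    struct hlist_node *pos, *n;

    /* Evictor: queues moved to its private expire list: a -> b -> c. */
    hlist_add_head(&c.list, &expired);
    hlist_add_head(&b.list, &expired);
    hlist_add_head(&a.list, &expired);

    /* Evictor, chain lock already dropped: visit a, remember b as next. */
    pos = expired.first;
    n = pos->next;
    assert(n == &b.list);

    /* Other path: kills b, unaware that b now sits on the expire list. */
    hlist_del(&b.list);

    /* Evictor advances to the saved node and loads its next pointer. */
    pos = n;
    return pos->next;
}
```

Calling simulate_race() from a small main() and comparing the result with
LIST_POISON1 shows the walker picking up the poison value instead of c. In
the kernel the next step would be to dereference a pointer derived from it;
RDI (dead0000001000c0) is 0x40 below the poison value, which would be
consistent with a container_of() offset, though that is a guess from the
register dump only.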
