Message-ID: <1324079035.93669.YahooMailClassic@web84519.mail.ne1.yahoo.com>
Date: Fri, 16 Dec 2011 15:43:55 -0800 (PST)
From: d.stussy@...oo.com
To: linux-kernel@...r.kernel.org
Subject: Kernel non-fatal(?) bug - IPsec - Holding atomic when calling scheduler.
I'm getting a whole bunch of these (hostname deleted):
Dec 5 11:09:27 - kernel: BUG: scheduling while atomic: named/909/0x00000200
Dec 5 11:09:27 - kernel: Modules linked in: xt_geoip ipt_set ip_set_nethash ip_set xt_recent xt_TARPIT compat_xtables
Dec 5 11:09:27 - kernel: Pid: 909, comm: named Not tainted 3.1.4 #6
Dec 5 11:09:27 - kernel: Call Trace:
Dec 5 11:09:27 - kernel: [<ffffffff8142b633>] ? __schedule+0x5d3/0x7a0
Dec 5 11:09:27 - kernel: [<ffffffff8102ff3d>] ? select_task_rq_fair+0x3ad/0x800
Dec 5 11:09:28 - kernel: [<ffffffff8102f680>] ? check_preempt_wakeup+0xe0/0x140
Dec 5 11:09:28 - kernel: [<ffffffff8142bd2d>] ? schedule_timeout+0x1bd/0x220
Dec 5 11:09:28 - kernel: [<ffffffff8142aeca>] ? wait_for_common+0xda/0x190
Dec 5 11:09:28 - kernel: [<ffffffff81034cd0>] ? try_to_wake_up+0x260/0x260
Dec 5 11:09:28 - kernel: [<ffffffff81306b75>] ? flow_cache_flush+0x75/0x90
Dec 5 11:09:28 - kernel: [<ffffffff81383a8b>] ? __xfrm_garbage_collect+0xb/0x90
Dec 5 11:09:28 - kernel: [<ffffffff813c2171>] ? xfrm6_garbage_collect+0x11/0x30
Dec 5 11:09:28 - kernel: [<ffffffff812fc79b>] ? dst_alloc+0x13b/0x170
Dec 5 11:09:28 - kernel: [<ffffffff81387c47>] ? xfrm_bundle_lookup+0x287/0x3d0
Dec 5 11:09:28 - kernel: [<ffffffff81306929>] ? flow_cache_lookup+0x259/0x430
Dec 5 11:09:28 - kernel: [<ffffffff813879c0>] ? xfrm_policy_lookup_bytype.clone.42+0x250/0x250
Dec 5 11:09:28 - kernel: [<ffffffff81386ef8>] ? xfrm_lookup+0x238/0x4d0
Dec 5 11:09:28 - kernel: [<ffffffff813997d8>] ? ip6_sk_dst_lookup_flow+0xe8/0x170
...
After this point, the call chain varies, and so does the process that triggers
it. After about 1000 such reports, the system usually crashes.
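Since the triggering process varies, tallying the reports per process can show whether one daemon dominates. Below is a minimal sketch of such a tally, assuming the syslog format shown above; the function name `count_reports` and the sample text are illustrative only, and the pattern tolerates the line wrap that mail clients insert after the colon.

```python
import re
from collections import Counter

# Matches "BUG: scheduling while atomic: <comm>/<pid>/<preempt_count>".
# \s* lets the match span a line break inserted by a mail client.
BUG_RE = re.compile(
    r"BUG: scheduling while atomic:\s*(\S+?)/(\d+)/(0x[0-9a-fA-F]+)"
)

def count_reports(log_text):
    """Return a Counter mapping (comm, preempt_count) to the number of reports."""
    counts = Counter()
    for comm, _pid, preempt in BUG_RE.findall(log_text):
        counts[(comm, int(preempt, 16))] += 1
    return counts

# Sample taken from the excerpt above, including the wrapped first line.
sample = """Dec 5 11:09:27 - kernel: BUG: scheduling while atomic:
named/909/0x00000200
Dec 5 11:09:27 - kernel: Pid: 909, comm: named Not tainted 3.1.4 #6
"""

if __name__ == "__main__":
    for (comm, preempt), n in count_reports(sample).items():
        print(f"{comm}: {n} report(s), preempt_count={preempt:#010x}")
```

In practice the whole log file would be read into a string and passed in place of `sample`.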
Prior to activating IPsec, I did not see these problems. I have both IPv4 and IPv6 stacks on an x86-64 system. I use ipsec-tools v0.8.0 as the userspace interface to IPsec. I am using only transport mode, although tunnel mode is available (as an unloaded module) and the user program does present a listener on UDP 4500 (as well as 500). I have set all 3 IPsec options (ah, esp, and comp) to optional ("use" if available).
I see this ONLY when the kernel finds an IPsec policy (SPD entry) for which there is no corresponding security association (SAD entry), and is therefore presumably calling the userspace process via the PF_KEY interface to contact the remote side and do IPsec key exchange (IKE) via UDP port 500 (or 4500 if I had tunnel mode defined). The problem does NOT surface when IPsec is not compiled in or loaded, or when it is but the SPD table is empty. If the SPD has policies which define none or discard, the issue does seem to happen.
I have seen this bug with kernel releases 3.1.4 and 3.1.5. It may exist in earlier releases, but I was not actively using IPsec before then.
Fix - guide to solution: What atomic lock is being held when the scheduler is called?
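One hint is the last field of the report, `named/909/0x00000200`, which is the task's preempt count at the time of the call. Below is a rough sketch of how that value could be decoded. The field widths are an assumption taken from my reading of include/linux/hardirq.h in 3.1-era kernels (8 preempt bits, 8 softirq bits, 10 hardirq bits, with the low softirq bit meaning "serving a softirq" and the remaining bits counting bottom-half disables); the function name is hypothetical.

```python
# Field layout ASSUMED from include/linux/hardirq.h in 3.1-era kernels.
PREEMPT_BITS = 8
SOFTIRQ_BITS = 8
HARDIRQ_BITS = 10

PREEMPT_SHIFT = 0
SOFTIRQ_SHIFT = PREEMPT_SHIFT + PREEMPT_BITS   # 8
HARDIRQ_SHIFT = SOFTIRQ_SHIFT + SOFTIRQ_BITS   # 16

def decode_preempt_count(value):
    """Split a preempt count into its assumed fields (hypothetical helper)."""
    def field(shift, bits):
        return (value >> shift) & ((1 << bits) - 1)

    softirq = field(SOFTIRQ_SHIFT, SOFTIRQ_BITS)
    return {
        # Nesting depth of preempt_disable() / spinlocks.
        "preempt_disable_depth": field(PREEMPT_SHIFT, PREEMPT_BITS),
        # Low softirq bit: currently running a softirq handler.
        "in_softirq_handler": bool(softirq & 1),
        # Remaining softirq bits: nesting depth of local_bh_disable().
        "bh_disable_depth": softirq >> 1,
        # Nesting depth of hard interrupt handlers.
        "hardirq_depth": field(HARDIRQ_SHIFT, HARDIRQ_BITS),
    }

if __name__ == "__main__":
    for name, val in decode_preempt_count(0x00000200).items():
        print(f"{name}: {val}")
```

Under that assumed layout, 0x200 decodes to no spinlock or preempt_disable() nesting and no interrupt context, but one level of disabled bottom halves. That would suggest that some caller above flow_cache_flush in the trace has bottom halves disabled when it reaches the sleeping wait, rather than holding a spinlock; this reading should be confirmed against the source.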
Please fix this soon. After a few hundred of these, the kernel seems to get sufficiently confused that it crashes and I have to hard-reset the machine.