[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4896CA3F.3070709@gmail.com>
Date: Mon, 04 Aug 2008 11:22:07 +0200
From: Jiri Slaby <jirislaby@...il.com>
To: Dave Young <hidave.darkstar@...il.com>
CC: Pekka J Enberg <penberg@...helsinki.fi>,
Andrew Morton <akpm@...ux-foundation.org>,
Johannes Berg <johannes@...solutions.net>,
linux-kernel@...r.kernel.org, linux-wireless@...r.kernel.org
Subject: Re: [BUG] wireless : cpu stuck for 61s
Dave Young napsal(a):
> On Thu, Jul 31, 2008 at 5:15 PM, Pekka J Enberg <penberg@...helsinki.fi> wrote:
>> On Wed, 30 Jul 2008, Andrew Morton wrote:
>>> INFO: Allocated in dev_alloc_skb+0x1c/0x30 age=3642 cpu=0 pid=0
>>> INFO: Freed in skb_release_data+0x57/0x80 age=3146 cpu=0 pid=2398
>> So the corrupted object was free'd by skb_release_data() so we need to
>> look for a driver or the networking stack calling that function too early.
>>
>>> INFO: Slab 0xc1c05440 objects=7 used=3 fp=0xf6f3a060 flags=0x400020c3
>>> INFO: Object 0xf6f3a060 @offset=8288 fp=0xf6f39030
>>>
>>> Bytes b4 0xf6f3a050: 5e 09 00 00 57 c9 05 00 5a 5a 5a 5a 5a 5a 5a 5a ^...WÉ..ZZZZZZZZ
>> The object starts here (the poison values for first 32 bytes are okay):
>>
>>> Object 0xf6f3a060: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
>>> Object 0xf6f3a070: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
>> And here are the first 96 bytes of the corruption:
>>
>>> Object 0xf6f3a080: 80 00 00 00 ff ff ff ff ff ff 00 17 7b 00 46 40 ....ÿÿÿÿÿÿ..{.F@
>>> Object 0xf6f3a090: 00 17 7b 00 46 40 30 09 81 21 08 7a 21 00 00 00 ..{.F@...!.z!...
>>> Object 0xf6f3a0a0: 64 00 21 04 00 07 00 00 00 00 00 00 00 01 08 82 d.!.............
>>> Object 0xf6f3a0b0: 84 8b 0c 12 96 18 24 03 01 01 05 04 00 02 00 00 ......$.........
>>> Object 0xf6f3a0c0: 07 06 43 4e 20 01 0d 14 2a 01 00 32 04 30 48 60 ..CN....*..2.0H`
>>> Object 0xf6f3a0d0: 6c dd 18 00 17 7b 01 04 00 00 00 01 00 00 00 10 lÝ...{..........
>> But I think that's just the payload of a SKB?
It's a receive frame from ath5k, I suppose. 00:17:7b:00:46:40 is your AP?
>>> Redzone 0xf6f3b060: bb bb bb bb »»»»
>> The red-zone has SLUB_RED_INACTIVE ("0xbb") which reinforces
>> use-after-free.
>>
>>> Padding 0xf6f3b088: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
>>> Pid: 0, comm: swapper Tainted: G W 2.6.26-smp #2
>>> [<c0180f5d>] print_trailer+0xad/0xf0
>>> [<c018103b>] check_bytes_and_report+0x9b/0xc0
>>> [<c018145e>] check_object+0x19e/0x1e0
>>> [<c01821a4>] __slab_alloc+0x454/0x4f0
>>> [<c01834d6>] __kmalloc_track_caller+0xe6/0xf0
>>> [<c03dd1ec>] ? dev_alloc_skb+0x1c/0x30
>>> [<c03dd1ec>] ? dev_alloc_skb+0x1c/0x30
>>> [<c03dce79>] __alloc_skb+0x49/0x100
>>> [<c03dd1ec>] dev_alloc_skb+0x1c/0x30
>>> [<f8a58599>] ath5k_rxbuf_setup+0x39/0x200 [ath5k]
>>> [<f8a5a697>] ath5k_tasklet_rx+0x127/0x5c0 [ath5k]
>>> [<c014969a>] ? print_lock_contention_bug+0x1a/0xe0
>>> [<c012eafc>] tasklet_action+0x4c/0xc0
[...]
>>> =======================
>>> FIX kmalloc-4096: Restoring 0xf6f3a080-0xf6f3a0ef=0x6b
>>> Dave, could you please remind us which net driver was in use here?
>> There's ath5k in the stack trace but that, of course, doesn't
>> automatically mean it's at fault here. It could have been just the poor
>> bastard who was the next to allocate 4 KB with kmalloc() noticing the
>> corruption.
No, unfortunately ath5k *is* likely the culprit. Next time please Cc
ath5k-devel@...ts.ath5k.org even if it is only a suspicion.
> But I still have no idea with the poison overwritten.
Could you try patch from
http://lkml.org/lkml/2008/7/15/276
? (I have no idea how reproducible is this by you, it often happens on noisy
channels and/or by lowering RX buffers, i.e. ATH_RXBUF).
[It hit mainline few days ago, I'm going to fwd it to stable.]
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists