Message-ID: <20120206175702.3a41ffc4@mj>
Date: Mon, 6 Feb 2012 17:57:02 -0500
From: Pavel Roskin <proski@....org>
To: "Carlos R. Mafra" <crmafra@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
"Luis R. Rodriguez" <mcgrof@....qualcomm.com>,
ath9k-devel@...ts.ath9k.org
Subject: Re: [ath9k-devel] [3.3-rc2+] Thousands of ath9k warnings on dmesg
before laptop froze
On Mon, 6 Feb 2012 00:29:07 +0000
"Carlos R. Mafra" <crmafra@...il.com> wrote:
>
> I'm testing the latest kernel 3.3.0-rc2+ I pulled from git
> this morning.
>
> My laptop just froze, and when I rebooted I noticed
> that /var/log/messages contained 48 thousand (!) warnings coming from
> ath9k since a few hours ago. I'm pasting the first one:
>
> ------------[ cut here ]------------
> WARNING: at /home/mafra/linux-2.6/drivers/net/wireless/ath/ath9k/rc.c:697 ath_rc_get_highest_rix+0x156/0x210 [ath9k]()
> Hardware name: VPCEB4X1E

I believe I found a solution for this today. Please see this bug
tracker entry: https://bugzilla.redhat.com/show_bug.cgi?id=768639

While Fedora users report a warning, I've seen panic reports on the
list. It's a memory corruption bug, so it can manifest in different
ways. Please test the latest patch (attached).

Here's my comment on the patch:

This patch is based on my analysis of printk() output I added to the
ath9k driver. I didn't have a chance to test the patch, so testing
would be greatly appreciated.

The corruption must be happening in ath_debug_stat_rc(), which is given
the result of ath_rc_get_rateindex(). ath_rc_get_rateindex() can
return -1, in which case ath_debug_stat_rc() increments the value that
lies 16 bytes before rcstats in struct ath_rate_priv. On 64-bit
systems, that happens to be rate_table. Once the rate_table pointer has
been incremented, it no longer points to the start of the table, so
everything read through it is invalid, which leads to the warning. On
32-bit systems, the corruption should happen in neg_ht_rates.
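
To illustrate the arithmetic, here is a minimal sketch. It uses simplified stand-in structs, not the real ath9k definitions: it assumes each rcstats entry is 16 bytes and that exactly two pointer-sized members (rate_table and one other, here called other_ptr) sit directly before rcstats. All names ending in _sketch, plus other_ptr and SKETCH_TABLE_SIZE, are hypothetical. It does not model the 32-bit case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE 8 /* hypothetical size, only for this sketch */

/* Stand-in for one rcstats entry; assumed to be 16 bytes with padding. */
struct rc_stats_sketch {
    uint32_t success;
    uint32_t retries;
    uint32_t xretries;
    uint8_t per;
};

/* Stand-in for the tail of struct ath_rate_priv (assumed layout). */
struct rate_priv_sketch {
    const void *rate_table;
    void *other_ptr; /* hypothetical second pointer before rcstats */
    struct rc_stats_sketch rcstats[SKETCH_TABLE_SIZE];
};

/* Check the sketch's own assumption about the entry size. */
static void check_sketch_layout(void)
{
    assert(sizeof(struct rc_stats_sketch) == 16);
}

/*
 * Byte offset, relative to the start of rcstats, of the counter that
 * "rcstats[idx].success++" would touch. Computed arithmetically, so no
 * out-of-bounds access is ever performed.
 */
static ptrdiff_t touched_offset(int idx)
{
    return (ptrdiff_t)idx * (ptrdiff_t)sizeof(struct rc_stats_sketch)
         + (ptrdiff_t)offsetof(struct rc_stats_sketch, success);
}

/* Distance in bytes from rate_table to rcstats in the sketch struct. */
static ptrdiff_t rate_table_to_rcstats(void)
{
    return (ptrdiff_t)offsetof(struct rate_priv_sketch, rcstats)
         - (ptrdiff_t)offsetof(struct rate_priv_sketch, rate_table);
}
```

With idx = -1 the touched offset is 16 bytes before rcstats, and with 8-byte pointers that is where rate_table sits in this sketch.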

The -1 value of idx in struct ieee80211_tx_rate is described in
net/mac80211.h. I don't know why we have -1 there or how to reproduce
the problem reliably. But -1 can appear there, and ath9k has no checks
for it.

The patch introduces two protections: ath_rc_get_rateindex() never
returns a negative value, and ath_debug_stat_rc() checks the array
bounds.
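
The two protections could look roughly like the sketch below. This is not the attached patch: the function names, the stand-in struct, the choice of 0 as the fallback index, and the return codes are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_TABLE_SIZE 8 /* hypothetical size, only for this sketch */

/* Stand-in for one per-rate statistics entry. */
struct stats_sketch {
    unsigned int success;
};

/*
 * Protection 1: never return a negative rate index.
 * Falling back to index 0 is an assumption of this sketch.
 */
static int get_rateindex_sketch(int raw_idx)
{
    return raw_idx < 0 ? 0 : raw_idx;
}

/*
 * Protection 2: check the array bounds before touching the counters.
 * Returns 1 if the counter was updated, 0 if the index was rejected.
 */
static int stat_rc_sketch(struct stats_sketch *table, size_t n, int final_rate)
{
    assert(table != NULL);

    if (final_rate < 0 || (size_t)final_rate >= n)
        return 0; /* out of range: skip the update, corrupt nothing */

    table[final_rate].success++;
    return 1;
}
```

Either check alone would stop the out-of-bounds increment in this sketch; having both means the statistics code stays safe even if another caller passes a bad index.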
It may not be good enough for the kernel, but it may be good enough for
Fedora.
--
Regards,
Pavel Roskin
View attachment "01-rix-check.patch" of type "text/x-patch" (1281 bytes)