linux-kernel - Re: e1000e NVM corruption issue status

Message-ID: <48DD2F98.8070509@tpi.com>
Date:	Fri, 26 Sep 2008 12:53:12 -0600
From:	Tim Gardner <timg@....com>
To:	Jesse Barnes <jbarnes@...tuousgeek.org>
CC:	Arjan van de Ven <arjan@...ux.intel.com>,
	Jiri Kosina <jkosina@...e.cz>,
	"Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
	LKML <linux-kernel@...r.kernel.org>, agospoda@...hat.com,
	"Ronciak, John" <john.ronciak@...el.com>,
	"Allan, Bruce W" <bruce.w.allan@...el.com>,
	"Graham, David" <david.graham@...el.com>, kkiel@...e.de,
	tglx@...utronix.de, chris.jones@...onical.com,
	arjan@...ux.jf.intel.com
Subject: Re: e1000e NVM corruption issue status

Jesse Barnes wrote:
> On Friday, September 26, 2008 10:52 am Jesse Barnes wrote:
>> On Friday, September 26, 2008 4:49 am Arjan van de Ven wrote:
>>> Jiri Kosina wrote:
>>>> On Thu, 25 Sep 2008, Brandeburg, Jesse wrote:
>>>>> this is the current set of patches that I have to help us debug
>>>>> and/or fix e1000e issues found during this debug effort for
>>>>> the corrupt NVM.  the "drop stats lock" - "reset swflag" patches allow
>>>>> Thomas' patch for a mutex in the SWFLAG acquire function to run
>>>>> without any errors.
>>>> Thanks. Also Jesse Barnes' patch shouldn't be forgotten, could you
>>>> please add it to that lineup?
>>>>
>>>> 	http://marc.info/?l=linux-kernel&m=122237193628087&w=2
>>> can we (for now) also stick a WARN_ON() into that failure path? that way
>>> we can at least catch if/when this happens more visibly..... if it
>>> happens consistently in say the new distros we can be more confident that
>>> we're down the right path in diagnosing the issue.
>> I'm spinning a new one now with some debug output, stay tuned (just gotta
>> boot my test box).
> 
> Ok here's an updated one.  Jesse (Br) can you add it to your list?  If the X 
> driver really is mapping too much this should catch it, as long as it goes 
> through sysfs.
> 
> Thanks,
> Jesse
> 

I've been experimenting with keeping flash space unmapped until it's
actually needed, e.g., in the functions that use the E1000_READ_FLASH and
E1000_WRITE_FLASH macros. Along the way I looked at how flash write
cycles are initiated, because I was having a hard time believing that
having flash space mapped was part of the root cause. However, it looks
like it's pretty simple to initiate a write or erase cycle. All of the
required action bits in ICH_FLASH_HSFSTS and ICH_FLASH_HSFCTL just need
to be set to 1, and these two registers are laid out in the right order
for that to happen if X were writing 0xff in ascending address order.
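
To make the scenario concrete, here is a rough sketch in C of a stray
byte-wise 0xff fill over a plain memory model of the flash register
window. Everything named SKETCH_* is hypothetical: the offsets (status at
0x04, control at 0x06) and the control layout (bit 0 = cycle go, bits 2:1
= cycle type, 3 = erase) are assumptions that should be checked against
the driver source and the chipset datasheet. It does not model real
hardware behaviour such as write-1-to-clear status bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed offsets inside the mapped flash register window. */
#define SKETCH_HSFSTS      0x04u  /* 16-bit status register  */
#define SKETCH_HSFCTL      0x06u  /* 16-bit control register */

/* Assumed control register layout. */
#define SKETCH_FLCGO       0x0001u /* bit 0: start the cycle  */
#define SKETCH_CYCLE_SHIFT 1       /* bits 2:1: cycle type    */
#define SKETCH_CYCLE_MASK  0x3u
#define SKETCH_CYCLE_ERASE 0x3u    /* assumed encoding: erase */

#define SKETCH_WINDOW_LEN  16u

/* A stray client filling the window with 0xff, lowest address first. */
static void sketch_stray_fill(uint8_t *window, size_t len)
{
	for (size_t i = 0; i < len; i++)
		window[i] = 0xff;
}

/* Read a little-endian 16-bit register out of the modelled window. */
static uint16_t sketch_read16(const uint8_t *window, size_t len, size_t off)
{
	assert(off + 1 < len);
	return (uint16_t)(window[off] | ((uint16_t)window[off + 1] << 8));
}

/* True if this control value has "go" set and selects an erase cycle. */
static int sketch_starts_erase(uint16_t hsfctl)
{
	unsigned int cycle = (hsfctl >> SKETCH_CYCLE_SHIFT) & SKETCH_CYCLE_MASK;

	return (hsfctl & SKETCH_FLCGO) && cycle == SKETCH_CYCLE_ERASE;
}
```

Under these assumptions, calling sketch_stray_fill() on a zeroed window
leaves the control register reading 0xffff, which
sketch_starts_erase() reports as an erase request, and the status
register sits at the lower offset, so an ascending fill touches it
first.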

Just a thought.

rtg
-- 
Tim Gardner timg@....com www.tpi.com
OR 503-601-0234 x102 MT 406-443-5357