linux-kernel - Re: [Bug #11382] e1000e: 2.6.27-rc1 corrupts EEPROM/NVM

Message-Id: <200809251156.10648.jbarnes@virtuousgeek.org>
Date:	Thu, 25 Sep 2008 11:56:08 -0700
From:	Jesse Barnes <jbarnes@...tuousgeek.org>
To:	Jiri Kosina <jkosina@...e.cz>
Cc:	Frans Pop <elendil@...net.nl>, airlied@...il.com,
	Jeff Garzik <jeff@...zik.org>, davem@...emloft.net,
	Andrew Morton <akpm@...ux-foundation.org>,
	jeffrey.t.kirsher@...el.com, david.vrabel@....com, rjw@...k.pl,
	linux-kernel@...r.kernel.org, kernel-testers@...r.kernel.org,
	chrisl@...are.com, Ingo Molnar <mingo@...e.hu>,
	jesse.brandeburg@...il.com
Subject: Re: [Bug #11382] e1000e: 2.6.27-rc1 corrupts EEPROM/NVM

On Thursday, September 25, 2008 10:24 am Jiri Kosina wrote:
> On Thu, 25 Sep 2008, Frans Pop wrote:
> > Extra datapoint. As far as I've seen this problem has not yet been
> > reported by any people running Debian. This could point to X.Org as
> > Debian currently has 7.3 while I think the reports so far have been with
> > 7.4.
>
> Yes, I think that xorg/xorg i915 driver/libdrm/GEM/whatever are the
> biggest suspect currently, according to the data that has been gathered so
> far.

We have confirmation that this isn't GEM related; according to the Novell bug 
at https://bugzilla.novell.com/show_bug.cgi?id=425480 people have hit the 
problem with kernels w/o GEM.

That doesn't rule out i915 (though I don't think any changes have gone in 
since 2.6.26 that would have caused this) or xf86-video-intel.  It's possible 
that X is getting confused about BAR mappings somehow, resulting in a 
clobbered e1000e NVRAM, but why would the kernel version matter in that case?  
The only thing that comes to mind would be PAT...

Recent versions of the X drivers (using recent libpciaccess code) will try to 
map the resourceN_wc file in sysfs.  It's possible that the map size we end 
up using is wrong, leading to the situation Dave described earlier where we 
map too much MMIO space.
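
To make the map size concern concrete, here is a minimal sketch of the kind of sanity check I have in mind. It is not taken from libpciaccess or the X drivers. It only assumes the sysfs `resource` file format (one line per region: start, end, flags, all in hex). The function names, addresses, and flag values in the sample are made up for illustration.

```python
def parse_resource_table(text):
    """Parse the contents of a sysfs PCI `resource` file.

    Each line holds three hex fields: start, end, flags.
    Unused entries are all zeros and are reported with size 0.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not a start/end/flags triple
        start, end, flags = (int(f, 16) for f in fields)
        # The end address is inclusive, so the size is end - start + 1.
        size = end - start + 1 if (start or end) else 0
        entries.append({"start": start, "end": end, "flags": flags, "size": size})
    return entries


def map_length_ok(entries, bar_index, length):
    """Return True if a mapping of `length` bytes fits inside the given BAR."""
    size = entries[bar_index]["size"]
    return 0 < length <= size


# Hypothetical sample, not from a real machine: a 128 MiB region,
# an unused entry, and a 128 KiB region.
SAMPLE = """\
0x00000000f0000000 0x00000000f7ffffff 0x0000000000021208
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000fe900000 0x00000000fe91ffff 0x0000000000020200
"""

if __name__ == "__main__":
    for i, e in enumerate(parse_resource_table(SAMPLE)):
        print(f"BAR {i}: start={e['start']:#x} size={e['size']:#x}")
```

A mapping request longer than the size computed this way would spill past the end of the BAR, which is the "map too much MMIO space" case.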

> Still, what confuses me a little bit -- the EEPROM of the card is set to
> all 0xff, once the corruption happens. Isn't that quite a coincidence,
> that bytes representing "nothing" in this context are used?

Presumably one has to write all ones to the EEPROM BAR of the e1000 device to 
see that pattern?  Or is there some way of configuring the EEPROM such that 
it'll fail to respond to read cycles resulting in all ones for every read 
back (i.e. target abort)?
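
For anyone comparing dumps, a small sketch of how one might classify an NVM dump as "all ones". The helper names are hypothetical and the input is assumed to be the raw dump bytes. Note that this only detects the pattern; it cannot tell whether the ones were actually written or whether the reads simply failed.

```python
def all_ones_fraction(dump):
    """Return the fraction of bytes in `dump` that are 0xFF."""
    if not dump:
        return 0.0
    return sum(1 for b in dump if b == 0xFF) / len(dump)


def looks_erased(dump):
    """Return True if the dump is non-empty and every byte is 0xFF."""
    return len(dump) > 0 and all(b == 0xFF for b in dump)


if __name__ == "__main__":
    # Hypothetical data: a fully blank dump and a partly valid one.
    blank = bytes([0xFF] * 128)
    partial = bytes([0x00, 0x1B, 0x21, 0xFF] * 32)
    print(looks_erased(blank), all_ones_fraction(blank))
    print(looks_erased(partial), all_ones_fraction(partial))
```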

Jesse