linux-kernel - Re: [Bug #11382] e1000e: 2.6.27-rc1 corrupts EEPROM/NVM

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <21d7e9970809240159u6db747eex51892061846b2251@mail.gmail.com>
Date:	Wed, 24 Sep 2008 18:59:34 +1000
From:	"Dave Airlie" <airlied@...il.com>
To:	"David Miller" <davem@...emloft.net>
Cc:	jkosina@...e.cz, jeffrey.t.kirsher@...el.com, david.vrabel@....com,
	rjw@...k.pl, linux-kernel@...r.kernel.org,
	kernel-testers@...r.kernel.org, chrisl@...are.com
Subject: Re: [Bug #11382] e1000e: 2.6.27-rc1 corrupts EEPROM/NVM

On Wed, Sep 24, 2008 at 5:36 PM, David Miller <davem@...emloft.net> wrote:
> From: "Dave Airlie" <airlied@...il.com>
> Date: Wed, 24 Sep 2008 15:45:46 +1000
>
>> I'm still dubious about this, wouldn't we see other wierdass side
>> effects if X was trashing the BARs on other devices?
>
> Sure.  My theory is that it's a recent xorg change causing this,
> so I've been going through GIT history for xserver, libpciaccess,
> and the intel driver for the past year looking for clues.
>
> If there is usually a gap after the video device, there would just
> be no response from the PCI bus, and the way that's handled is
> chipset specific.  At least a while back, most x86 systems would
> silently ignore writes and return all 1's in such a case, but
> they may be generating bus error events these days.  I simply don't
> know.

The only thing I can think off then is either the pciaccess conversion
of the intel Xorg driver,
or maybe something going wrong since PAT support was added.

>
>> I think tglx is on the right path, same problem as e1000, code is
>> stupid, it can reenter the nvram read/write code from irq
>> context, and pwn itself.
>
> The e1000e side here is reproducable way too easily for it to be the
> same case, as far as I see it.
>
> The e1000 driver has probably had this problem for years and we've
> only recently had some concrete cases of it triggering.
>
> Also, what utility are you running on your system that is even
> accessing the NVRAM on the e1000e card?  Knowing that might help
> us understand why this problem has appeared now.  Maybe there is
> some diagnostic or monitoring tool that is now becoming prevalent
> in these distributions where it triggers.

The driver seems quite happy to access the NVRAM, I think Thomas has
some backtraces that show
it clearly doing silly reentrant things...

>
> This problem started happening seemingly "all of a sudden", even to
> people who have been keeping sort-of recent with their kernels, such
> as yourself.
>
> Yet we can't get any sense yet what range of kernel versions are in
> use when the problem triggers.

I've seen it reported at least at 2.6.27-rc1 and maybe even one of
Fedora's -rc0 kernels.

Dave.

>
> I'm about to leave for a week or so in Paris for the netfilter
> workshop, so I hope that someone other than myself will do some data
> mining like I have instead of (merely) tossing theories around and
> finger pointing.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/