Message-ID: <1283888337.18468.9.camel@pjaxe>
Date: Tue, 07 Sep 2010 12:38:57 -0700
From: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@...el.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Andi Kleen <andi@...stfloor.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"hpa@...or.com" <hpa@...or.com>, "x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH] [arch-x86] Allow SRAT integrity check to be skipped
On Thu, 2010-09-02 at 23:39 -0700, Ingo Molnar wrote:
> * Andi Kleen <andi@...stfloor.org> wrote:
>
> > > This isnt a particularly useful solution to users of said systems -
> > > they have to figure out that this option exists, and then they have
> > > to enter this option on the boot line.
> >
> > This usually only happens in early preproduction systems. So far the
> > BIOS always got fixed before they shipped to users.
>
> 'Usually' != 'always'. Read the changelog:
>
> ' There are BIOSes in production that have these failures, so this will
> allow people in the field to work around these BIOS issues. '
>
> Peter, which system in production that has this problem? That one needs
> a DMI match.

It's one SKU of a Nehalem-EX system. The BIOS for that SKU has an issue
with resolving SRAT hotplug enumeration, and screws up the table. Other
SKUs of this same platform do not have the issue. Efforts are underway
to get this BIOS fixed, but in the meantime, users have no way to work
around the bug (aside from disabling memory hotplug in the BIOS).
Another platform almost shipped with the same symptoms, but the problem
was caught and fixed before it shipped (it wasn't caught early because
Windows wasn't failing, and most of the testing on that platform was
done under Windows).

I agree with Andi that adding DMI strings would be overkill and would
leave clutter once the BIOS is fixed. I look at this patch as a
stop-gap measure for people to fall back on until a newer BIOS is
available to correct the NUMA enumeration issues. Without it, we have
nothing to point users to when they run into this while waiting for a
new BIOS.

Cheers,
-PJ