Message-ID: <20220502203205.GA349835@bhelgaas>
Date:   Mon, 2 May 2022 15:32:05 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Hans de Goede <hdegoede@...hat.com>
Cc:     "Rafael J . Wysocki" <rjw@...ysocki.net>,
        Borislav Petkov <bp@...en8.de>,
        "H . Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        Krzysztof Wilczyński <kw@...ux.com>,
        Myron Stowe <myron.stowe@...hat.com>,
        Juha-Pekka Heikkila <juhapekka.heikkila@...il.com>,
        Benoit Grégoire <benoitg@...us.ca>,
        Hui Wang <hui.wang@...onical.com>,
        Kai-Heng Feng <kai.heng.feng@...onical.com>,
        linux-acpi@...r.kernel.org, linux-pci@...r.kernel.org,
        x86@...nel.org, linux-kernel@...r.kernel.org,
        Bjorn Helgaas <bhelgaas@...gle.com>
Subject: Re: [PATCH v2 0/3] x86/PCI: Log E820 clipping

On Mon, May 02, 2022 at 02:24:26PM +0200, Hans de Goede wrote:
> On 4/19/22 18:45, Bjorn Helgaas wrote:
> > On Tue, Apr 19, 2022 at 05:16:44PM +0200, Hans de Goede wrote:
> >> On 4/19/22 17:03, Bjorn Helgaas wrote:
> >>> On Tue, Apr 19, 2022 at 11:59:17AM +0200, Hans de Goede wrote:

> >>>> So what is the plan to actually fix the issue seen on some
> >>>> Lenovo models and Clevo Barebones ?   As I mentioned previously
> >>>> I think that since all our efforts have failed so far that we
> >>>> should maybe reconsider just using DMI quirks to ignore the
> >>>> E820 reservation windows for host bridges on affected models ?
> >>>
> >>> I have been resisting DMI quirks but I'm afraid there's no other
> >>> way.
> >>
> >> Well there is the first match adjacent windows returned by _CRS
> >> and only then do the "covers whole region" exception check. I
> >> still think that would work at least for the chromebook
> >> regression...
> > 
> > Without a crystal clear strategy, I think we're going to be
> > tweaking the algorithm forever as the _CRS/E820 mix changes.
> > That's why I think that in the long term, a "use _CRS only, with
> > quirks for exceptions" strategy will be simplest.
> 
> Looking at the number of exceptions we already know about I'm not sure
> if that will work well.

It's possible that many quirks will be required.  But I think in the
long run the value of the simplest, most obvious strategy is huge.
It's laid out in the spec already and it's the clearest way to
agreement between firmware and OS.  When we trip over something, it's
very easy to determine whether _CRS is wrong or Linux is using it
wrong.  If we have to bring in the question of looking at E820 entries,
possibly merging them, using them or not based on overlaps ... that's
a much more difficult conversation without a clear resolution.

> > So I think we should go ahead with DMI quirks instead of trying to
> > make the algorithm smarter, and yes, I think we will need commandline
> > arguments, probably one to force E820 clipping for future machines,
> > and one to disable it for old machines.
> 
> So what you are suggesting is to go back to a bios-date based approach
> (to determine old vs new machines) combined with DMI quirks to force
> E820 clipping on new machines which turn out to need it despite them
> being new ?

Yes.  It's ugly but I think the 10-year outlook is better.
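
For illustration only, the decision order described above (command line
override first, then DMI quirks, then BIOS date) could be sketched as
below.  This is not kernel code: the cutoff year, the quirk table entry
and all names are hypothetical placeholders, since the thread does not
fix any of them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff year; the thread does not name one.
CUTOFF_YEAR = 2023

# Hypothetical DMI quirk table: new machines that still need E820 clipping.
FORCE_CLIP_QUIRKS = frozenset({("ExampleVendor", "ExampleModel A")})


@dataclass(frozen=True)
class Machine:
    vendor: str      # DMI system vendor string
    product: str     # DMI product name string
    bios_year: int   # year taken from the DMI BIOS date


def use_e820_clipping(machine: Machine,
                      cmdline_override: Optional[bool] = None,
                      cutoff_year: int = CUTOFF_YEAR,
                      force_clip=FORCE_CLIP_QUIRKS) -> bool:
    """Return True if host bridge windows should be clipped by E820."""
    # 1. An explicit command line argument always wins
    #    (True = force clipping, False = disable clipping).
    if cmdline_override is not None:
        return cmdline_override
    # 2. DMI quirk: a new machine known to need clipping anyway.
    if (machine.vendor, machine.product) in force_clip:
        return True
    # 3. Default: old machines keep clipping, new ones trust _CRS alone.
    return machine.bios_year < cutoff_year
```

With these assumptions, a machine with a 2015 BIOS keeps clipping, one
with a 2024 BIOS uses _CRS only unless it is listed in the quirk table,
and either result can be overridden from the command line.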

> I have the feeling that if we switch to top-down allocating
> that we can then switch to just using _CRS and that everything
> will then just work, because we then match what Windows is doing...

Yes, it might.  But I'm not 100% comfortable because it basically
sweeps _CRS bugs under the rug, and we may trip over them as we do
more hotplug and (eventually) resource rebalancing.  I think we need
to work toward getting _CRS more reliable.

Bjorn
