lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211109220717.GA1187103@bhelgaas>
Date:   Tue, 9 Nov 2021 16:07:17 -0600
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Hans de Goede <hdegoede@...hat.com>
Cc:     "Rafael J . Wysocki" <rjw@...ysocki.net>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        Krzysztof Wilczyński <kw@...ux.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Myron Stowe <myron.stowe@...hat.com>,
        Juha-Pekka Heikkila <juhapekka.heikkila@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H . Peter Anvin" <hpa@...or.com>, linux-acpi@...r.kernel.org,
        linux-pci@...r.kernel.org, x86@...nel.org,
        linux-kernel@...r.kernel.org,
        Benoit Grégoire <benoitg@...us.ca>,
        Hui Wang <hui.wang@...onical.com>, stable@...r.kernel.org,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>
Subject: Re: [PATCH v5 1/2] x86/PCI: Ignore E820 reservations for bridge
 windows on newer systems

On Sat, Nov 06, 2021 at 11:15:07AM +0100, Hans de Goede wrote:
> On 10/20/21 23:14, Bjorn Helgaas wrote:
> > On Wed, Oct 20, 2021 at 12:23:26PM +0200, Hans de Goede wrote:
> >> On 10/19/21 23:52, Bjorn Helgaas wrote:
> >>> On Thu, Oct 14, 2021 at 08:39:42PM +0200, Hans de Goede wrote:
> >>>> Some BIOS-es contain a bug where they add addresses which map to system
> >>>> RAM in the PCI host bridge window returned by the ACPI _CRS method, see
> >>>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
> >>>> space").
> >>>>
> >>>> To work around this bug Linux excludes E820 reserved addresses when
> >>>> allocating addresses from the PCI host bridge window since 2010.
> >>>> ...
> > 
> >>> I haven't seen anybody else eager to merge this, so I guess I'll stick
> >>> my neck out here.
> >>>
> >>> I applied this to my for-linus branch for v5.15.
> >>
> >> Thank you, and sorry about the build-errors which the lkp
> >> kernel-test-robot found.
> >>
> >> I've just send out a patch which fixes these build-errors
> >> (verified with both .config-s from the lkp reports).
> >> Feel free to squash this into the original patch (or keep
> >> them separate, whatever works for you).
> > 
> > Thanks, I squashed the fix in.
> > 
> > HOWEVER, I think it would be fairly risky to push this into v5.15.
> > We would be relying on the assumption that current machines have all
> > fixed the BIOS defect that 4dc2287c1805 addressed, and we have little
> > evidence for that.
> >
> > I'm not sure there's significant benefit to having this in v5.15.
> > Yes, the mainline v5.15 kernel would work on the affected machines,
> > but I suspect most people with those machines are running distro
> > kernels, not mainline kernels.
> 
> I understand that you were reluctant to add this to 5.15 so close
> near the end of the 5.15 cycle, but can we please get this into
> 5.16 now ?
> 
> I know you ultimately want to see if there is a better fix,
> but this is hitting a *lot* of users right now and if we come up
> with a better fix we can always use that to replace this one
> later.

I don't know whether there's a "better" fix, but I do know that if we
merge what we have right now, nobody will be looking for a better
one.

We're in the middle of the merge window, so the v5.16 development
cycle is over.  The v5.17 cycle is just starting, so we have time to
hit that.  Obviously a fix can be backported to older kernels as
needed.

> So can we please just go with this fix now, so that we can
> fix the issues a lot of users are seeing caused by the current
> *wrong* behavior of taking the e820 reservations into account ?

I think the fix on the table is "ignore E820 for BIOS date >= 2018"
plus the obvious parameters to force it both ways.

The thing I don't like is that this isn't connected at all to the
actual BIOS defect.  We have no indication that current BIOSes have
fixed the defect, and we have no assurance that future ones will not
have the defect.  It would be better if we had some algorithmic way of
figuring out what to do.

Thank you very much for chasing down the dmesg log archive
(https://github.com/linuxhw/Dmesg; see
https://lore.kernel.org/r/82035130-d810-9f0b-259e-61280de1d81f@redhat.com).
Unfortunately I haven't had time to look through it myself, and I
haven't heard of anybody else doing it either.

Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ