[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211215160145.GA695366@bhelgaas>
Date: Wed, 15 Dec 2021 10:01:45 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Hans de Goede <hdegoede@...hat.com>
Cc: "Rafael J . Wysocki" <rjw@...ysocki.net>,
Mika Westerberg <mika.westerberg@...ux.intel.com>,
Krzysztof Wilczyński <kw@...ux.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Myron Stowe <myron.stowe@...hat.com>,
Juha-Pekka Heikkila <juhapekka.heikkila@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H . Peter Anvin" <hpa@...or.com>, linux-acpi@...r.kernel.org,
linux-pci@...r.kernel.org, x86@...nel.org,
linux-kernel@...r.kernel.org,
Benoit Grégoire <benoitg@...us.ca>,
Hui Wang <hui.wang@...onical.com>, stable@...r.kernel.org,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>
Subject: Re: [PATCH v5 1/2] x86/PCI: Ignore E820 reservations for bridge
windows on newer systems
On Tue, Dec 07, 2021 at 05:52:40PM +0100, Hans de Goede wrote:
> On 11/10/21 14:05, Hans de Goede wrote:
> > On 11/10/21 09:45, Hans de Goede wrote:
> >> On 11/9/21 23:07, Bjorn Helgaas wrote:
> >>> On Sat, Nov 06, 2021 at 11:15:07AM +0100, Hans de Goede wrote:
> >>>> On 10/20/21 23:14, Bjorn Helgaas wrote:
> >>>>> On Wed, Oct 20, 2021 at 12:23:26PM +0200, Hans de Goede wrote:
> >>>>>> On 10/19/21 23:52, Bjorn Helgaas wrote:
> >>>>>>> On Thu, Oct 14, 2021 at 08:39:42PM +0200, Hans de Goede wrote:
> >>>>>>>> Some BIOS-es contain a bug where they add addresses which map to system
> >>>>>>>> RAM in the PCI host bridge window returned by the ACPI _CRS method, see
> >>>>>>>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
> >>>>>>>> space").
> >>>>>>>>
> >>>>>>>> To work around this bug Linux excludes E820 reserved addresses when
> >>>>>>>> allocating addresses from the PCI host bridge window since 2010.
> >>>>>>>> ...
> >>>>>
> >>>>>>> I haven't seen anybody else eager to merge this, so I guess I'll stick
> >>>>>>> my neck out here.
> >>>>>>>
> >>>>>>> I applied this to my for-linus branch for v5.15.
> >>>>>>
> >>>>>> Thank you, and sorry about the build-errors which the lkp
> >>>>>> kernel-test-robot found.
> >>>>>>
> >>>>>> I've just send out a patch which fixes these build-errors
> >>>>>> (verified with both .config-s from the lkp reports).
> >>>>>> Feel free to squash this into the original patch (or keep
> >>>>>> them separate, whatever works for you).
> >>>>>
> >>>>> Thanks, I squashed the fix in.
> >>>>>
> >>>>> HOWEVER, I think it would be fairly risky to push this into v5.15.
> >>>>> We would be relying on the assumption that current machines have all
> >>>>> fixed the BIOS defect that 4dc2287c1805 addressed, and we have little
> >>>>> evidence for that.
> >>>>>
> >>>>> I'm not sure there's significant benefit to having this in v5.15.
> >>>>> Yes, the mainline v5.15 kernel would work on the affected machines,
> >>>>> but I suspect most people with those machines are running distro
> >>>>> kernels, not mainline kernels.
> >>>>
> >>>> I understand that you were reluctant to add this to 5.15 so close
> >>>> near the end of the 5.15 cycle, but can we please get this into
> >>>> 5.16 now ?
> >>>>
> >>>> I know you ultimately want to see if there is a better fix,
> >>>> but this is hitting a *lot* of users right now and if we come up
> >>>> with a better fix we can always use that to replace this one
> >>>> later.
> >>>
> >>> I don't know whether there's a "better" fix, but I do know that if we
> >>> merge what we have right now, nobody will be looking for a better
> >>> one.
> >>>
> >>> We're in the middle of the merge window, so the v5.16 development
> >>> cycle is over. The v5.17 cycle is just starting, so we have time to
> >>> hit that. Obviously a fix can be backported to older kernels as
> >>> needed.
> >>>
> >>>> So can we please just go with this fix now, so that we can
> >>>> fix the issues a lot of users are seeing caused by the current
> >>>> *wrong* behavior of taking the e820 reservations into account ?
> >>>
> >>> I think the fix on the table is "ignore E820 for BIOS date >= 2018"
> >>> plus the obvious parameters to force it both ways.
> >>
> >> Correct.
> >>
> >>> The thing I don't like is that this isn't connected at all to the
> >>> actual BIOS defect. We have no indication that current BIOSes have
> >>> fixed the defect,
> >>
> >> We also have no indication that that defect from 10 years ago, from
> >> pre UEFI firmware is still present in modern day UEFI firmware which
> >> is basically an entire different code-base.
> >>
> >> And even 10 years ago the problem was only happening to a single
> >> family of laptop models (Dell Precision laptops) so this clearly
> >> was a bug in that specific implementation and not some generic
> >> issue which is likely to be carried forward.
> >>
> >>> and we have no assurance that future ones will not
> >>> have the defect. It would be better if we had some algorithmic way of
> >>> figuring out what to do.
> >>
> >> You yourself have said that in hindsight taking E820 reservations
> >> into account for PCI bridge host windows was a mistake. So what
> >> the "ignore E820 for BIOS date >= 2018" is doing is letting the
> >> past be the past (without regressing on older models) while fixing
> >> that mistake on any hardware going forward.
> >>
> >> In the unlikely case that we hit that BIOS bug again on 1 or 2 models,
> >> we can simply DMI quirk those models, as we do for countless other
> >> BIOS issues.
> >>
> >>> Thank you very much for chasing down the dmesg log archive
> >>> (https://github.com/linuxhw/Dmesg; see
> >>> https://lore.kernel.org/r/82035130-d810-9f0b-259e-61280de1d81f@redhat.com).
> >>> Unfortunately I haven't had time to look through it myself, and I
> >>> haven't heard of anybody else doing it either.
> >>
> >> Right, I'm afraid that I already have spend way too much time on this
> >> myself. Note that I've been working with users on this bug on and off
> >> for over a year now.
> >>
> >> This is hitting many users and now that we have a viable fix, this
> >> really needs to be fixed now.
> >>
> >> I believe that the "ignore E820 for BIOS date >= 2018" fix is good
> >> enough and that you are letting perfect be the enemy of good here.
> >>
> >> As an upstream kernel maintainer myself, I'm sorry to say this,
> >> but if we don't get some fix for this merged soon you are leaving
> >> my no choice but to add my fix to the Fedora kernels as a downstream
> >> patch (and to advise other distros to do the same).
> >>
> >> Note that if you are still afraid of regressions going the downstream
> >> route is also an opportunity, Fedora will start testing moving users
> >> to 5.15.y soon, so I could add the patch to Fedora's 5.15.y builds and
> >> see how that goes ?
> >
> > So I've discussed this with the Fedora kernel maintainers and they have
> > agreed to add the patch to the Fedora 5.15 kernels, which we will ask
> > our users to start testing soon (we first run some voluntary testing
> > before eventually moving all users over).
> >
> > This will provide us with valuable feedback wrt this patch causing
> > regressions as you are worried about, or not.
> >
> > Assuming no regressions show up I hope that this will give you
> > some assurance that there the patch causes no regressions and that
> > you will then be willing to pick this up later during the 5.16
> > cycle so that Fedora only deviates from upstream for 1 cycle.
>
> 5.15.y kernels with this patch added have been in Fedora's
> stable updates repo for a while now without any reports of the
> regressions you feared this may cause.
>
> Bjorn, I hope that you are willing to merge this patch now that it has
> seen some more wide spread testing ?
I'm still not happy about the idea of basing this on BIOS dates. I
did this with 7bc5e3f2be32 ("x86/PCI: use host bridge _CRS info by
default on 2008 and newer machines"), and it was a mistake.
Because of that mistake, we now have the use_crs/nocrs kernel
parameters, which confuse users and lead to them being passed around
as "fixes" on random bulletin boards.
Adding another BIOS date check and use_e820/no_e820 kernel parameters
feels like it's layering on more complexity to cover up another major
mistake I made, 4dc2287c1805 ("x86: avoid E820 regions when allocating
address space").
I think it would be better for the code to recognize the situation
addressed by 4dc2287c1805 and deal with it directly. Is that
possible? I dunno; I don't think we've really tried.
Bjorn
Powered by blists - more mailing lists