Message-ID: <20220615151100.GA937185@bhelgaas>
Date: Wed, 15 Jun 2022 10:11:00 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Keith Busch <kbusch@...nel.org>
Cc: "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
Hans de Goede <hdegoede@...hat.com>,
"Rafael J . Wysocki" <rafael@...nel.org>,
Mika Westerberg <mika.westerberg@...ux.intel.com>,
Krzysztof Wilczyński <kw@...ux.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Myron Stowe <myron.stowe@...hat.com>,
Juha-Pekka Heikkila <juhapekka.heikkila@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H . Peter Anvin" <hpa@...or.com>,
Benoit Grégoire <benoitg@...us.ca>,
Hui Wang <hui.wang@...onical.com>, linux-acpi@...r.kernel.org,
linux-pci@...r.kernel.org, x86@...nel.org,
linux-kernel@...r.kernel.org, Jens Axboe <axboe@...com>,
Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>,
linux-nvme@...ts.infradead.org
Subject: Re: [PATCH] x86/PCI: Revert: "Clip only host bridge windows for E820
regions"
On Tue, Jun 14, 2022 at 04:47:35PM -0700, Keith Busch wrote:
> On Tue, Jun 14, 2022 at 06:01:28PM -0500, Bjorn Helgaas wrote:
> > [+cc NVMe folks]
> >
> > On Tue, Jun 14, 2022 at 07:49:27PM -0300, Guilherme G. Piccoli wrote:
> > > On 14/06/2022 12:47, Hans de Goede wrote:
> > > > [...]
> > > >
> > > > Have you looked at the log of the failed boot in the Steam Deck kernel
> > > > bugzilla? Everything there seems to work just fine and then the system
> > > > just hangs. I think that maybe it cannot find its root disk, so maybe
> > > > an NVME issue ?
> > >
> > > *Exactly* that - NVMe device is the root disk, it cannot boot since the
> > > device doesn't work, hence no rootfs =)
> >
> > Beginning of thread: https://lore.kernel.org/r/20220612144325.85366-1-hdegoede@redhat.com
> >
> > Steam Deck broke because we erroneously trimmed out the PCI host
> > bridge window where BIOS had placed most devices, successfully
> > reassigned all the PCI bridge windows and BARs, but some devices,
> > apparently including NVMe, didn't work at the new addresses.
> >
> > Do you NVMe folks know of gotchas in this area? I want to know
> > because we'd like to be able to move devices around someday to
> > make room for hot-added devices.
> >
> > This reassignment happened before drivers claimed the devices, so
> > from a PCI point of view, I don't know why the NVMe device
> > wouldn't work at the new address.
>
> The probe status quickly returns ENODEV. Based on the output (we
> don't log much, so this is just an educated guess), I think that
> means the driver read all F's from the status register, which
> indicates we can't read it when using the reassigned memory window.
>
> Why changing memory windows may not work tends to be platform or
> device specific. Considering the renumbered windows didn't cause a
> problem for other devices, it sounds like this nvme device may be
> broken.

It sounds like you've seen this sort of problem before, so we
shouldn't assume that it's safe to reassign BARs.

I think Windows supports rebalancing, but it does look like drivers
have the ability to veto it:

  https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/stopping-a-device-to-rebalance-resources
  https://docs.microsoft.com/en-us/windows-hardware/drivers/wdf/the-pnp-manager-redistributes-system-resources

So I suppose if/when we support rebalancing, it'll have to be an
opt-in thing for each driver.
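
For reference, the "all F's" symptom Keith describes above can be
sketched roughly as follows. This is a minimal user-space illustration,
not the NVMe driver's code: check_status_read() is a hypothetical
helper, and it assumes the common PCI behavior that a read which no
device claims completes with every bit set.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Assumption: a 32-bit MMIO read from a BAR that the device does not
 * actually decode (for example, after a reassignment the device did
 * not honor) typically returns 0xffffffff. A probe routine can treat
 * that value in a status register as "device not reachable".
 */
static int check_status_read(uint32_t status)
{
	if (status == UINT32_MAX)
		return -ENODEV;	/* all F's: register is not readable */

	return 0;		/* any other value: the read reached the device */
}
```

A probe path would call this on the first status-register read and
bail out on a negative return, which would match the quick ENODEV
seen in the failed boot log.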