Message-ID: <20250825153501.3a1d0f0c.alex.williamson@redhat.com>
Date: Mon, 25 Aug 2025 15:35:01 -0600
From: Alex Williamson <alex.williamson@...hat.com>
To: Farhan Ali <alifm@...ux.ibm.com>
Cc: linux-s390@...r.kernel.org, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, helgaas@...nel.org, schnelle@...ux.ibm.com,
mjrosato@...ux.ibm.com
Subject: Re: [PATCH v2 1/9] PCI: Avoid restoring error values in config space

On Mon, 25 Aug 2025 10:12:18 -0700
Farhan Ali <alifm@...ux.ibm.com> wrote:
> The current reset process saves the device's config space state before
> reset and restores it afterward. However, when a device is in an error
> state before reset, config space reads may return error values instead of
> valid data. This results in saving corrupted values that get written back
> to the device during state restoration. Add validation to prevent writing
> error values to the device when restoring the config space state after
> reset.
>
> Signed-off-by: Farhan Ali <alifm@...ux.ibm.com>
> ---
> drivers/pci/pci.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b0f4d98036cd..0dd95d782022 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1825,6 +1825,9 @@ static void pci_restore_config_dword(struct pci_dev *pdev, int offset,
>  	if (!force && val == saved_val)
>  		return;
>
> +	if (PCI_POSSIBLE_ERROR(saved_val))
> +		return;
> +
>  	for (;;) {
>  		pci_dbg(pdev, "restore config %#04x: %#010x -> %#010x\n",
>  			offset, val, saved_val);

The commit log makes this sound like more than it is. We're really
only error checking the first 64 bytes of config space before restore;
the capabilities are not checked. I suppose skipping the BARs and
whatnot is no worse than writing -1 to them, but this is only a
complete solution in the narrow case where we're relying on vfio-pci to
come in and restore the pre-open device state.
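
To make the scope concrete, here is a rough user-space model of what
the added check amounts to. It is not kernel code: struct fake_dev,
restore_header() and POSSIBLE_ERROR32() are hypothetical names, and the
macro only approximates what PCI_POSSIBLE_ERROR() does for a 32-bit
value (compare against all ones). The model covers the 16 header dwords
only; saved capability state goes through other restore paths that this
check never sees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Approximation of the all-ones error-response test for a 32-bit value. */
#define POSSIBLE_ERROR32(val)	((uint32_t)(val) == UINT32_MAX)

#define HEADER_DWORDS	16	/* first 64 bytes of config space */

/* Hypothetical stand-in for a device: live header plus the saved copy. */
struct fake_dev {
	uint32_t live[HEADER_DWORDS];
	uint32_t saved[HEADER_DWORDS];
};

/*
 * Restore the saved header, skipping any dword whose saved value looks
 * like an error response. Returns the number of dwords actually written.
 */
int restore_header(struct fake_dev *dev)
{
	int written = 0;

	assert(dev != NULL);

	for (int i = 0; i < HEADER_DWORDS; i++) {
		if (dev->live[i] == dev->saved[i])
			continue;	/* already matches, nothing to write */
		if (POSSIBLE_ERROR32(dev->saved[i]))
			continue;	/* saved value is suspect, skip it */
		dev->live[i] = dev->saved[i];
		written++;
	}
	return written;
}
```

A dword saved as all ones is left at whatever the device holds after
reset, which is the "skipping the BARs" behavior described above.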

I had imagined that pci_save_state() might detect the error state of
the device and avoid setting state_saved, while we'd still perform the
restore callouts that rely only on internal kernel state, maybe adding a
fallback to restore the BARs from resource information.
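
Very roughly, and again only as a user-space model rather than a
proposed patch, the idea looks like the sketch below. Everything in it
is hypothetical: struct model_dev, model_save_state(),
model_restore_state() and the resource_start[] array standing in for the
kernel's resource information. Using an all-ones vendor/device ID dword
as the "device is in error" test is also an assumption made for the
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_DWORDS	16
#define NUM_BARS	6
#define BAR0_DWORD	4	/* BAR0 sits at config offset 0x10 */

/* Hypothetical device model: live header, saved copy, kernel-side BAR view. */
struct model_dev {
	uint32_t cfg[HEADER_DWORDS];
	uint32_t saved[HEADER_DWORDS];
	bool state_saved;
	uint32_t resource_start[NUM_BARS];
};

/* Save the header, but refuse to mark it saved if the device looks dead. */
bool model_save_state(struct model_dev *dev)
{
	assert(dev != NULL);

	/* Assumption: an all-ones ID dword means reads are failing. */
	if (dev->cfg[0] == UINT32_MAX) {
		dev->state_saved = false;
		return false;
	}
	for (int i = 0; i < HEADER_DWORDS; i++)
		dev->saved[i] = dev->cfg[i];
	dev->state_saved = true;
	return true;
}

/* Restore from the saved copy if valid, else fall back to resource info. */
void model_restore_state(struct model_dev *dev)
{
	assert(dev != NULL);

	if (dev->state_saved) {
		for (int i = 0; i < HEADER_DWORDS; i++)
			dev->cfg[i] = dev->saved[i];
		dev->state_saved = false;
		return;
	}
	/* Fallback: rebuild only the BARs (flag bits ignored in this model). */
	for (int i = 0; i < NUM_BARS; i++)
		dev->cfg[BAR0_DWORD + i] = dev->resource_start[i];
}
```

The point of the split is that nothing read from a failing device is
ever written back; only state the kernel already holds is used.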

This implementation serves a purpose, but the commit log should
describe the specific, narrow scenario this solves, and the code should
probably also get a comment explaining why we're not consistently
checking the saved state for errors. Thanks,

Alex