[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <eb6d05d0-b448-4f4e-a734-50c56078dd9b@linux.ibm.com>
Date: Mon, 25 Aug 2025 15:13:00 -0700
From: Farhan Ali <alifm@...ux.ibm.com>
To: Alex Williamson <alex.williamson@...hat.com>
Cc: linux-s390@...r.kernel.org, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, helgaas@...nel.org,
schnelle@...ux.ibm.com, mjrosato@...ux.ibm.com
Subject: Re: [PATCH v2 1/9] PCI: Avoid restoring error values in config space
On 8/25/2025 2:35 PM, Alex Williamson wrote:
> On Mon, 25 Aug 2025 10:12:18 -0700
> Farhan Ali <alifm@...ux.ibm.com> wrote:
>
>> The current reset process saves the device's config space state before
>> reset and restores it afterward. However, when a device is in an error
>> state before reset, config space reads may return error values instead of
>> valid data. This results in saving corrupted values that get written back
>> to the device during state restoration. Add validation to prevent writing
>> error values to the device when restoring the config space state after
>> reset.
>>
>> Signed-off-by: Farhan Ali <alifm@...ux.ibm.com>
>> ---
>> drivers/pci/pci.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index b0f4d98036cd..0dd95d782022 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1825,6 +1825,9 @@ static void pci_restore_config_dword(struct pci_dev *pdev, int offset,
>> if (!force && val == saved_val)
>> return;
>>
>> + if (PCI_POSSIBLE_ERROR(saved_val))
>> + return;
>> +
>> for (;;) {
>> pci_dbg(pdev, "restore config %#04x: %#010x -> %#010x\n",
>> offset, val, saved_val);
>
> The commit log makes this sound like more than it is. We're really
> only error checking the first 64 bytes of config space before restore,
> the capabilities are not checked. I suppose skipping the BARs and
> whatnot is no worse than writing -1 to them, but this is only a
> complete solution in the narrow case where we're relying on vfio-pci to
> come in and restore the pre-open device state.
>
> I had imagined that pci_save_state() might detect the error state of
> the device, avoid setting state_saved, but we'd still perform the
> restore callouts that only rely on internal kernel state, maybe adding a
> fallback to restore the BARs from resource information.
I initially started with pci_save_state(), and avoid saving the state
altogether. But that would mean we don't go restore the msix state and
for s390 don't call arch_restore_msi_irqs(). Do you prefer to avoid
saving the state at all? This change was small and sufficient enough to
avoid breaking the device in my testing.
>
> This implementation serves a purpose, but the commit log should
> describe the specific, narrow scenario this solves, and probably also
> add a comment in the code about why we're not consistently checking the
> saved state for errors. Thanks,
>
> Alex
Yes, I can re-word the commit message.
Thanks
Farhan
Powered by blists - more mailing lists