Message-ID: <88289f74-3d4f-4dd9-8f2a-8871d150fd50@linux.ibm.com>
Date: Tue, 27 Jan 2026 13:44:11 -0800
From: Farhan Ali <alifm@...ux.ibm.com>
To: Niklas Schnelle <schnelle@...ux.ibm.com>, linux-s390@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org
Cc: helgaas@...nel.org, lukas@...ner.de, alex@...zbot.org, clg@...hat.com,
stable@...r.kernel.org, mjrosato@...ux.ibm.com, julianr@...ux.ibm.com
Subject: Re: [PATCH v8 7/9] vfio-pci/zdev: Add a device feature for error information
On 1/27/2026 2:53 AM, Niklas Schnelle wrote:
> On Thu, 2026-01-22 at 11:44 -0800, Farhan Ali wrote:
>> For zPCI devices, we have platform specific error information. The platform
>> firmware provides this error information to the operating system in an
>> architecture specific mechanism. To enable recovery from userspace for
>> these devices, we want to expose this error information to userspace. Add a
>> new device feature to expose this information.
>>
>> Signed-off-by: Farhan Ali <alifm@...ux.ibm.com>
>> ---
>> drivers/vfio/pci/vfio_pci_core.c | 2 ++
>> drivers/vfio/pci/vfio_pci_priv.h | 9 ++++++++
>> drivers/vfio/pci/vfio_pci_zdev.c | 35 ++++++++++++++++++++++++++++++++
>> include/uapi/linux/vfio.h | 16 +++++++++++++++
>> 4 files changed, 62 insertions(+)
>>
> --- snip ---
>>
>> +int vfio_pci_zdev_feature_err(struct vfio_device *device, u32 flags,
>> + void __user *arg, size_t argsz)
>> +{
>> + struct vfio_device_feature_zpci_err err;
>> + struct vfio_pci_core_device *vdev;
>> + struct zpci_dev *zdev;
>> + int head = 0;
>> + int ret;
>> +
>> + vdev = container_of(device, struct vfio_pci_core_device, vdev);
>> + zdev = to_zpci(vdev->pdev);
>> + if (!zdev)
>> + return -ENODEV;
>> +
>> + ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET,
>> + sizeof(err));
>> + if (ret != 1)
>> + return ret;
>> +
>> + mutex_lock(&zdev->pending_errs_lock);
>> + if (zdev->pending_errs.count) {
>> + head = zdev->pending_errs.head % ZPCI_ERR_PENDING_MAX;
>> + err.pec = zdev->pending_errs.err[head].pec;
> In the previous patch you saved the entire struct zpci_ccdf_err now you
> only copy out and expose the PCI event code, though? If you do want to
> only expose that the commit message should state this and the reason
> for this restriction. Additionally I think the struct
> vfio_device_feature_zpci_err should include a mechanism (version +
> size?) to allow upgrading it to the full error information in the
> future.
I think having an explicit version variable in the struct should be
sufficient (the __pad can be replaced with version). I don't think we
need an explicit size variable? I have looked at some of the capability
structures in vfio_zdev.h as examples, so we could use a similar
approach here when we need to extend vfio_device_feature_zpci_err.
Though I don't see any other vfio device feature structure being
explicitly versioned. I am open to any guidance/suggestions on best
practices for versioning VFIO device feature structs.
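As a rough sketch of the version-only approach (this is not the layout
from the patch: the field widths, the ZPCI_ERR_VERSION_1 name and both
helpers are assumptions made for illustration), the pad byte could become
a version that userspace checks before reading the remaining fields:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical v1 layout; field widths are assumed, not taken from the patch. */
struct zpci_err_v1 {
	uint8_t  version;        /* takes the place of the former pad byte */
	uint8_t  pending_errors; /* errors still queued after this one */
	uint16_t pec;            /* PCI event code */
};

#define ZPCI_ERR_VERSION_1 1

/* The size must stay fixed for a given version. */
_Static_assert(sizeof(struct zpci_err_v1) == 4, "unexpected v1 size");

/* Producer side: zero the whole record, then fill in the v1 fields. */
static void zpci_err_fill_v1(struct zpci_err_v1 *out, uint16_t pec,
			     uint8_t pending)
{
	memset(out, 0, sizeof(*out));
	out->version = ZPCI_ERR_VERSION_1;
	out->pec = pec;
	out->pending_errors = pending;
}

/*
 * Consumer side: accept any version >= 1, on the assumption that later
 * versions only append fields. Returns 0 on success, -1 otherwise.
 */
static int zpci_err_read_pec(const struct zpci_err_v1 *in, uint16_t *pec)
{
	if (in->version < ZPCI_ERR_VERSION_1)
		return -1;
	*pec = in->pec;
	return 0;
}
```

With this scheme a newer kernel can bump the version and append fields,
while older userspace keeps reading only the v1 prefix.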
>
> Then again why not just expose the entire CCDF? It's an architected
> data structure without and if you add it at the end of struct
> vfio_device_feature_zpci_err and add a size you should even be able to
> handle if it ever needs to grow. Of course you'd have to create a copy
> of the struct to use the uAPI types so I'd probably also add a
> BUILD_BUG_ON() check on matching size. Or am I missing a reason to keep
> just the PEC?
I wanted to keep the information exposed to userspace minimal. The CCDF
exposes far more information, which may not be needed by userspace/VM.
Today the PEC is sufficient for userspace (QEMU) to bubble the error up
to a VM. I also wanted to avoid having a copy of the struct in 2 places.
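For reference, the size check suggested for the copy-the-struct option
could look roughly like the following. Both structs and all field names
are hypothetical stand-ins, not the real CCDF layout, and _Static_assert
is used here as the plain C counterpart of the kernel's BUILD_BUG_ON():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel-internal error record. */
struct internal_err {
	uint32_t word0;
	uint32_t word1;
	uint32_t word2;
	uint16_t pec;
	uint16_t half3;
};

/* Hypothetical uAPI mirror of the same record, kept in a second header. */
struct uapi_err {
	uint32_t word0;
	uint32_t word1;
	uint32_t word2;
	uint16_t pec;
	uint16_t half3;
};

/* Break the build if the two copies ever drift apart. */
_Static_assert(sizeof(struct uapi_err) == sizeof(struct internal_err),
	       "uAPI copy and internal struct differ in size");
_Static_assert(offsetof(struct uapi_err, pec) ==
	       offsetof(struct internal_err, pec),
	       "pec moved in one of the two copies");
```

The check catches size or offset drift, but the two definitions still
have to be maintained side by side.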
Thanks
Farhan
>> + zdev->pending_errs.head++;
>> + zdev->pending_errs.count--;
>> + err.pending_errors = zdev->pending_errs.count;
>> + }
>> + mutex_unlock(&zdev->pending_errs_lock);
>> +
>> + if (copy_to_user(arg, &err, sizeof(err)))
>> + return -EFAULT;
>> +
>> + return 0;
>> +}
>> +
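The dequeue logic in the quoted hunk can be modelled in plain C as below.
This is a userspace sketch under stated assumptions: the names, the queue
depth of 4 and the push helper are invented for illustration, and locking
is left out. One difference is deliberate: the output record is zeroed
first, because in the hunk as quoted err is only assigned inside the
if-branch before being passed to copy_to_user().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ERR_PENDING_MAX 4 /* hypothetical; stands in for ZPCI_ERR_PENDING_MAX */

struct err_ring {
	uint16_t pec[ERR_PENDING_MAX];
	unsigned int head;  /* free-running index of the oldest entry */
	unsigned int count; /* number of queued entries */
};

struct err_record {
	uint16_t pec;
	uint8_t pending_errors;
	uint8_t pad;
};

/* Assumed enqueue side: returns 0 on success, -1 if the ring is full. */
static int err_ring_push(struct err_ring *r, uint16_t pec)
{
	if (r->count == ERR_PENDING_MAX)
		return -1;
	r->pec[(r->head + r->count) % ERR_PENDING_MAX] = pec;
	r->count++;
	return 0;
}

/*
 * Dequeue the oldest error into *out. The record is zeroed first so an
 * empty ring yields all zeroes. Returns 1 if an entry was dequeued, else 0.
 */
static int err_ring_pop(struct err_ring *r, struct err_record *out)
{
	memset(out, 0, sizeof(*out));
	if (!r->count)
		return 0;
	out->pec = r->pec[r->head % ERR_PENDING_MAX];
	r->head++;
	r->count--;
	out->pending_errors = (uint8_t)r->count;
	return 1;
}
```

Each pop returns the oldest PEC together with the number of errors still
queued, so a caller can keep reading until pending_errors reaches zero.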