linux-kernel - Re: [PATCH v8 06/10] iommufd: Add iommufd fault object

Message-ID: <309e37d1-6066-4ba1-b30c-402a3c3e7c76@linux.intel.com>
Date: Thu, 4 Jul 2024 10:59:45 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Nicolin Chen <nicolinc@...dia.com>
Cc: baolu.lu@...ux.intel.com, Jason Gunthorpe <jgg@...pe.ca>,
 Kevin Tian <kevin.tian@...el.com>, Joerg Roedel <joro@...tes.org>,
 Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
 Jean-Philippe Brucker <jean-philippe@...aro.org>, Yi Liu
 <yi.l.liu@...el.com>, Jacob Pan <jacob.jun.pan@...ux.intel.com>,
 Joel Granados <j.granados@...sung.com>, iommu@...ts.linux.dev,
 virtualization@...ts.linux-foundation.org, linux-kernel@...r.kernel.org,
 Jason Gunthorpe <jgg@...dia.com>
Subject: Re: [PATCH v8 06/10] iommufd: Add iommufd fault object

On 7/4/24 7:06 AM, Nicolin Chen wrote:
> Hi Baolu,

Hi Nicolin,

> On Tue, Jul 02, 2024 at 02:34:40PM +0800, Lu Baolu wrote:
> 
>> An iommufd fault object provides an interface for delivering I/O page
>> faults to user space. These objects are created and destroyed by user
>> space, and they can be associated with or dissociated from hardware page
>> table objects during page table allocation or destruction.
>>
>> User space interacts with the fault object through a file interface. This
>> interface offers a straightforward and efficient way for user space to
>> handle page faults. It allows user space to read fault messages
>> sequentially and respond to them by writing to the same file. The file
>> interface supports reading messages in poll mode, so it's recommended that
>> user space applications use io_uring to enhance read and write efficiency.
>>
>> A fault object can be associated with any iopf-capable iommufd_hw_pgtable
>> during the pgtable's allocation. All I/O page faults triggered by devices
>> when accessing the I/O addresses of an iommufd_hw_pgtable are routed
>> through the fault object to user space. Similarly, user space's responses
>> to these page faults are routed back to the iommu device driver through
>> the same fault object.
>
> There is a need for a VIOMMU object to report HW faults to the VMM.
> For example, a HW-accelerated VCMDQ may encounter HW errors. The HW
> will raise an IRQ to the host kernel, and the host kernel will
> forward it to the guest. I think we can have a viommu->fault,
> similar to the hwpt->fault introduced by this series. This
> viommu->fault can also benefit nested IOMMU by reporting
> translation errors.
> 
> I learned that this hwpt->fault is exclusively for IOPF/PRI, and
> Jason suggested that I add a different one for VIOMMU. Yet, after
> taking a closer look, I found that the fault object in this series
> is seemingly quite generic at the uAPI level (its naming/structure,
> and the way it's allocated and passed to the hwpt), despite being
> highly correlated with IOPF in its fops code. So, I feel that we
> might have a chance of reusing it for different fault types:
> 
> +enum iommu_fault_type {
> +	IOMMU_FAULT_TYPE_HWPT_IOPF,
> +	IOMMU_FAULT_TYPE_VIOMMU_IRQ,
> +};
> 
>   struct iommu_fault_alloc {
>   	__u32 size;
>   	__u32 flags;
> +	__u32 type;  /* enum iommu_fault_type */
>   	__u32 out_fault_id;
>   	__u32 out_fault_fd;
>   };
> 
> I understand that this is already v8. So, maybe we can, for now,
> apply the small diff above with an IOMMU_FAULT_TYPE_HWPT_IOPF type
> check in the ioctl handler. And a decoupling for the iopf fops in
> the ioctl handler can come later in the viommu series:
> 	switch (type) {
> 	case IOMMU_FAULT_TYPE_HWPT_IOPF:
> 		filep = anon_inode_getfile("[iommufd-pgfault]",
> 					   &iommufd_fault_fops_iopf);
> 		break;
> 	case IOMMU_FAULT_TYPE_VIOMMU_IRQ:
> 		filep = anon_inode_getfile("[iommufd-viommu-irq]",
> 					   &iommufd_fault_fops_viommu);
> 		break;
> 	default:
> 		return -EOPNOTSUPP;
> 	}
> 
> Since you are the designer here, I think you have a better
> 10,000-foot view -- maybe I am missing something here implying that
> the fault object can't really be reused by viommu.
> 
> Would you mind sharing some thoughts here?

I think this is a choice between "two different objects" and "the same
object with different FD interfaces". If I understand correctly, your
proposal for unrecoverable fault delivery is not limited to vcmdq, but
is generic to all unrecoverable events that userspace should be aware
of when the passed-through device is affected.
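
To make the two options concrete, here is a rough user-space sketch in
plain C. It is not kernel code and not part of this series: every name
below (fault_fops, fault_obj, iopf_fault_obj, viommu_event_obj and the
init helpers) is hypothetical, and struct fault_fops is only a stand-in
for struct file_operations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical fault types, mirroring the enum proposed in the quote. */
enum fault_type { FAULT_TYPE_HWPT_IOPF, FAULT_TYPE_VIOMMU_IRQ };

/* Stand-in for struct file_operations; only carries a label here. */
struct fault_fops { const char *name; };

static const struct fault_fops fops_iopf = { "[iommufd-pgfault]" };
static const struct fault_fops fops_viommu = { "[iommufd-viommu-irq]" };

/* Option A: one object type, the FD interface is chosen by 'type'. */
struct fault_obj {
	enum fault_type type;
	const struct fault_fops *fops;
};

static int fault_obj_init(struct fault_obj *f, unsigned int type)
{
	assert(f != NULL);

	switch (type) {
	case FAULT_TYPE_HWPT_IOPF:
		f->fops = &fops_iopf;
		break;
	case FAULT_TYPE_VIOMMU_IRQ:
		f->fops = &fops_viommu;
		break;
	default:
		return -EOPNOTSUPP; /* unknown type is rejected */
	}
	f->type = (enum fault_type)type;
	return 0;
}

/* Option B: two object types, each hard-wired to its own interface. */
struct iopf_fault_obj { const struct fault_fops *fops; };
struct viommu_event_obj { const struct fault_fops *fops; };

static void iopf_fault_obj_init(struct iopf_fault_obj *f)
{
	f->fops = &fops_iopf;
}

static void viommu_event_obj_init(struct viommu_event_obj *e)
{
	e->fops = &fops_viommu;
}
```

With option A the allocation path is shared and only the fops differ;
with option B nothing is shared, but neither object needs a type check.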

From a hardware architecture perspective, the interfaces for
unrecoverable events don't always match those for page faults. For
example, the VT-d architecture defines a PR queue for page faults, but
uses a register set to report unrecoverable events. The 'reason',
'request id' and 'pasid' fields of the register set indicate what
happened on the hardware. New unrecoverable events will not be reported
until the previous one has been fetched.
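
The difference in delivery semantics can be illustrated with a small
model in plain C. This is a simplified sketch of the behaviour described
above, not the VT-d register layout: struct fault_event, struct pr_queue,
struct fault_reg, PR_QUEUE_DEPTH and all helper names are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event record; the fields follow the text above only. */
struct fault_event {
	uint32_t reason;
	uint32_t request_id;
	uint32_t pasid;
};

#define PR_QUEUE_DEPTH 4 /* arbitrary depth for the model */

/* Queue model: several page requests can be pending at once. */
struct pr_queue {
	struct fault_event ring[PR_QUEUE_DEPTH];
	unsigned int head; /* next entry to consume */
	unsigned int tail; /* next free slot */
};

static bool pr_queue_push(struct pr_queue *q, struct fault_event ev)
{
	assert(q != NULL);
	if (q->tail - q->head == PR_QUEUE_DEPTH)
		return false; /* queue full */
	q->ring[q->tail++ % PR_QUEUE_DEPTH] = ev;
	return true;
}

static bool pr_queue_pop(struct pr_queue *q, struct fault_event *out)
{
	assert(q != NULL && out != NULL);
	if (q->head == q->tail)
		return false; /* queue empty */
	*out = q->ring[q->head++ % PR_QUEUE_DEPTH];
	return true;
}

/* Register model: a single latched event, fetched before the next. */
struct fault_reg {
	struct fault_event ev;
	bool pending;
};

static bool fault_reg_report(struct fault_reg *r, struct fault_event ev)
{
	assert(r != NULL);
	if (r->pending)
		return false; /* previous event has not been fetched yet */
	r->ev = ev;
	r->pending = true;
	return true;
}

static bool fault_reg_fetch(struct fault_reg *r, struct fault_event *out)
{
	assert(r != NULL && out != NULL);
	if (!r->pending)
		return false;
	*out = r->ev;
	r->pending = false; /* fetching re-arms the register */
	return true;
}
```

In this model a reader of the queue sees every event in order, while a
reader of the register has to fetch the current event before a later one
can be reported at all.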

That said, I have no strong preference between these two choices. Jason
and Kevin should have more insight.

Thanks,
baolu