Message-ID: <BN9PR11MB5276C09E743E99D92BB5C1B28C37A@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Thu, 13 Jul 2023 03:22:20 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jean-Philippe Brucker <jean-philippe@...aro.org>,
Baolu Lu <baolu.lu@...ux.intel.com>
CC: Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
"Robin Murphy" <robin.murphy@....com>,
Jason Gunthorpe <jgg@...pe.ca>,
Nicolin Chen <nicolinc@...dia.com>,
"Liu, Yi L" <yi.l.liu@...el.com>,
Jacob Pan <jacob.jun.pan@...ux.intel.com>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH 1/9] iommu: Move iommu fault data to linux/iommu.h
> From: Jean-Philippe Brucker <jean-philippe@...aro.org>
> Sent: Wednesday, July 12, 2023 5:34 PM
>
> On Wed, Jul 12, 2023 at 10:07:22AM +0800, Baolu Lu wrote:
> > > > +/**
> > > > + * struct iommu_fault_unrecoverable - Unrecoverable fault data
> > > > + * @reason: reason of the fault, from &enum iommu_fault_reason
> > > > + * @flags: parameters of this fault (IOMMU_FAULT_UNRECOV_* values)
> > > > + * @pasid: Process Address Space ID
> > > > + * @perm: requested permission access using by the incoming transaction
> > > > + * (IOMMU_FAULT_PERM_* values)
> > > > + * @addr: offending page address
> > > > + * @fetch_addr: address that caused a fetch abort, if any
> > > > + */
> > > > +struct iommu_fault_unrecoverable {
> > > > + __u32 reason;
> > > > +#define IOMMU_FAULT_UNRECOV_PASID_VALID (1 << 0)
> > > > +#define IOMMU_FAULT_UNRECOV_ADDR_VALID (1 << 1)
> > > > +#define IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID (1 << 2)
> > > > + __u32 flags;
> > > > + __u32 pasid;
> > > > + __u32 perm;
> > > > + __u64 addr;
> > > > + __u64 fetch_addr;
> > > > +};
> > >
> > > Currently there is no handler for unrecoverable faults.
>
> Yes those were meant for guest injection. Another goal was to replace
> report_iommu_fault(), which also passes unrecoverable faults to host
> drivers. Three drivers use that API:
> * usnic just prints the error, which could be done by the IOMMU driver,
> * remoteproc attempts to recover from the crash,
> * msm attempts to handle the fault, or at least recover from the crash.
I was not aware of them. Thanks for pointing them out.
>
> So the first one can be removed, and the others could move over to IOPF
> (which may need to indicate that the fault is not actually recoverable by
> the IOMMU) and return IOMMU_PAGE_RESP_INVALID.
Yep, presumably we should have just one interface for handling faults.
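To illustrate the "one interface" idea, here is a minimal userspace sketch, not taken from the patch series. Every name in it (`sketch_fault`, `sketch_resp`, `sketch_report_fault`, `sketch_crash_handler`) is hypothetical, and the `recoverable` flag is only an assumed way of indicating that the IOMMU cannot actually retry the transaction. A driver that merely wants to recover from a crash records the event and answers with an "invalid" response:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
enum sketch_resp {
	SKETCH_RESP_SUCCESS,	/* fault resolved, transaction may be retried */
	SKETCH_RESP_INVALID,	/* fault could not be resolved */
};

struct sketch_fault {
	uint64_t addr;		/* offending address */
	uint32_t perm;		/* requested access bits */
	bool recoverable;	/* false: the IOMMU cannot retry the access */
};

typedef enum sketch_resp (*sketch_fault_handler)(const struct sketch_fault *f,
						 void *data);

/*
 * A driver that never resolves page faults: it only notes that the
 * device crashed so that it can restart it later.
 */
static enum sketch_resp sketch_crash_handler(const struct sketch_fault *f,
					     void *data)
{
	bool *crashed = data;

	if (!f->recoverable)
		*crashed = true;
	return SKETCH_RESP_INVALID;
}

/* Single reporting path shared by recoverable and unrecoverable faults. */
static enum sketch_resp sketch_report_fault(sketch_fault_handler handler,
					    const struct sketch_fault *f,
					    void *data)
{
	if (!handler)
		return SKETCH_RESP_INVALID;
	return handler(f, data);
}
```

With a single path like this, the reporting side does not need a separate callback type for unrecoverable faults; the handler decides what to do based on the flag.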
>
> > >
> > > Both Intel/ARM register iommu_queue_iopf() as the device fault handler.
> > > It returns -EOPNOTSUPP for unrecoverable faults.
> > >
> > > In your series the common iommu_handle_io_pgfault() also only works
> > > for PRQ.
> > >
> > > It kind of suggests the above definitions are dead code, though arm-smmu-v3
> > > does attempt to set them.
> > >
> > > It's probably the right time to remove them.
> > >
> > > In the future even if there might be a need of forwarding unrecoverable
> > > faults to the user via iommufd, fault reasons reported by the physical
> > > IOMMU don't make any sense to the guest.
>
> I guess it depends on the architecture? The SMMU driver can report only
> stage-1 faults through iommu_report_device_fault(), which are faults due
> to a guest misconfiguring the tables assigned to it. At the moment
> arm_smmu_handle_evt() only passes down stage-1 page table errors, the rest
> is printed by the host.
In that case the kernel just needs to notify the vIOMMU that an error happened,
along with the access permissions (r/w/e/p). The vIOMMU can figure out the reason
itself by walking the stage-1 page table. It will likely find the same reason
as the host reports, but that sounds like a conceptually clearer path.
>
> > > Presumably the vIOMMU
> > > should walk guest configurations to set a fault reason which makes sense
> > > from guest p.o.v.
> >
> > I am fine to remove unrecoverable faults data. But it was added by Jean,
> > so I'd like to know his opinion on this.
>
> Passing errors to the guest could be a useful diagnostics tool for
> debugging, once the guest gets more controls over the IOMMU hardware, but
> it doesn't have a purpose beyond that. It could be the only tool
> available, though: to avoid a guest voluntarily flooding the host logs by
> misconfiguring its tables, we may have to disable printing in the host
> errors that come from guest misconfiguration, in which case there won't be
> any diagnostics available for guest bugs.
>
> For now I don't mind if they're removed, if there is an easy way to
> reintroduce them later.
>
We can keep whatever is required by the kernel drivers that want to know
about the fault.
But anything invented for the old uAPI (e.g. fault_reason) should be removed
and redefined later, when support for the user is introduced.