[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB52763FAD3E7545CC26C0DE908C409@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Thu, 8 Sep 2022 09:30:57 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>,
Robin Murphy <robin.murphy@....com>,
Joerg Roedel <joro@...tes.org>
CC: Nicolin Chen <nicolinc@...dia.com>,
"will@...nel.org" <will@...nel.org>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"suravee.suthikulpanit@....com" <suravee.suthikulpanit@....com>,
"marcan@...can.st" <marcan@...can.st>,
"sven@...npeter.dev" <sven@...npeter.dev>,
"alyssa@...enzweig.io" <alyssa@...enzweig.io>,
"robdclark@...il.com" <robdclark@...il.com>,
"dwmw2@...radead.org" <dwmw2@...radead.org>,
"baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
"mjrosato@...ux.ibm.com" <mjrosato@...ux.ibm.com>,
"gerald.schaefer@...ux.ibm.com" <gerald.schaefer@...ux.ibm.com>,
"orsonzhai@...il.com" <orsonzhai@...il.com>,
"baolin.wang@...ux.alibaba.com" <baolin.wang@...ux.alibaba.com>,
"zhang.lyra@...il.com" <zhang.lyra@...il.com>,
"thierry.reding@...il.com" <thierry.reding@...il.com>,
"vdumpa@...dia.com" <vdumpa@...dia.com>,
"jonathanh@...dia.com" <jonathanh@...dia.com>,
"jean-philippe@...aro.org" <jean-philippe@...aro.org>,
"cohuck@...hat.com" <cohuck@...hat.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"shameerali.kolothum.thodi@...wei.com"
<shameerali.kolothum.thodi@...wei.com>,
"thunder.leizhen@...wei.com" <thunder.leizhen@...wei.com>,
"christophe.jaillet@...adoo.fr" <christophe.jaillet@...adoo.fr>,
"yangyingliang@...wei.com" <yangyingliang@...wei.com>,
"jon@...id-run.com" <jon@...id-run.com>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"asahi@...ts.linux.dev" <asahi@...ts.linux.dev>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-arm-msm@...r.kernel.org" <linux-arm-msm@...r.kernel.org>,
"linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: RE: [PATCH v6 1/5] iommu: Return -EMEDIUMTYPE for incompatible domain
and device/group
> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Thursday, September 8, 2022 8:43 AM
>
> On Wed, Sep 07, 2022 at 08:41:13PM +0100, Robin Murphy wrote:
>
> > Again, not what I was suggesting. In fact the nature of
> iommu_attach_group()
> > already rules out bogus devices getting this far, so all a driver currently
> > has to worry about is compatibility of a device that it definitely probed
> > with a domain that it definitely allocated. Therefore, from a caller's point
> > of view, if attaching to an existing domain returns -EINVAL, try another
> > domain; multiple different existing domains can be tried, and may also
> > return -EINVAL for the same or different reasons; the final attempt is to
> > allocate a fresh domain and attach to that, which should always be
> nominally
> > valid and *never* return -EINVAL. If any attempt returns any other error,
> > bail out down the usual "this should have worked but something went
> wrong"
> > path. Even if any driver did have a nonsensical "nothing went wrong, I just
> > can't attach my device to any of my domains" case, I don't think it would
> > really need distinguishing from any other general error anyway.
>
> The algorithm you described is exactly what this series does, it just
> used EMEDIUMTYPE instead of EINVAL. Changing it to EINVAL is not a
> fundamental problem, just a bit more work.
>
> Looking at Nicolin's series there is a bunch of existing errnos that
> would still need converting, ie EXDEV, EBUSY, EOPNOTSUPP, EFAULT, and
> ENXIO are all returned as codes for 'domain incompatible with device'
> in various drivers. So the patch would still look much the same, just
> changing them to EINVAL instead of EMEDIUMTYPE.
>
> That leaves the question of the remaining EINVAL's that Nicolin did
> not convert to EMEDIUMTYPE.
>
> eg in the AMD driver:
>
> if (!check_device(dev))
> return -EINVAL;
>
> iommu = rlookup_amd_iommu(dev);
> if (!iommu)
> return -EINVAL;
>
> These are all cases of 'something is really wrong with the device or
> iommu, everything will fail'. Other drivers are using ENODEV for this
> already, so we'd probably have an additional patch changing various
> places like that to ENODEV.
>
> This mixture of error codes is the basic reason why a new code was
> used, because none of the existing codes are used with any
> consistency.
btw I saw the policy for -EBUSY is also not consistent in this series.
while it's correct to change -EBUSY to -EMEDIUMTYPE for omap, assuming
that retrying another fresh domain for the said device should work:
if (omap_domain->dev) {
- dev_err(dev, "iommu domain is already attached\n");
- ret = -EBUSY;
+ ret = -EMEDIUMTYPE;
goto out;
}
the change in tegra-gart doesn't sound correct:
if (gart->active_domain && gart->active_domain != domain) {
- ret = -EBUSY;
+ ret = -EMEDIUMTYPE;
one device cannot be attached to two domains. This fact doesn't change
no matter how many domains are tried. In concept this check is
redundant and should have been done by iommu core, but obviously we
didn't pay attention to what -EBUSY actually represents in this path.
>
> But OK, I'm on board, lets use more common errnos with specific
> meaning, that can be documented in a comment someplace:
> ENOMEM - out of memory
> ENODEV - no domain can attach, device or iommu is messed up
> EINVAL - the domain is incompatible with the device
> <others> - Same behavior as ENODEV, use is discouraged.
There are also cases where common kAPIs are called in the attach
path which may return -EINVAL and random errno, e.g.:
omap_iommu_attach_dev()
omap_iommu_attach()
iommu_enable()
pm_runtime_get_sync()
__pm_runtime_resume()
rpm_resume()
if (dev->power.runtime_error) {
retval = -EINVAL;
viommu_attach_dev()
viommu_domain_finalise()
ida_alloc_range()
if ((int)min < 0)
return -ENOSPC;
>
> I think achieving consistency of error codes is a generally desirable
> goal, it makes the error code actually useful.
>
> Joerg this is a good bit of work, will you be OK with it?
>
> > Thus as long as we can maintain that basic guarantee that attaching
> > a group to a newly allocated domain can only ever fail for resource
> > allocation reasons and not some spurious "incompatibility", then we
> > don't need any obscure trickery, and a single, clear, error code is
> > in fact enough to say all that needs to be said.
>
> As above, this is not the case, drivers do seem to have error paths
> that are unconditional on the domain. Perhaps they are just protective
> assertions and never happen.
>
> Regardless, it doesn't matter. If they return ENODEV or EINVAL the
> VFIO side algorithm will continue to work fine, it just does alot more
> work if EINVAL is permanently returned.
>
I don't see an elegant option here.
If we care about accuracy of reporting incompatibility -EMEDIUMTYPE
is obviously a better option.
If we think attach_dev is a slow path and having unnecessary retries
doesn't hurt then -EINVAL sounds a simpler option. We probably can
just go using -EINVAL as retry indicator in vfio even w/o changing
iommu drivers at this point. Then improve them to use consistent
errno gradually and in a separate effort.
Thanks
Kevin
Powered by blists - more mailing lists