[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b5f86fde-eaec-47fc-8b4f-36adb0e9e1a1@intel.com>
Date: Fri, 1 Dec 2023 11:51:26 +0800
From: Yi Liu <yi.l.liu@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>,
Nicolin Chen <nicolinc@...dia.com>
CC: "Tian, Kevin" <kevin.tian@...el.com>,
"joro@...tes.org" <joro@...tes.org>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"robin.murphy@....com" <robin.murphy@....com>,
"baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
"cohuck@...hat.com" <cohuck@...hat.com>,
"eric.auger@...hat.com" <eric.auger@...hat.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"mjrosato@...ux.ibm.com" <mjrosato@...ux.ibm.com>,
"chao.p.peng@...ux.intel.com" <chao.p.peng@...ux.intel.com>,
"yi.y.sun@...ux.intel.com" <yi.y.sun@...ux.intel.com>,
"peterx@...hat.com" <peterx@...hat.com>,
"jasowang@...hat.com" <jasowang@...hat.com>,
"shameerali.kolothum.thodi@...wei.com"
<shameerali.kolothum.thodi@...wei.com>,
"lulu@...hat.com" <lulu@...hat.com>,
"suravee.suthikulpanit@....com" <suravee.suthikulpanit@....com>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
"Duan, Zhenzhong" <zhenzhong.duan@...el.com>,
"joao.m.martins@...cle.com" <joao.m.martins@...cle.com>,
"Zeng, Xin" <xin.zeng@...el.com>,
"Zhao, Yan Y" <yan.y.zhao@...el.com>
Subject: Re: [PATCH v6 2/6] iommufd: Add IOMMU_HWPT_INVALIDATE
On 2023/11/29 08:57, Jason Gunthorpe wrote:
> On Tue, Nov 28, 2023 at 04:51:21PM -0800, Nicolin Chen wrote:
>>>> I also thought about making this out_driver_error_code per HW.
>>>> Yet, an error can be either per array or per entry/quest. The
>>>> array-related error should be reported in the array structure
>>>> that is a core uAPI, v.s. the per-HW entry structure. Though
>>>> we could still report an array error in the entry structure
>>>> at the first entry (or indexed by "array->entry_num")?
>>>>
>>>
>>> why would there be an array error? array is just a software
>>> entity containing actual HW invalidation cmds. If there is
>>> any error with the array itself it should be reported via
>>> ioctl errno.
>>
>> User array reading is a software operation, but kernel array
>> reading is a hardware operation that can raise an error when
>> the memory location to the array is incorrect or so.
>
> Well, we shouldn't get into a situation like that.. By the time the HW
> got the address it should be valid.
>
>> With that being said, I think errno (-EIO) could do the job,
>> as you suggested too.
>
> Do we have any idea what HW failures can be generated by the commands
> this will execture? IIRC I don't remember seeing any smmu specific
> codes related to invalid invalidation? Everything is a valid input?
>
> Can vt-d fail single commands? What about AMD?
Intel VT-d side, after each invalidation request, there is a wait
descriptor which either provide an interrupt or an address for the
hw to notify software the request before the wait descriptor has been
completed. While, if there is error happened on the invalidation request,
a flag (IQE, ICE, ITE) would be set in the Fault Status Register, and some
detailed information would be recorded in the Invalidation Queue Error
Record Register. So an invalidation request may be failed with some error
reported. If no error, will return completion via the wait descriptor. Is
this what you mean by "fail a single command"?
>>> Jason, how about your opinion? I didn't spot big issues
>>> except this one. Hope it can make into 6.8.
>
> Yes, lets try
>
> Jason
--
Regards,
Yi Liu
Powered by blists - more mailing lists