[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZBiF1hZVrp/aJRM6@Asurada-Nvidia>
Date: Mon, 20 Mar 2023 09:12:06 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: "Tian, Kevin" <kevin.tian@...el.com>,
Jason Gunthorpe <jgg@...dia.com>
CC: Robin Murphy <robin.murphy@....com>,
"will@...nel.org" <will@...nel.org>,
"eric.auger@...hat.com" <eric.auger@...hat.com>,
"baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
"joro@...tes.org" <joro@...tes.org>,
"shameerali.kolothum.thodi@...wei.com"
<shameerali.kolothum.thodi@...wei.com>,
"jean-philippe@...aro.org" <jean-philippe@...aro.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1 14/14] iommu/arm-smmu-v3: Add
arm_smmu_cache_invalidate_user
On Mon, Mar 20, 2023 at 09:59:23AM -0300, Jason Gunthorpe wrote:
> On Fri, Mar 17, 2023 at 09:41:34AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@...dia.com>
> > > Sent: Saturday, March 11, 2023 12:20 AM
> > >
> > > What I'm broadly thinking is if we have to make the infrastructure for
> > > VCMDQ HW accelerated invalidation then it is not a big step to also
> > > have the kernel SW path use the same infrastructure just with a CPU
> > > wake up instead of a MMIO poke.
> > >
> > > Ie we have a SW version of VCMDQ to speed up SMMUv3 cases without HW
> > > support.
> > >
> >
> > I thought about this in VT-d context. Looks there are some difficulties.
> >
> > The most prominent one is that head/tail of the VT-d invalidation queue
> > are in MMIO registers. Handling it in kernel iommu driver suggests
> > reading virtual tail register and updating virtual head register. Kind of
> > moving some vIOMMU awareness into the kernel which, iirc, is not
> > a welcomed model.
>
> qemu would trap the MMIO and generate an IOCTL with the written head
> pointer. It isn't as efficient as having the kernel do the trap, but
> does give batching.
Rephrasing that to put into a design: the IOCTL would pass a
user pointer to the queue, the size of the queue, then a head
pointer and a tail pointer? Then the kernel reads out all the
commands between the head and the tail and handles all those
invalidation commands only?
Thanks
Nic
Powered by blists - more mailing lists