linux-kernel - Re: [PATCH v4 09/12] iommu/vt-d: Add iotlb flush for nested domain

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZNVQcmYp27ap7h30@Asurada-Nvidia>
Date:   Thu, 10 Aug 2023 14:02:42 -0700
From:   Nicolin Chen <nicolinc@...dia.com>
To:     Jason Gunthorpe <jgg@...dia.com>
CC:     "Tian, Kevin" <kevin.tian@...el.com>,
        "Liu, Yi L" <yi.l.liu@...el.com>,
        "joro@...tes.org" <joro@...tes.org>,
        "alex.williamson@...hat.com" <alex.williamson@...hat.com>,
        "robin.murphy@....com" <robin.murphy@....com>,
        "baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
        "cohuck@...hat.com" <cohuck@...hat.com>,
        "eric.auger@...hat.com" <eric.auger@...hat.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "mjrosato@...ux.ibm.com" <mjrosato@...ux.ibm.com>,
        "chao.p.peng@...ux.intel.com" <chao.p.peng@...ux.intel.com>,
        "yi.y.sun@...ux.intel.com" <yi.y.sun@...ux.intel.com>,
        "peterx@...hat.com" <peterx@...hat.com>,
        "jasowang@...hat.com" <jasowang@...hat.com>,
        "shameerali.kolothum.thodi@...wei.com" 
        <shameerali.kolothum.thodi@...wei.com>,
        "lulu@...hat.com" <lulu@...hat.com>,
        "suravee.suthikulpanit@....com" <suravee.suthikulpanit@....com>,
        "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
        "Duan, Zhenzhong" <zhenzhong.duan@...el.com>
Subject: Re: [PATCH v4 09/12] iommu/vt-d: Add iotlb flush for nested domain

On Thu, Aug 10, 2023 at 04:27:02PM -0300, Jason Gunthorpe wrote:
 
> > > Do we need to worry about the ring wrap around? It is already the case
> > > that the VMM has to scan the ring and extract the invalidation
> > > commands, wouldn't it already just linearize them?
> > 
> > I haven't got the chance to send the latest vSMMU series but I
> > pass down the raw user CMDQ to the host to go through, as it'd
> > be easier to stall the consumer index movement when a command
> > in the middle fails.
> 
> Don't some commands have to be executed by the VMM?

Well, they do. VMM would go through the queue and "execute" non-
invalidation commands, then defer the queue to the kernel to go
through the queue once more. So, the flaw could be that some of
the commands behind the failing TLB flush command got "executed",
though in a real case most of other commands would be "executed"
standalone with a CMD_SYNC, i.e. not mixing with any invalidation
command.

> Even so, it seems straightforward enough for the kernel to report the
> number of commands it executed and the VMM can adjust the virtual
> consumer index.

It is not that straightforward to revert an array index back to
a consumer index because they might not be 1:1 mapped, since in
theory there could be other commands mixing in-between, although
it unlikely happens.

So, another index-mapping array would be needed for this matter.
And this doesn't address the flaw that I mentioned above either.
So, I took the former solution to reduce the complication.

> > > Is there a use case for invaliation only SW emulated rings, and do we
> > > care about optimizing for the wrap around case?
> > 
> > Hmm, why a SW emulated ring?
> 
> That is what you are building. The VMM catches the write of the
> producer pointer and the VMM SW bundles it up to call into the kernel.

Still not fully getting it. Do you mean a ring that is prepared
by the VMM? I think the only case that we need to handle a ring
is what I did by forwarding the guest CMDQ (a ring) to the host
directly. Not sure why VMM would need another ring for those
linearized invalidation commands. Or maybe I misunderstood..

> > Yes for the latter question. SMMU kernel driver has something
> > like Q_WRP and other helpers, so it wasn't difficult to process
> > the user CMDQ in the same raw form. But it does complicates the
> > common code if we want to do it there.
> 
> Optimizing wrap around means when the producer/consumer pointers pass
> the end of the queue memory we execute one, not two ioctls toward the
> kernel. That is possible a very minor optimization, it depends how big
> the queues are and how frequent multi-entry items will be present.

There could be other commands being issued by other VMs or even
the host between the two ioctls. So probably we'd need to handle
the wrapping case when doing a ring solution?

Thanks
Nicolin