[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZBJcS07G3mt7gjkA@Asurada-Nvidia>
Date: Wed, 15 Mar 2023 17:01:15 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Robin Murphy <robin.murphy@....com>
CC: <jgg@...dia.com>, <will@...nel.org>, <eric.auger@...hat.com>,
<kevin.tian@...el.com>, <baolu.lu@...ux.intel.com>,
<joro@...tes.org>, <shameerali.kolothum.thodi@...wei.com>,
<jean-philippe@...aro.org>, <linux-arm-kernel@...ts.infradead.org>,
<iommu@...ts.linux.dev>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1 14/14] iommu/arm-smmu-v3: Add
arm_smmu_cache_invalidate_user
On Mon, Mar 13, 2023 at 01:07:42PM +0000, Robin Murphy wrote:
> External email: Use caution opening links or attachments
>
>
> On 2023-03-11 12:38, Nicolin Chen wrote:
> > On Fri, Mar 10, 2023 at 05:53:46PM +0000, Robin Murphy wrote:
> >
> > > > > > + case CMDQ_OP_TLBI_NH_VA:
> > > > > > + cmd.tlbi.asid = inv_info->asid;
> > > > > > + fallthrough;
> > > > > > + case CMDQ_OP_TLBI_NH_VAA:
> > > > > > + if (!granule_size || !(granule_size & smmu->pgsize_bitmap) ||
> > > > >
> > > > > Non-range invalidations with TG=0 are perfectly legal, and should not be
> > > > > ignored.
> > > >
> > > > I assume that you are talking about the pgsize_bitmap check.
> > > >
> > > > QEMU embeds a !tg case into the granule_size [1]. So it might
> > > > not be straightforward to cover that case. Let me see how to
> > > > untangle different cases and handle them accordingly.
> > >
> > > Oh, double-checking patch #2, that might be me misunderstanding the
> > > interface. I hadn't realised that the UAPI was apparently modelled on
> > > arm_smmu_tlb_inv_range_asid() rather than actual SMMU commands :)
> >
> > Yea. In fact, most of the invalidation info in QEMU was packed
> > for the previously defined general cache invalidation structure,
> > and the range invalidation part is still not quite independent.
> >
> > > I really think UAPI should reflect the hardware and encode TG and TTL
> > > directly. Especially since there's technically a flaw in the current
> > > driver where we assume TTL in cases where it isn't actually known, thus
> > > may potentially fail to invalidate level 2 block entries when removing a
> > > level 1 table, since io-pgtable passes the level 3 granule in that case.
> >
> > Do you mean something like hw_info forwarding pgsize_bitmap/tg
> > to the guest? Or the other direction?
>
> I mean if the interface wants to support range invalidations in a way
> which works correctly, then it should ideally carry both the TG and TTL
> fields from the guest command straight through to the host. If not, then
> at the very least the host must always assume TTL=0, because it cannot
> correctly infer otherwise once the guest command's original intent has
> been lost.
Oh, it's about hypervisor simply forwarding the entire CMD to
the host side. Jason is suggesting a fast approach by letting
host kernel read the CMDQ directly to get the raw CMD. Perhaps
that would address this comments about TG/TTL too.
I wonder if there could be other case than a WAR, where TG/TTL
fields from the guest's aren't supported by the host. And then
should the host handle it with a different CMD?
> > > When range invalidation came along, the distinction between "all leaves
> > > are definitely at the last level" and "use last-level granularity to
> > > make sure everything at at any level is hit" started to matter, but the
> > > interface never caught up. It hasn't seemed desperately urgent to fix
> > > (who does 1GB+ unmaps outside of VFIO teardown anyway?), but we must
> > > definitely not bake the same mistake into user ABI.
> > >
> > > Of course, there might then be cases where we need to transform
> > > non-range commands into range commands for the sake of workarounds, but
> > > that's our own problem to deal with.
> >
> > Noted it down.
> >
> > > > > What about NSNH_ALL? That still needs to invalidate all the S1 context
> > > > > that the guest *thinks* it's invalidating.
> > > >
> > > > NSNH_ALL is translated to NH_ALL at the guest level. But maybe
> > > > it should have been done here instead.
> > >
> > > Yes. It seems the worst of both worlds to have an interface which takes
> > > raw opcodes rather than an enum of supported commands, but still
> > > requires userspace to know which opcodes are supported and which ones
> > > don't work as expected even though they are entirely reasonable to use
> > > in the context of the stage-1-only SMMU being emulated.
> >
> > Maybe a list of supported TLBI commands via the hw_info uAPI?
>
> I don't think it's all that difficult to implicitly support all commands
> that are valid for a stage-1-only SMMU, it just needs the right
> interface design to be capable of encoding them all completely and
> unambiguously. Coming back to the previous point about the address
> encoding, I think that means basing it more directly on the actual
> SMMUv3 commands, rather than on io-pgtable's abstraction of invalidation
> with SMMUv3 opcodes bolted on.
Yea, with the actual commands from the guest, the host can do
something with its supported commands, I think.
Thanks
Nicolin
Powered by blists - more mailing lists