Message-ID: <20250924214215.GR2617119@nvidia.com>
Date: Wed, 24 Sep 2025 18:42:15 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Nicolin Chen <nicolinc@...dia.com>
Cc: will@...nel.org, robin.murphy@....com, joro@...tes.org,
jean-philippe@...aro.org, miko.lenczewski@....com,
balbirs@...dia.com, peterz@...radead.org, smostafa@...gle.com,
kevin.tian@...el.com, praan@...gle.com,
linux-arm-kernel@...ts.infradead.org, iommu@...ts.linux.dev,
linux-kernel@...r.kernel.org, patches@...ts.linux.dev
Subject: Re: [PATCH rfcv2 6/8] iommu/arm-smmu-v3: Populate smmu_domain->invs
when attaching masters
On Mon, Sep 08, 2025 at 04:27:00PM -0700, Nicolin Chen wrote:
> Update the invs array with the invalidations required by each domain type
> during attachment operations.
>
> Only an SVA domain or a paging domain will have an invs array:
> a. SVA domain will add an INV_TYPE_S1_ASID per SMMU and an INV_TYPE_ATS
> per SID
>
> b. Non-nesting-parent paging domain with no ATS-enabled master will add
> a single INV_TYPE_S1_ASID or INV_TYPE_S2_VMID per SMMU
>
> c. Non-nesting-parent paging domain with ATS-enabled master(s) will do
> (b) and add an INV_TYPE_ATS per SID
>
> d. Nesting-parent paging domain will add an INV_TYPE_S2_VMID followed by
> an INV_TYPE_S2_VMID_S1_CLEAR per vSMMU. For an ATS-enabled master, it
> will add an INV_TYPE_ATS_FULL per SID
Just some minor forward-looking clarification - this behavior should
be triggered when a nesting parent is attached through the viommu
using a nesting domain with a vSTE.
A nesting parent that is attached normally should act like a plain S2,
since it does not and cannot have a two-stage S1 on top of it.
We can't quite get there until the next series, which changes how the
VMID allocation works.
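To make that concrete, the attach-time dispatch could look roughly like
this (sketch only, untested; add_inv()/add_inv_per_sid() and the
attached_via_vsmmu flag are invented stand-ins for whatever ends up
building the array entries):

	if (smmu_domain->nest_parent && attached_via_vsmmu) {
		/* d: pair the VMID flush with the S1 clear, per vSMMU */
		add_inv(INV_TYPE_S2_VMID);
		add_inv(INV_TYPE_S2_VMID_S1_CLEAR);
		if (master->ats_enabled)
			add_inv_per_sid(master, INV_TYPE_ATS_FULL);
	} else {
		/* a/b: one ASID or VMID entry for this SMMU */
		add_inv(smmu_domain->stage == ARM_SMMU_DOMAIN_S1 ?
				INV_TYPE_S1_ASID : INV_TYPE_S2_VMID);
		/* c: plus an ATS entry per SID when ATS is enabled */
		if (master->ats_enabled)
			add_inv_per_sid(master, INV_TYPE_ATS);
	}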
> The per-domain invalidation is not needed, until the domain is attached to
> a master, i.e. a possible translation request. Giving this clears a way to
> allowing the domain to be attached to many SMMUs, and avoids any pointless
> invalidation overheads during a teardown if there are no STE/CDs referring
> to the domain. This also means, when the last device is detached, the old
> domain must flush its ASID or VMID because any iommu_unmap() call after it
> wouldn't initiate any invalidation given an empty domain invs array.
There are some grammar/phrasing issues in this paragraph, please give
it another pass.
> Introduce some arm_smmu_invs helper functions for building scratch arrays,
> preparing and installing old/new domain's invalidation arrays.
>
> Co-developed-by: Jason Gunthorpe <jgg@...dia.com>
> Signed-off-by: Jason Gunthorpe <jgg@...dia.com>
> Signed-off-by: Nicolin Chen <nicolinc@...dia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 22 ++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 312 +++++++++++++++++++-
> 2 files changed, 332 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 246c6d84de3ab..e4e0e066108cc 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -678,6 +678,8 @@ struct arm_smmu_inv {
> /**
> * struct arm_smmu_invs - Per-domain invalidation array
> * @num_invs: number of invalidations in the flexible array
> + * @old: flag to synchronize with reader
> + * @rwlock: optional rwlock to fence ATS operations
> * @rcu: rcu head for kfree_rcu()
> * @inv: flexible invalidation array
> *
> @@ -703,6 +705,8 @@ struct arm_smmu_inv {
> */
> struct arm_smmu_invs {
> size_t num_invs;
> + rwlock_t rwlock;
> + u8 old;
> struct rcu_head rcu;
> struct arm_smmu_inv inv[];
> };
> @@ -714,6 +718,7 @@ static inline struct arm_smmu_invs *arm_smmu_invs_alloc(size_t num_invs)
> new_invs = kzalloc(struct_size(new_invs, inv, num_invs), GFP_KERNEL);
> if (!new_invs)
> return ERR_PTR(-ENOMEM);
> + rwlock_init(&new_invs->rwlock);
> new_invs->num_invs = num_invs;
> return new_invs;
> }
Put these and related hunks in the patch adding arm_smmu_invs
> @@ -1183,8 +1183,11 @@ size_t arm_smmu_invs_unref(struct arm_smmu_invs *invs,
> i++;
> } else if (cmp == 0) {
> /* same item */
> - if (refcount_dec_and_test(&invs->inv[i].users))
> + if (refcount_dec_and_test(&invs->inv[i].users)) {
> + /* Notify the caller about this deletion */
> + refcount_set(&to_unref->inv[j].users, 1);
> num_dels++;
This is a bit convoluted. Instead of marking the entry and then
iterating the list again, just directly call a function to do the
invalidation.
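i.e. something like this, with a per-entry flush helper (name invented
here, sketched further down):

		} else if (cmp == 0) {
			/* same item */
			if (refcount_dec_and_test(&invs->inv[i].users)) {
				/* flush right away instead of marking */
				arm_smmu_inv_flush_iotlb_tag(&invs->inv[i]);
				num_dels++;
			}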
> +static inline void arm_smmu_invs_dbg(struct arm_smmu_master *master,
> + struct arm_smmu_domain *smmu_domain,
> + struct arm_smmu_invs *invs, char *name)
> +{
> + size_t i;
> +
> + dev_dbg(master->dev, "domain (type: %x), invs: %s, num_invs: %ld\n",
> + smmu_domain->domain.type, name, invs->num_invs);
> + for (i = 0; i < invs->num_invs; i++) {
> + struct arm_smmu_inv *cur = &invs->inv[i];
> +
> + dev_dbg(master->dev,
> + " entry: inv[%ld], type: %u, id: %u, users: %u\n", i,
> + cur->type, cur->id, refcount_read(&cur->users));
> + }
> +}
Move all the debug code to its own commit and don't send it
> +static void
> +arm_smmu_install_new_domain_invs(struct arm_smmu_attach_state *state)
> +{
> + struct arm_smmu_inv_state *invst = &state->new_domain_invst;
> +
> + if (!invst->invs_ptr)
> + return;
> +
> + rcu_assign_pointer(*invst->invs_ptr, invst->new_invs);
> + /*
> + * Committed to updating the STE, using the new invalidation array, and
> + * acquiring any racing IOPTE updates.
> + */
> + smp_mb();
We are committed to updating the STE. Make the invalidation list
visible to parallel map/unmap threads and acquire any racing IOPTE
updates.
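i.e.:

	rcu_assign_pointer(*invst->invs_ptr, invst->new_invs);
	/*
	 * We are committed to updating the STE. Make the invalidation
	 * list visible to parallel map/unmap threads and acquire any
	 * racing IOPTE updates.
	 */
	smp_mb();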
> + kfree_rcu(invst->old_invs, rcu);
> +}
> +
> +/*
> + * When an array entry's users count reaches zero, it means the ASID/VMID is no
> + * longer being invalidated by map/unmap and must be cleaned. The rule is that
> + * all ASIDs/VMIDs not in an invalidation array are left cleared in the IOTLB.
> + */
> +static void arm_smmu_invs_flush_iotlb_tags(struct arm_smmu_invs *invs)
> +{
> + size_t i;
> +
> + for (i = 0; i != invs->num_invs; i++) {
Just remove the loop, accept a struct arm_smmu_inv *, and call it
directly.
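Sketch only - the body is just one iteration of the current loop:

	static void arm_smmu_inv_flush_iotlb_tag(struct arm_smmu_inv *inv)
	{
		switch (inv->type) {
		case INV_TYPE_S1_ASID:
			/* clean this ASID out of the IOTLB */
			break;
		case INV_TYPE_S2_VMID:
			/* clean this VMID out of the IOTLB */
			break;
		default:
			break;
		}
	}

Then both the zero-users path in arm_smmu_invs_unref() and this cleanup
path can share it.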
> + if (!new_invs) {
> + size_t new_num = old_invs->num_invs;
> +
> + /*
> + * OOM. Couldn't make a copy. Leave the array unoptimized. But
> + * trim its size if some trailing entries are marked as trash.
> + */
> + while (new_num != 0) {
> + if (refcount_read(&old_invs->inv[new_num - 1].users))
> + break;
> + new_num--;
> + }
It would be nicer to have arm_smmu_invs_unref() return the new size so
we don't need this loop.
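e.g. have it return the trimmed length (one past the last entry that
still has users), tracked as the walk goes, so the OOM fallback becomes
just:

		new_num = arm_smmu_invs_unref(old_invs, to_unref);

with no rescan of the tail.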
Looks OK to me otherwise
Jason