Message-ID: <20260126160940.GM1134360@nvidia.com>
Date: Mon, 26 Jan 2026 12:09:40 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Will Deacon <will@...nel.org>
Cc: Nicolin Chen <nicolinc@...dia.com>, jean-philippe@...aro.org,
robin.murphy@....com, joro@...tes.org, balbirs@...dia.com,
miko.lenczewski@....com, peterz@...radead.org, kevin.tian@...el.com,
praan@...gle.com, linux-arm-kernel@...ts.infradead.org,
iommu@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v9 6/7] iommu/arm-smmu-v3: Add arm_smmu_invs based
arm_smmu_domain_inv_range()

On Mon, Jan 26, 2026 at 04:02:19PM +0000, Will Deacon wrote:
> On Mon, Jan 26, 2026 at 11:20:19AM -0400, Jason Gunthorpe wrote:
> > On Mon, Jan 26, 2026 at 01:01:16PM +0000, Will Deacon wrote:
> > > On Fri, Jan 23, 2026 at 04:03:27PM -0400, Jason Gunthorpe wrote:
> > > > On Fri, Jan 23, 2026 at 05:10:52PM +0000, Will Deacon wrote:
> > > > > On Fri, Jan 23, 2026 at 05:05:31PM +0000, Will Deacon wrote:
> > > > > > On Fri, Dec 19, 2025 at 12:11:28PM -0800, Nicolin Chen wrote:
> > > > > > > + /*
> > > > > > > + * We are committed to updating the STE. Ensure the invalidation array
> > > > > > > + * is visible to concurrent map/unmap threads, and acquire any racing
> > > > > > > + * IOPTE updates.
> > > > > > > + *
> > > > > > > + * [CPU0] | [CPU1]
> > > > > > > + * |
> > > > > > > + * change IOPTEs and TLB flush: |
> > > > > > > + * arm_smmu_domain_inv_range() { | arm_smmu_install_old_domain_invs {
> > > > > > > + * ... | rcu_assign_pointer(new_invs);
> > > > > > > + * smp_mb(); // ensure IOPTEs | smp_mb(); // ensure new_invs
> > > > > > > + * ... | kfree_rcu(old_invs, rcu);
> > > > > > > + * // load invalidation array | }
> > > > > > > + * invs = rcu_dereference(); | arm_smmu_install_ste_for_dev {
> > > > > > > + * | STE = TTB0 // read new IOPTEs
> > > > > > > + */
> > > > > > > + smp_mb();

> If we do that, can we drop the smp_mb()s from
> arm_smmu_install_{old,new}_domain_invs()?

I suppose so, but domain attach isn't a performance path, so it depends
on your preference for strict pairing of barriers. Currently the two
smp_mb()s are paired with each other. Can we reliably pair an smp_mb()
with a dma_wmb()? Would you be happy with that level of clarity?

My view is that, since attach isn't a performance path, having the
extra barriers is fine if it helps understandability.
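
To make that concrete, here is a minimal kernel-style sketch of the
store-buffering shape the quoted comment describes. Every name in it
(struct invs, domain_invs, write_ioptes(), flush_with(),
write_ste_ttb0()) is a placeholder for illustration, not the real
driver code:

#include <linux/rcupdate.h>	/* rcu_assign_pointer(), rcu_dereference() */
#include <asm/barrier.h>	/* smp_mb() */

/* Placeholder state and helpers, standing in for the driver's. */
struct invs;
static struct invs __rcu *domain_invs;

static void write_ioptes(void) { }		/* IOPTE update */
static void flush_with(struct invs *invs) { }	/* TLB flush    */
static void write_ste_ttb0(void) { }		/* STE install  */

/* CPU0: map/unmap path, as in arm_smmu_domain_inv_range() */
static void cpu0_invalidate(void)
{
	write_ioptes();			/* store A: the new IOPTEs        */
	smp_mb();			/* full barrier: store A, load B  */
	rcu_read_lock();
	flush_with(rcu_dereference(domain_invs));	/* load B: invs   */
	rcu_read_unlock();
}

/* CPU1: attach path, as in arm_smmu_install_old_domain_invs() */
static void cpu1_attach(struct invs *new_invs)
{
	rcu_assign_pointer(domain_invs, new_invs);	/* store B: array */
	smp_mb();			/* full barrier: store B, load A  */
	write_ste_ttb0();		/* load A: walker sees new IOPTEs */
}

In a store-buffering pattern like this, each CPU's later load must not
be reordered before its earlier store, which needs a full barrier on
both sides. dma_wmb() only orders stores against stores, so it wouldn't
order CPU1's store before its subsequent load, hence the doubt about
pairing it with smp_mb().
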
Jason