Message-ID: <20251216141511.GD6079@nvidia.com>
Date: Tue, 16 Dec 2025 10:15:11 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Nicolin Chen <nicolinc@...dia.com>, will@...nel.org,
jean-philippe@...aro.org, robin.murphy@....com, joro@...tes.org,
balbirs@...dia.com, miko.lenczewski@....com, kevin.tian@...el.com,
praan@...gle.com, linux-arm-kernel@...ts.infradead.org,
iommu@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 6/7] iommu/arm-smmu-v3: Add arm_smmu_invs based
arm_smmu_domain_inv_range()
On Tue, Dec 16, 2025 at 03:04:10PM +0100, Peter Zijlstra wrote:
> On Tue, Dec 16, 2025 at 09:56:13AM -0400, Jason Gunthorpe wrote:
> > On Tue, Dec 16, 2025 at 10:09:26AM +0100, Peter Zijlstra wrote:
> > > Anyway, if I understand the above correctly, the smp_mb() is for:
> > >
> > >   arm_smmu_domain_inv_range()    arm_smmu_install_new_domain_invs()
> > >
> > >   [W] IOPTE                      [Wrel] smmu_domain->invs
> > >   smp_mb()                       smp_mb()
> > >   [Lacq] smmu_domain->invs       [L] IOPTE
> > >
> > > Right? But I'm not sure about your 'HW sees the new IOPTEs' claim;
> >
> > Yes, the '[L] IOPTE' would be a DMA from HW.
> >
> > > that very much depends on what coherency domain the relevant hardware
> > > plays in. For smp_mb() to work, the hardware must be in the ISH
> > > domain, while typically devices are (if I remember my arrrrgh64
> > > correctly) in the OSH.
> >
> > The '[W] IOPTE' sequence already includes a cache flush if the
> > inner/outer shareable are not coherent. If a cache flush was required
> > then the smp_mb() must also order it, otherwise it just has to order
> > the store.
> >
> > The page table code has always relied on this kind of ordering
> > with respect to DMA working, it would be completely broken if the DMA
> > does not order with the barriers.
> >
> > For example:
> >
> >     CPU0                     CPU1
> >     store PMD
> >                              read PMD
> >     store PTE 1              store PTE 2
> >                              dma memory barrier
> >                              device reads 2
> >     dma memory barrier
> >     device reads 1
>
> But here you have dma_mb(), which is dmb(osh).
This also has a pre-existing dma_wmb(); a fuller chart would look like this:
  arm_smmu_domain_inv_range()       arm_smmu_install_new_domain_invs()

  [W] IOPTE                         [Wrel] smmu_domain->invs
  smp_mb()                          smp_mb()  <--- arm_smmu_install_new_domain_invs()
                                    [build the STE]
                                    [post the STE]
                                    dma_wmb() <--- arm_smmu_cmdq_issue_cmdlist()
                                    Doorbell Write to Device
  [Lacq] smmu_domain->invs          [L] IOPTE via DMA
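
As a rough userspace illustration of the CPU-side half of this chart, the
pairing of the two smp_mb() calls is the classic store-buffering pattern:
each side stores, executes a full barrier, then loads what the other side
stored. The sketch below models it with C11 atomics and pthreads. Everything
in it is an assumption made for illustration: `iopte` and `invs` are plain
integers standing in for the IOPTE and smmu_domain->invs, the seq_cst fence
stands in for smp_mb(), and `run_rounds()` is a hypothetical helper. It does
not model the STE build, the dma_wmb(), the doorbell write, or the fact that
the real [L] IOPTE is a DMA from the SMMU rather than a CPU load.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the IOPTE and smmu_domain->invs. */
static atomic_int iopte, invs;
/* Plain ints are fine here: they are only read after pthread_join(). */
static int saw_invs, saw_iopte;

/* Models the arm_smmu_domain_inv_range() column of the chart. */
static void *inv_range_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&iopte, 1, memory_order_relaxed);        /* [W] IOPTE */
	atomic_thread_fence(memory_order_seq_cst);                     /* smp_mb()  */
	saw_invs = atomic_load_explicit(&invs, memory_order_acquire);  /* [Lacq]    */
	return NULL;
}

/* Models the CPU-visible part of the arm_smmu_install_new_domain_invs() column. */
static void *install_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&invs, 1, memory_order_release);         /* [Wrel]    */
	atomic_thread_fence(memory_order_seq_cst);                     /* smp_mb()  */
	saw_iopte = atomic_load_explicit(&iopte, memory_order_relaxed); /* [L] IOPTE */
	return NULL;
}

/*
 * Runs the two sides concurrently n times and returns how many rounds ended
 * with BOTH sides missing the other's store. With a full fence on each side
 * that outcome is forbidden, so the expected return value is 0.
 */
static int run_rounds(int n)
{
	int both_missed = 0;

	for (int i = 0; i < n; i++) {
		pthread_t a, b;
		int rc;

		atomic_store(&iopte, 0);
		atomic_store(&invs, 0);
		saw_invs = saw_iopte = 0;

		rc = pthread_create(&a, NULL, inv_range_side, NULL);
		assert(rc == 0);
		rc = pthread_create(&b, NULL, install_side, NULL);
		assert(rc == 0);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		if (!saw_invs && !saw_iopte)
			both_missed++;
	}
	return both_missed;
}
```

Calling run_rounds() from a small main() and checking that it returns 0 shows
the guarantee being relied on: either the invalidation side sees the new invs,
or the install side sees the new IOPTE. Replacing either fence with a weaker
one makes the both-missed outcome legal again, which is the case the pair of
smp_mb() calls is there to rule out.
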
Nicolin, please elaborate on these details in the comment.
Jason