Message-ID: <20251216140410.GV3707837@noisy.programming.kicks-ass.net>
Date: Tue, 16 Dec 2025 15:04:10 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Nicolin Chen <nicolinc@...dia.com>, will@...nel.org,
jean-philippe@...aro.org, robin.murphy@....com, joro@...tes.org,
balbirs@...dia.com, miko.lenczewski@....com, kevin.tian@...el.com,
praan@...gle.com, linux-arm-kernel@...ts.infradead.org,
iommu@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 6/7] iommu/arm-smmu-v3: Add arm_smmu_invs based
arm_smmu_domain_inv_range()

On Tue, Dec 16, 2025 at 09:56:13AM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 16, 2025 at 10:09:26AM +0100, Peter Zijlstra wrote:
> > Anyway, if I understand the above correctly, the smp_mb() is for:
> >
> >   arm_smmu_domain_inv_range()    arm_smmu_install_new_domain_invs()
> >
> >   [W] IOPTE                      [Wrel] smmu_domain->invs
> >   smp_mb()                       smp_mb()
> >   [Lacq] smmu_domain->invs       [L] IOPTE
> >
> > Right? But I'm not sure about your 'HW sees the new IOPTEs' claim;
>
> Yes, the '[L] IOPTE' would be a DMA from HW.
>
> > that very much depends on what coherency domain the relevant hardware
> > plays in. For smp_mb() to work, the hardware must be in the ISH
> > domain, while typically devices are (if I remember my arrrrgh64
> > correctly) in the OSH.
>
> The '[W] IOPTE' sequence already includes a cache flush if the
> inner/outer shareable domains are not coherent. If a cache flush was required
> then the smp_mb() must also order it, otherwise it just has to order
> the store.
>
> The page table code has always relied on this kind of ordering
> with respect to DMA working, it would be completely broken if the DMA
> does not order with the barriers.
>
> For example:
>
> CPU0                    CPU1
> store PMD
>                         read PMD
> store PTE 1             store PTE 2
>                         dma memory barrier
>                         device reads 2
> dma memory barrier
> device reads 1

But here you have dma_mb(), which is dmb(osh).

> The 'device reads 2' thread must be guaranteed that the HW DMA
> observes the PMD stored by CPU0. It relies on the same kind of
> explicit cache flushing and barriers as this patch does.

OK, but then please include that in the comment, because using smp_*()
barriers and talking about devices sets off alarm bells.
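
For reference, the two-column ordering quoted above is the classic
store-buffering shape. Below is a minimal user-space sketch of it,
assuming C11 atomics and pthreads as stand-ins: a seq_cst fence plays
the role of smp_mb(), and the variables iopte and invs and all function
names are hypothetical, not taken from the patch. It models two CPUs
only. It says nothing about a device doing DMA or about the ISH/OSH
question, which is exactly the part the comment needs to cover.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins: iopte models the IOPTE, invs models smmu_domain->invs. */
static atomic_int iopte;
static atomic_int invs;
static int saw_invs;   /* value loaded by the invalidation side */
static int saw_iopte;  /* value loaded by the install side */

/* Left column: [W] IOPTE; smp_mb(); [Lacq] smmu_domain->invs */
static void *inv_range_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&iopte, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* stands in for smp_mb() */
	saw_invs = atomic_load_explicit(&invs, memory_order_acquire);
	return NULL;
}

/* Right column: [Wrel] smmu_domain->invs; smp_mb(); [L] IOPTE */
static void *install_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&invs, 1, memory_order_release);
	atomic_thread_fence(memory_order_seq_cst);	/* stands in for smp_mb() */
	saw_iopte = atomic_load_explicit(&iopte, memory_order_relaxed);
	return NULL;
}

/*
 * Run both sides once. Returns 1 if at least one side observed the
 * other's store, 0 if both missed, -1 if a thread could not be created.
 */
static int sb_run_once(void)
{
	pthread_t a, b;

	atomic_store(&iopte, 0);
	atomic_store(&invs, 0);
	saw_invs = 0;
	saw_iopte = 0;

	if (pthread_create(&a, NULL, inv_range_side, NULL))
		return -1;
	if (pthread_create(&b, NULL, install_side, NULL)) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	return saw_invs == 1 || saw_iopte == 1;
}
```

With both fences in place, the outcome where both loads return 0 is
forbidden by the C11 model, so sb_run_once() should never return 0.
Dropping either fence makes that outcome permitted again, which is the
CPU-only analogue of the race the smp_mb() pair is there to close.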