Message-ID: <20251216135613.GB6079@nvidia.com>
Date: Tue, 16 Dec 2025 09:56:13 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Nicolin Chen <nicolinc@...dia.com>, will@...nel.org,
	jean-philippe@...aro.org, robin.murphy@....com, joro@...tes.org,
	balbirs@...dia.com, miko.lenczewski@....com, kevin.tian@...el.com,
	praan@...gle.com, linux-arm-kernel@...ts.infradead.org,
	iommu@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 6/7] iommu/arm-smmu-v3: Add arm_smmu_invs based
 arm_smmu_domain_inv_range()

On Tue, Dec 16, 2025 at 10:09:26AM +0100, Peter Zijlstra wrote:
> Anyway, if I understand the above correctly, the smb_mb() is for:
> 
>   arm_smmu_domain_inv_range() 		arm_smmu_install_new_domain_invs()
> 
>     [W] IOPTE				  [Wrel] smmu_domain->invs
>     smp_mb()				  smp_mb()
>     [Lacq] smmu_domain->invs		  [L] IOPTE
> 
> Right? But I'm not sure about your 'HW sees the new IOPTEs' claim;

Yes, the '[L] IOPTE' would be a DMA from HW.
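
To make the pairing concrete, here is a minimal sketch of the two
sides in kernel-style C. The function names follow the patch, but the
bodies, the invs walk, and the IOPTE cleanup step are illustrative,
not the actual driver code:

    /* Attach path: publish the new invalidation array. */
    static void arm_smmu_install_new_domain_invs(struct arm_smmu_domain *d,
                                                 struct arm_smmu_invs *new_invs)
    {
            /* [Wrel] smmu_domain->invs */
            smp_store_release(&d->invs, new_invs);
            /* Order the publish before any later IOPTE loads. */
            smp_mb();
            /* [L] IOPTE: the old page table may be read/freed from here. */
    }

    /* Invalidation path: called after the IOPTEs were written. */
    static void arm_smmu_domain_inv_range(struct arm_smmu_domain *d,
                                          unsigned long iova, size_t size)
    {
            /* [W] IOPTE happened in the caller. */
            smp_mb();
            /* [Lacq] smmu_domain->invs */
            struct arm_smmu_invs *invs = smp_load_acquire(&d->invs);

            /* ... walk invs and issue the range invalidations ... */
    }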

> that very much depends on what coherency domain the relevant hardware
> plays in. For smp_mb() to work, the hardware must be in the ISH
> domain, while typically devices are (if I remember my arrrrgh64
> correctly) in the OSH.

The '[W] IOPTE' sequence already includes a cache flush if the
inner/outer shareable domains are not coherent. If a cache flush was
required, then the smp_mb() must also order it; otherwise it only has
to order the store.
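
For the non-coherent case the '[W] IOPTE' step looks roughly like the
PTE sync in io-pgtable-arm; a sketch, where write_iopte() and its
signature are mine, not the driver's:

    static void write_iopte(struct device *dev, arm_lpae_iopte *ptep,
                            arm_lpae_iopte pte, dma_addr_t pte_dma)
    {
            WRITE_ONCE(*ptep, pte);                 /* [W] IOPTE */

            /* Walker is not coherent: clean the line out to the SMMU. */
            if (!dev_is_dma_coherent(dev))
                    dma_sync_single_for_device(dev, pte_dma, sizeof(pte),
                                               DMA_TO_DEVICE);
    }

The smp_mb() in the caller then has to order against whichever of the
two actually made the IOPTE visible to the walker.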

The page table code has always relied on this kind of ordering with
respect to DMA working; it would be completely broken if DMA did not
order with the barriers.

For example:

            CPU0                         CPU1
   store PMD
                                        read PMD
   store PTE 1                          store PTE 2
                                        dma memory barrier
                                        device reads 2
   dma memory barrier
   device reads 1


The 'device reads 2' thread must be guaranteed that the HW DMA
observes the PMD stored by CPU0. It relies on the same kind of
explicit cache flushing and barriers as this patch does.
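
In kernel terms the two threads might look like the sketch below. All
of the identifiers are illustrative; kick_device() stands in for
whatever doorbell or descriptor write makes the device start the DMA:

    /* CPU0 */
    WRITE_ONCE(*pmdp, pmd_val);     /* store PMD, publishes the PTE page */
    WRITE_ONCE(ptep[1], pte1);      /* store PTE 1 */
    wmb();                          /* dma memory barrier */
    kick_device(dev, iova1);        /* device reads 1 */

    /* CPU1 */
    pmd = READ_ONCE(*pmdp);         /* read PMD, sees CPU0's store */
    WRITE_ONCE(ptep[2], pte2);      /* store PTE 2 */
    wmb();                          /* dma memory barrier */
    kick_device(dev, iova2);        /* device reads 2 */

Because CPU1 observed CPU0's PMD store before executing its own
barrier, that barrier has to extend the ordering to the device's walk
as well; that transitivity is exactly what the page table code has
always depended on.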

Jason
