Message-ID: <CAGsJ_4zyascnpQ1cB-BMO9PDeeRZTBAh8Z-j-ip=RcxApa4zSg@mail.gmail.com>
Date: Sun, 28 Dec 2025 10:45:13 +1300
From: Barry Song <21cnbao@...il.com>
To: Leon Romanovsky <leon@...nel.org>
Cc: catalin.marinas@....com, m.szyprowski@...sung.com, robin.murphy@....com, 
	will@...nel.org, iommu@...ts.linux.dev, linux-arm-kernel@...ts.infradead.org, 
	linux-kernel@...r.kernel.org, xen-devel@...ts.xenproject.org, 
	Ada Couprie Diaz <ada.coupriediaz@....com>, Ard Biesheuvel <ardb@...nel.org>, Marc Zyngier <maz@...nel.org>, 
	Anshuman Khandual <anshuman.khandual@....com>, Ryan Roberts <ryan.roberts@....com>, 
	Suren Baghdasaryan <surenb@...gle.com>, Joerg Roedel <joro@...tes.org>, Juergen Gross <jgross@...e.com>, 
	Stefano Stabellini <sstabellini@...nel.org>, 
	Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>, Tangquan Zheng <zhengtangquan@...o.com>
Subject: Re: [PATCH v2 4/8] dma-mapping: Separate DMA sync issuing and
 completion waiting

On Sun, Dec 28, 2025 at 9:07 AM Leon Romanovsky <leon@...nel.org> wrote:
>
> On Sat, Dec 27, 2025 at 11:52:44AM +1300, Barry Song wrote:
> > From: Barry Song <baohua@...nel.org>
> >
> > Currently, arch_sync_dma_for_cpu and arch_sync_dma_for_device
> > always wait for the completion of each DMA buffer. That is,
> > issuing the DMA sync and waiting for completion is done in a
> > single API call.
> >
> > For scatter-gather lists with multiple entries, this means
> > issuing and waiting is repeated for each entry, which can hurt
> > performance. Architectures like ARM64 may be able to issue all
> > DMA sync operations for all entries first and then wait for
> > completion together.
> >
> > To address this, arch_sync_dma_for_* now issues DMA operations in
> > batch, followed by a flush. On ARM64, the flush is implemented
> > using a dsb instruction within arch_sync_dma_flush().
> >
> > For now, add arch_sync_dma_flush() after each
> > arch_sync_dma_for_*() call. arch_sync_dma_flush() is defined as a
> > no-op on all architectures except arm64, so this patch does not
> > change existing behavior. Subsequent patches will introduce true
> > batching for SG DMA buffers.
> >
> > Cc: Leon Romanovsky <leon@...nel.org>
> > Cc: Catalin Marinas <catalin.marinas@....com>
> > Cc: Will Deacon <will@...nel.org>
> > Cc: Marek Szyprowski <m.szyprowski@...sung.com>
> > Cc: Robin Murphy <robin.murphy@....com>
> > Cc: Ada Couprie Diaz <ada.coupriediaz@....com>
> > Cc: Ard Biesheuvel <ardb@...nel.org>
> > Cc: Marc Zyngier <maz@...nel.org>
> > Cc: Anshuman Khandual <anshuman.khandual@....com>
> > Cc: Ryan Roberts <ryan.roberts@....com>
> > Cc: Suren Baghdasaryan <surenb@...gle.com>
> > Cc: Joerg Roedel <joro@...tes.org>
> > Cc: Juergen Gross <jgross@...e.com>
> > Cc: Stefano Stabellini <sstabellini@...nel.org>
> > Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>
> > Cc: Tangquan Zheng <zhengtangquan@...o.com>
> > Signed-off-by: Barry Song <baohua@...nel.org>
> > ---
> >  arch/arm64/include/asm/cache.h |  6 ++++++
> >  arch/arm64/mm/dma-mapping.c    |  4 ++--
> >  drivers/iommu/dma-iommu.c      | 37 +++++++++++++++++++++++++---------
> >  drivers/xen/swiotlb-xen.c      | 24 ++++++++++++++--------
> >  include/linux/dma-map-ops.h    |  6 ++++++
> >  kernel/dma/direct.c            |  8 ++++++--
> >  kernel/dma/direct.h            |  9 +++++++--
> >  kernel/dma/swiotlb.c           |  4 +++-
> >  8 files changed, 73 insertions(+), 25 deletions(-)
>
> <...>
>
> > +#ifndef arch_sync_dma_flush
> > +static inline void arch_sync_dma_flush(void)
> > +{
> > +}
> > +#endif
>
> Over the weekend I realized a useful advantage of the ARCH_HAVE_* config
> options: they make it straightforward to inspect the entire DMA path simply
> by looking at the .config.

I am not quite sure how much this benefits users, as the same
information could also be obtained by grepping for
#define arch_sync_dma_flush in the source code.
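
As a rough illustration of the override idiom under discussion (a
userspace sketch, not the patch itself): an architecture supplies its
own arch_sync_dma_flush() and defines a macro of the same name, so the
generic #ifndef fallback quoted above is skipped. The flush_calls
counter is a hypothetical stand-in for the arm64 dsb, and the exact
arm64 definition is not quoted in this message.

```c
#include <assert.h>

/* Hypothetical counter so the sketch is observable in userspace. */
static int flush_calls;

/* What an architecture header might provide (assumed shape). */
static inline void arch_sync_dma_flush(void)
{
	flush_calls++;	/* arm64 would issue a dsb here instead */
}
/* Defining a macro with the same name marks the override as present. */
#define arch_sync_dma_flush arch_sync_dma_flush

/* Generic fallback, as quoted from the patch: only used if no override. */
#ifndef arch_sync_dma_flush
static inline void arch_sync_dma_flush(void)
{
}
#endif
```

With this layout, grepping for "#define arch_sync_dma_flush" finds
exactly the architectures that override the no-op default.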

>
> Thanks,
> Reviewed-by: Leon Romanovsky <leonro@...dia.com>

Thanks very much, Leon, for reviewing this over the weekend. One thing
you might have missed is that I place arch_sync_dma_flush() after all
arch_sync_dma_for_*() calls, for both single and sg cases. I also
used a Python script to scan the code and verify that every
arch_sync_dma_for_*() is followed by arch_sync_dma_flush(), to ensure
that no call is left out.

In the subsequent patches, for sg cases, the per-entry flush is
replaced by a single flush for the entire sg list. Each sg case has
different characteristics: some are straightforward, while others
are trickier and depend on additional context.
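
The difference between the two stages can be sketched with a small
userspace model (not kernel code). All model_* names and the counters
are hypothetical: model_sync_issue() stands in for an
arch_sync_dma_for_*() call that only issues cache maintenance, and
model_sync_flush() stands in for arch_sync_dma_flush().

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative counters standing in for real hardware work. */
static unsigned int issued_ops;	/* maintenance operations issued */
static unsigned int barriers;	/* completion waits (dsb on arm64) */

/* Hypothetical model of one scatter-gather entry. */
struct model_sg {
	const void *addr;
	size_t len;
};

/* Issue maintenance for one buffer without waiting for completion. */
static void model_sync_issue(const void *addr, size_t len)
{
	(void)addr;
	(void)len;
	issued_ops++;
}

/* Wait for everything issued so far to complete. */
static void model_sync_flush(void)
{
	barriers++;
}

/* This patch: every issue is immediately followed by a flush. */
static void model_sync_sg_per_entry(const struct model_sg *sg, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		model_sync_issue(sg[i].addr, sg[i].len);
		model_sync_flush();
	}
}

/* Later patches (as described): issue all entries, then flush once. */
static void model_sync_sg_batched(const struct model_sg *sg, size_t n)
{
	for (size_t i = 0; i < n; i++)
		model_sync_issue(sg[i].addr, sg[i].len);
	model_sync_flush();
}
```

For an n-entry list, both variants issue n operations, but the batched
variant waits once instead of n times.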

Thanks
Barry
