lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240324232215.GC20765@lst.de>
Date: Mon, 25 Mar 2024 00:22:15 +0100
From: Christoph Hellwig <hch@....de>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Christoph Hellwig <hch@....de>, Leon Romanovsky <leon@...nel.org>,
	Robin Murphy <robin.murphy@....com>,
	Marek Szyprowski <m.szyprowski@...sung.com>,
	Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
	Chaitanya Kulkarni <chaitanyak@...dia.com>,
	Jonathan Corbet <corbet@....net>, Jens Axboe <axboe@...nel.dk>,
	Keith Busch <kbusch@...nel.org>, Sagi Grimberg <sagi@...mberg.me>,
	Yishai Hadas <yishaih@...dia.com>,
	Shameer Kolothum <shameerali.kolothum.thodi@...wei.com>,
	Kevin Tian <kevin.tian@...el.com>,
	Alex Williamson <alex.williamson@...hat.com>,
	Jérôme Glisse <jglisse@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-block@...r.kernel.org, linux-rdma@...r.kernel.org,
	iommu@...ts.linux.dev, linux-nvme@...ts.infradead.org,
	kvm@...r.kernel.org, linux-mm@...ck.org,
	Bart Van Assche <bvanassche@....org>,
	Damien Le Moal <damien.lemoal@...nsource.wdc.com>,
	Amir Goldstein <amir73il@...il.com>,
	"josef@...icpanda.com" <josef@...icpanda.com>,
	"Martin K. Petersen" <martin.petersen@...cle.com>,
	"daniel@...earbox.net" <daniel@...earbox.net>,
	Dan Williams <dan.j.williams@...el.com>,
	"jack@...e.com" <jack@...e.com>, Zhu Yanjun <zyjzyj2000@...il.com>
Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to two
 steps

On Fri, Mar 22, 2024 at 03:43:30PM -0300, Jason Gunthorpe wrote:
> If we are going to make caller provided uniformity a requirement, lets
> imagine a formal memory type idea to help keep this a little
> abstracted?
> 
>  DMA_MEMORY_TYPE_NORMAL
>  DMA_MEMORY_TYPE_P2P_NOT_ACS
>  DMA_MEMORY_TYPE_ENCRYPTED
>  DMA_MEMORY_TYPE_BOUNCE_BUFFER  // ??
> 
> Then maybe the driver flow looks like:
> 
> 	if (transaction.memory_type == DMA_MEMORY_TYPE_NORMAL && dma_api_has_iommu(dev)) {

Add a nice helper to make this somewhat readable, but yes.

> 	} else if (transaction.memory_type == DMA_MEMORY_TYPE_P2P_NOT_ACS) {
> 		num_hwsgls = transcation.num_sgls;
> 		for_each_range(transaction, range) {
> 			hwsgl[i].addr = dma_api_p2p_not_acs_map(range.start_physical, range.length, p2p_memory_provider);
> 			hwsgl[i].len = range.size;
> 		}
> 	} else {
> 		/* Must be DMA_MEMORY_TYPE_NORMAL, DMA_MEMORY_TYPE_ENCRYPTED, DMA_MEMORY_TYPE_BOUNCE_BUFFER? */
> 		num_hwsgls = transcation.num_sgls;
> 		for_each_range(transaction, range) {
> 			hwsgl[i].addr = dma_api_map_cpu_page(range.start_page, range.length);
> 			hwsgl[i].len = range.size;
> 		}
>

And these two are really the same except that we call a different map
helper underneath.  So I think as far as the driver is concerned
they should be the same, the DMA API just needs to key off the
memory tap.

> And the hmm_range_fault case is sort of like:
> 
> 		struct dma_api_iommu_state state;
> 		dma_api_iommu_start(&state, mr.num_pages);
> 
> 		[..]
> 		hmm_range_fault(...)
> 		if (present)
> 			dma_link_page(&state, faulting_address_offset, page);
> 		else
> 			dma_unlink_page(&state, faulting_address_offset, page);
> 
> Is this looking closer?

Yes.

> > > So I take it as a requirement that RDMA MUST make single MR's out of a
> > > hodgepodge of page types. RDMA MRs cannot be split. Multiple MR's are
> > > not a functional replacement for a single MR.
> > 
> > But MRs consolidate multiple dma addresses anyway.
> 
> I'm not sure I understand this?

The RDMA MRs take a a list of PFNish address, (or SGLs with the
enhanced MRs from Mellanox) and give you back a single rkey/lkey.

> To go back to my main thesis - I would like a high performance low
> level DMA API that is capable enough that it could implement
> scatterlist dma_map_sg() and thus also implement any future
> scatterlist_v2, bio, hmm_range_fault or any other thing we come up
> with on top of it. This is broadly what I thought we agreed to at LSF
> last year.

I think the biggest underlying problem of the scatterlist based
DMA implementation for IOMMUs is that it's trying to handle to much,
that is magic coalescing even if the segments boundaries don't align
with the IOMMU page size.  If we can get rid of that misfeature I
think we'd greatly simply the API and implementation.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ