Message-ID: <761986ec0f404856b6f21c3feca67012@honor.com>
Date: Mon, 9 Jun 2025 09:32:20 +0000
From: wangtao <tao.wangtao@...or.com>
To: Christoph Hellwig <hch@...radead.org>, Christian König
	<christian.koenig@....com>
CC: "sumit.semwal@...aro.org" <sumit.semwal@...aro.org>, "kraxel@...hat.com"
	<kraxel@...hat.com>, "vivek.kasireddy@...el.com" <vivek.kasireddy@...el.com>,
	"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>, "brauner@...nel.org"
	<brauner@...nel.org>, "hughd@...gle.com" <hughd@...gle.com>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "amir73il@...il.com"
	<amir73il@...il.com>, "benjamin.gaignard@...labora.com"
	<benjamin.gaignard@...labora.com>, "Brian.Starkey@....com"
	<Brian.Starkey@....com>, "jstultz@...gle.com" <jstultz@...gle.com>,
	"tjmercier@...gle.com" <tjmercier@...gle.com>, "jack@...e.cz" <jack@...e.cz>,
	"baolin.wang@...ux.alibaba.com" <baolin.wang@...ux.alibaba.com>,
	"linux-media@...r.kernel.org" <linux-media@...r.kernel.org>,
	"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
	"linaro-mm-sig@...ts.linaro.org" <linaro-mm-sig@...ts.linaro.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>, "wangbintian(BintianWang)"
	<bintian.wang@...or.com>, yipengxiang <yipengxiang@...or.com>, liulu 00013167
	<liulu.liu@...or.com>, hanfeng 00012985 <feng.han@...or.com>
Subject: RE: [PATCH v4 0/4] Implement dmabuf direct I/O via copy_file_range



> -----Original Message-----
> From: Christoph Hellwig <hch@...radead.org>
> Sent: Monday, June 9, 2025 12:35 PM
> To: Christian König <christian.koenig@....com>
> Cc: wangtao <tao.wangtao@...or.com>; Christoph Hellwig
> <hch@...radead.org>; sumit.semwal@...aro.org; kraxel@...hat.com;
> vivek.kasireddy@...el.com; viro@...iv.linux.org.uk; brauner@...nel.org;
> hughd@...gle.com; akpm@...ux-foundation.org; amir73il@...il.com;
> benjamin.gaignard@...labora.com; Brian.Starkey@....com;
> jstultz@...gle.com; tjmercier@...gle.com; jack@...e.cz;
> baolin.wang@...ux.alibaba.com; linux-media@...r.kernel.org; dri-
> devel@...ts.freedesktop.org; linaro-mm-sig@...ts.linaro.org; linux-
> kernel@...r.kernel.org; linux-fsdevel@...r.kernel.org; linux-
> mm@...ck.org; wangbintian(BintianWang) <bintian.wang@...or.com>;
> yipengxiang <yipengxiang@...or.com>; liulu 00013167
> <liulu.liu@...or.com>; hanfeng 00012985 <feng.han@...or.com>
> Subject: Re: [PATCH v4 0/4] Implement dmabuf direct I/O via
> copy_file_range
> 
> On Fri, Jun 06, 2025 at 01:20:48PM +0200, Christian König wrote:
> > > dmabuf acts as a driver and shouldn't be handled by VFS, so I made
> > > dmabuf implement copy_file_range callbacks to support direct I/O
> > > zero-copy. I'm open to both approaches. What's the preference of VFS
> > > experts?
> >
> > That would probably be illegal. Using the sg_table in the DMA-buf
> > implementation turned out to be a mistake.
> 
> Two things here should not be directly conflated.  Using the sg_table was
> a huge mistake, and we should try to move dmabuf to switch that to a pure
I'm a bit confused: don't dmabuf importers need to traverse the sg_table
to access folios or dma_addr/len pairs? Do you mean restricting sg_table
access (e.g., only via iov_iter), or are you proposing an alternative
approach?
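
For illustration, a minimal sketch of the traversal in question, using the
standard helpers (dma_buf_map_attachment(), for_each_sgtable_dma_sg());
'attach' is an assumed, already-created attachment:

```c
/* Sketch of a typical importer walking the mapped sg_table to obtain
 * DMA address/length pairs; 'attach' is an assumed dma_buf_attachment. */
struct sg_table *sgt;
struct scatterlist *sg;
int i;

sgt = dma_buf_map_attachment(attach, DMA_TO_DEVICE);
if (IS_ERR(sgt))
	return PTR_ERR(sgt);

for_each_sgtable_dma_sg(sgt, sg, i) {
	dma_addr_t addr = sg_dma_address(sg);
	unsigned int len = sg_dma_len(sg);

	/* program the device's DMA engine with (addr, len) ... */
}

dma_buf_unmap_attachment(attach, sgt, DMA_TO_DEVICE);
```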

> dma_addr_t/len array now that the new DMA API supporting that has been
> merged.  Is there any chance the dma-buf maintainers could start to kick this
> off?  I'm of course happy to assist.
> 
> But that notwithstanding, dma-buf is THE buffer sharing mechanism in the
> kernel, and we should promote it instead of reinventing it badly.
> And there is a use case for having a fully DMA mapped buffer in the block
> layer and I/O path, especially on systems with an IOMMU.
> So having an iov_iter backed by a dma-buf would be extremely helpful.
> That's mostly lib/iov_iter.c code, not VFS, though.
Are you suggesting adding a new ITER_DMABUF type to iov_iter, or
converting a dmabuf into an existing ITER_BVEC within iov_iter?
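
To make the second option concrete, here is a purely hypothetical sketch:
exposing a udmabuf's folios to the direct I/O path as an ITER_BVEC iter.
The 'folios'/'pagecount' fields are exporter internals, not an existing
API, and this only works for exporters that actually have struct folios:

```c
/* Hypothetical: describe a udmabuf's backing folios as an ITER_BVEC
 * iov_iter, avoiding get_user_pages() entirely. 'ubuf' and its
 * 'folios'/'pagecount' fields are assumed exporter internals. */
static void udmabuf_init_iter(struct udmabuf *ubuf, struct iov_iter *iter,
			      struct bio_vec *bvec, size_t len)
{
	pgoff_t i;

	for (i = 0; i < ubuf->pagecount; i++)
		bvec_set_folio(&bvec[i], ubuf->folios[i], PAGE_SIZE, 0);

	iov_iter_bvec(iter, ITER_DEST, bvec, ubuf->pagecount, len);
}
```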

> 
> > The question Christoph raised was rather why is your CPU so slow that
> > walking the page tables has a significant overhead compared to the
> > actual I/O?
> 
> Yes, that's really puzzling and should be addressed first.
With a fast CPU (3GHz in these tests), GUP (get_user_pages) overhead
is relatively low:
|    32x32MB Read 1024MB    |Creat-ms|Close-ms|  I/O-ms|I/O-MB/s| I/O%
|---------------------------|--------|--------|--------|--------|-----
| 1)        memfd direct R/W|      1 |    118 |    312 |   3448 | 100%
| 2)      u+memfd direct R/W|    196 |    123 |    295 |   3651 | 105%
| 3) u+memfd direct sendfile|    175 |    102 |    976 |   1100 |  31%
| 4)   u+memfd direct splice|    173 |    103 |    443 |   2428 |  70%
| 5)      udmabuf buffer R/W|    183 |    100 |    453 |   2375 |  68%
| 6)       dmabuf buffer R/W|     34 |      4 |    427 |   2519 |  73%
| 7)    udmabuf direct c_f_r|    200 |    102 |    278 |   3874 | 112%
| 8)     dmabuf direct c_f_r|     36 |      5 |    269 |   4002 | 116%

With a slower CPU (1GHz in these tests), GUP overhead becomes much
more significant:
|    32x32MB Read 1024MB    |Creat-ms|Close-ms|  I/O-ms|I/O-MB/s| I/O%
|---------------------------|--------|--------|--------|--------|-----
| 1)        memfd direct R/W|      2 |    393 |    969 |   1109 | 100%
| 2)      u+memfd direct R/W|    592 |    424 |    570 |   1884 | 169%
| 3) u+memfd direct sendfile|    587 |    356 |   2229 |    481 |  43%
| 4)   u+memfd direct splice|    568 |    352 |    795 |   1350 | 121%
| 5)      udmabuf buffer R/W|    597 |    343 |   1238 |    867 |  78%
| 6)       dmabuf buffer R/W|     69 |     13 |   1128 |    952 |  85%
| 7)    udmabuf direct c_f_r|    595 |    345 |    372 |   2889 | 260%
| 8)     dmabuf direct c_f_r|     80 |     13 |    274 |   3929 | 354%

Regards,
Wangtao.
