Message-ID: <aEZkjA1L-dP_Qt3U@infradead.org>
Date: Sun, 8 Jun 2025 21:35:24 -0700
From: Christoph Hellwig <hch@...radead.org>
To: Christian König <christian.koenig@....com>
Cc: wangtao <tao.wangtao@...or.com>, Christoph Hellwig <hch@...radead.org>,
"sumit.semwal@...aro.org" <sumit.semwal@...aro.org>,
"kraxel@...hat.com" <kraxel@...hat.com>,
"vivek.kasireddy@...el.com" <vivek.kasireddy@...el.com>,
"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
"brauner@...nel.org" <brauner@...nel.org>,
"hughd@...gle.com" <hughd@...gle.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"amir73il@...il.com" <amir73il@...il.com>,
"benjamin.gaignard@...labora.com" <benjamin.gaignard@...labora.com>,
"Brian.Starkey@....com" <Brian.Starkey@....com>,
"jstultz@...gle.com" <jstultz@...gle.com>,
"tjmercier@...gle.com" <tjmercier@...gle.com>,
"jack@...e.cz" <jack@...e.cz>,
"baolin.wang@...ux.alibaba.com" <baolin.wang@...ux.alibaba.com>,
"linux-media@...r.kernel.org" <linux-media@...r.kernel.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
"linaro-mm-sig@...ts.linaro.org" <linaro-mm-sig@...ts.linaro.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"wangbintian(BintianWang)" <bintian.wang@...or.com>,
yipengxiang <yipengxiang@...or.com>,
liulu 00013167 <liulu.liu@...or.com>,
hanfeng 00012985 <feng.han@...or.com>
Subject: Re: [PATCH v4 0/4] Implement dmabuf direct I/O via copy_file_range
On Fri, Jun 06, 2025 at 01:20:48PM +0200, Christian König wrote:
> > dmabuf acts as a driver and shouldn't be handled by VFS, so I made
> > dmabuf implement copy_file_range callbacks to support direct I/O
> > zero-copy. I'm open to both approaches. What's the preference of
> > VFS experts?
>
> That would probably be illegal. Using the sg_table in the DMA-buf
> implementation turned out to be a mistake.
Two things here that should not be conflated. Using the sg_table
was a huge mistake, and we should try to switch dma-buf over to a
pure dma_addr_t/len array now that the new DMA API supporting that
has been merged. Is there any chance the dma-buf maintainers could
start to kick this off? I'm of course happy to assist.
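
As a rough illustration of what a "pure dma_addr_t/len array" could look
like, here is a minimal userspace sketch. It is not code from this thread
or from the kernel: demo_dma_addr_t, struct demo_dma_range,
demo_range_add() and demo_range_total() are hypothetical names, and the
merging of contiguous ranges is an assumption about how an exporter might
want to fill such an array.

```c
/* Userspace illustration only; none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_dma_addr_t;	/* stand-in for the kernel's dma_addr_t */

/* One device-visible range: no struct page, no CPU address. */
struct demo_dma_range {
	demo_dma_addr_t addr;	/* DMA (bus/IOVA) address of the range */
	size_t len;		/* length of the range in bytes */
};

/*
 * Append a range to a caller-provided array of capacity 'cap', merging it
 * into the previous entry when the two are contiguous in DMA address space.
 * Returns the new number of entries, or 0 if the array is full.
 */
static size_t demo_range_add(struct demo_dma_range *r, size_t n, size_t cap,
			     demo_dma_addr_t addr, size_t len)
{
	assert(r != NULL && len > 0);

	if (n > 0 && r[n - 1].addr + r[n - 1].len == addr) {
		r[n - 1].len += len;	/* extend the previous range */
		return n;
	}
	if (n == cap)
		return 0;		/* no room for a new entry */

	r[n].addr = addr;
	r[n].len = len;
	return n + 1;
}

/* Total number of bytes described by the first 'n' entries. */
static size_t demo_range_total(const struct demo_dma_range *r, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += r[i].len;
	return total;
}
```

With an IOMMU, a large buffer may map to a single contiguous DMA range, so
an array like this can end up with very few entries regardless of how many
pages back the buffer.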
But that notwithstanding, dma-buf is THE buffer sharing mechanism in
the kernel, and we should promote it instead of reinventing it badly.
And there is a use case for having a fully DMA mapped buffer in the
block layer and I/O path, especially on systems with an IOMMU.
So having an iov_iter backed by a dma-buf would be extremely helpful.
That's mostly lib/iov_iter.c code, not VFS, though.
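
To make the iov_iter idea a bit more concrete, below is a hedged userspace
sketch of a cursor over DMA ranges. It is loosely modelled on how an
iterator tracks a position and a remaining byte count, but it is not the
kernel's iov_iter and does not use its API: struct demo_dma_range, struct
demo_dma_iter, demo_iter_init(), demo_iter_advance() and demo_iter_addr()
are all hypothetical names, and ranges are assumed to have non-zero length.

```c
/* Userspace illustration only; none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_dma_range {
	uint64_t addr;	/* DMA address of the range */
	size_t len;	/* length in bytes, assumed non-zero */
};

/* Hypothetical cursor over an array of DMA ranges. */
struct demo_dma_iter {
	const struct demo_dma_range *ranges;	/* current range */
	size_t nr_ranges;	/* ranges left, including the current one */
	size_t offset;		/* bytes already consumed in the current range */
	size_t count;		/* total bytes left in the iterator */
};

static void demo_iter_init(struct demo_dma_iter *it,
			   const struct demo_dma_range *r, size_t n)
{
	it->ranges = r;
	it->nr_ranges = n;
	it->offset = 0;
	it->count = 0;
	for (size_t i = 0; i < n; i++) {
		assert(r[i].len > 0);
		it->count += r[i].len;
	}
}

/* Consume up to 'bytes' bytes; returns the number actually consumed. */
static size_t demo_iter_advance(struct demo_dma_iter *it, size_t bytes)
{
	size_t done = 0;

	if (bytes > it->count)
		bytes = it->count;	/* never run past the last range */

	while (done < bytes) {
		size_t left = it->ranges->len - it->offset;
		size_t step = bytes - done < left ? bytes - done : left;

		it->offset += step;
		done += step;
		if (it->offset == it->ranges->len) {
			/* current range exhausted, move to the next one */
			it->ranges++;
			it->nr_ranges--;
			it->offset = 0;
		}
	}
	it->count -= done;
	return done;
}

/* DMA address at which the next I/O segment would start. */
static uint64_t demo_iter_addr(const struct demo_dma_iter *it)
{
	assert(it->count > 0);
	return it->ranges->addr + it->offset;
}
```

A consumer such as a block driver would, in this sketch, read
demo_iter_addr() to program a transfer and then call demo_iter_advance()
by the number of bytes it submitted, without ever touching page structures.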
> The question Christoph raised was rather why is your CPU so slow
> that walking the page tables has a significant overhead compared to
> the actual I/O?
Yes, that's really puzzling and should be addressed first.