Message-ID: <aEg1BZj-HzbgWKsx@infradead.org>
Date: Tue, 10 Jun 2025 06:37:09 -0700
From: Christoph Hellwig <hch@...radead.org>
To: Christian König <christian.koenig@....com>
Cc: wangtao <tao.wangtao@...or.com>, Christoph Hellwig <hch@...radead.org>,
"sumit.semwal@...aro.org" <sumit.semwal@...aro.org>,
"kraxel@...hat.com" <kraxel@...hat.com>,
"vivek.kasireddy@...el.com" <vivek.kasireddy@...el.com>,
"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
"brauner@...nel.org" <brauner@...nel.org>,
"hughd@...gle.com" <hughd@...gle.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"amir73il@...il.com" <amir73il@...il.com>,
"benjamin.gaignard@...labora.com" <benjamin.gaignard@...labora.com>,
"Brian.Starkey@....com" <Brian.Starkey@....com>,
"jstultz@...gle.com" <jstultz@...gle.com>,
"tjmercier@...gle.com" <tjmercier@...gle.com>,
"jack@...e.cz" <jack@...e.cz>,
"baolin.wang@...ux.alibaba.com" <baolin.wang@...ux.alibaba.com>,
"linux-media@...r.kernel.org" <linux-media@...r.kernel.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
"linaro-mm-sig@...ts.linaro.org" <linaro-mm-sig@...ts.linaro.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"wangbintian(BintianWang)" <bintian.wang@...or.com>,
yipengxiang <yipengxiang@...or.com>,
liulu 00013167 <liulu.liu@...or.com>,
hanfeng 00012985 <feng.han@...or.com>
Subject: Re: [PATCH v4 0/4] Implement dmabuf direct I/O via copy_file_range
On Tue, Jun 10, 2025 at 12:52:18PM +0200, Christian König wrote:
> >> dma_addr_t/len array now that the new DMA API supporting that has been
> >> merged. Is there any chance the dma-buf maintainers could start to kick this
> >> off? I'm of course happy to assist.
>
> Work on that is already underway for some time.
>
> Most GPU drivers already do the sg_table -> DMA array conversion; I
> need to push on the remaining ones to clean this up.
Do you have a pointer?
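For what it's worth, the kind of conversion I'd expect is roughly the
following minimal sketch (struct dma_vec and the helper name are
illustrative here, not an existing API):

struct dma_vec {
	dma_addr_t	addr;
	unsigned int	len;
};

/* flatten a DMA-mapped sg_table into a plain address/length array */
static struct dma_vec *sgt_to_dma_vec(struct sg_table *sgt,
				      unsigned int *nr)
{
	struct scatterlist *sg;
	struct dma_vec *vec;
	unsigned int i;

	vec = kmalloc_array(sgt->nents, sizeof(*vec), GFP_KERNEL);
	if (!vec)
		return NULL;

	/* walk the DMA-mapped entries, not the CPU-side ones */
	for_each_sgtable_dma_sg(sgt, sg, i) {
		vec[i].addr = sg_dma_address(sg);
		vec[i].len = sg_dma_len(sg);
	}
	*nr = sgt->nents;
	return vec;
}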
> >> Yes, that's really puzzling and should be addressed first.
> > With a high-performance CPU (e.g. 3GHz), GUP (get_user_pages)
> > overhead is relatively low, as observed in our 3GHz tests.
>
> Even on a low-end CPU, walking the page tables and grabbing
> references shouldn't be that much of an overhead.
Yes.
>
> There must be some reason why you see so much CPU overhead, e.g.
> compound pages being broken up or similar, which should not happen
> in the first place.
Unfortunately pin_user_pages outputs an array of PAGE_SIZE struct
pages (modulo the offset and a shorter last length).  The block
direct I/O code has fairly recently grown code to reassemble folios
from them, which did speed up some workloads.
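Roughly, that reassembly boils down to coalescing physically
contiguous PAGE_SIZE entries back into larger segments.  A very
simplified sketch (struct phys_seg and the helper name are made up
for illustration, this is not the actual block layer code):

struct phys_seg {
	phys_addr_t	addr;
	size_t		len;
};

/*
 * Pin the user range and merge adjacent, physically contiguous
 * PAGE_SIZE pages into larger segments.  The caller releases the
 * pins with unpin_user_pages().
 */
static int pin_and_coalesce(unsigned long start, int nr_pages,
			    struct page **pages, struct phys_seg *segs)
{
	int i, nsegs = 0;
	long pinned;

	pinned = pin_user_pages_fast(start, nr_pages, FOLL_WRITE, pages);
	if (pinned <= 0)
		return pinned;

	segs[0].addr = page_to_phys(pages[0]);
	segs[0].len = PAGE_SIZE;

	for (i = 1; i < pinned; i++) {
		phys_addr_t phys = page_to_phys(pages[i]);

		/* extend the previous segment if physically contiguous */
		if (phys == segs[nsegs].addr + segs[nsegs].len) {
			segs[nsegs].len += PAGE_SIZE;
		} else {
			nsegs++;
			segs[nsegs].addr = phys;
			segs[nsegs].len = PAGE_SIZE;
		}
	}
	return nsegs + 1;
}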
Is this test using the block device or iomap direct I/O code? What
kernel version is it run on?