Message-ID: <5c11b50c-2e36-4fd5-943c-086f55adffa8@amd.com>
Date: Fri, 16 May 2025 10:36:11 +0200
From: Christian König <christian.koenig@....com>
To: wangtao <tao.wangtao@...or.com>,
"sumit.semwal@...aro.org" <sumit.semwal@...aro.org>,
"benjamin.gaignard@...labora.com" <benjamin.gaignard@...labora.com>,
"Brian.Starkey@....com" <Brian.Starkey@....com>,
"jstultz@...gle.com" <jstultz@...gle.com>,
"tjmercier@...gle.com" <tjmercier@...gle.com>
Cc: "linux-media@...r.kernel.org" <linux-media@...r.kernel.org>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
"linaro-mm-sig@...ts.linaro.org" <linaro-mm-sig@...ts.linaro.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"wangbintian(BintianWang)" <bintian.wang@...or.com>,
yipengxiang <yipengxiang@...or.com>, liulu 00013167 <liulu.liu@...or.com>,
hanfeng 00012985 <feng.han@...or.com>
Subject: Re: [PATCH 2/2] dmabuf/heaps: implement DMA_BUF_IOCTL_RW_FILE for
system_heap
On 5/16/25 09:40, wangtao wrote:
>
>
>> -----Original Message-----
>> From: Christian König <christian.koenig@....com>
>> Sent: Thursday, May 15, 2025 10:26 PM
>> To: wangtao <tao.wangtao@...or.com>; sumit.semwal@...aro.org;
>> benjamin.gaignard@...labora.com; Brian.Starkey@....com;
>> jstultz@...gle.com; tjmercier@...gle.com
>> Cc: linux-media@...r.kernel.org; dri-devel@...ts.freedesktop.org; linaro-
>> mm-sig@...ts.linaro.org; linux-kernel@...r.kernel.org;
>> wangbintian(BintianWang) <bintian.wang@...or.com>; yipengxiang
>> <yipengxiang@...or.com>; liulu 00013167 <liulu.liu@...or.com>; hanfeng
>> 00012985 <feng.han@...or.com>
>> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
>> DMA_BUF_IOCTL_RW_FILE for system_heap
>>
>> On 5/15/25 16:03, wangtao wrote:
>>> [wangtao] My Test Configuration (CPU 1GHz, 5-test average):
>>> Allocation: 32x32MB buffer creation
>>> - dmabuf 53ms vs. udmabuf 694ms (~13x slower)
>>> - Note: shmem shows excessive allocation time
>>
>> Yeah, that is something already noted by others as well. But that is
>> orthogonal.
>>
>>>
>>> Read 1024MB File:
>>> - dmabuf direct 326ms vs. udmabuf direct 461ms (40% slower)
>>> - Note: pin_user_pages_fast consumes the majority of CPU cycles
>>>
>>> Key function call timing: See details below.
>>
>> Those aren't valid; you are comparing different functionalities here.
>>
>> Please try using udmabuf with sendfile() as confirmed to be working by T.J.
> [wangtao] Using buffered I/O for dmabuf file reads/writes requires one
> memory copy. Direct I/O removes that copy and enables zero-copy. The
> sendfile system call reduces the number of copies from two (read +
> write) to one; with udmabuf, however, sendfile still keeps at least one
> copy, so zero-copy is not achieved.
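>
> A minimal sketch of the udmabuf direct-read path from userspace (the
> file path and 32MB size are test assumptions, error handling omitted).
> The O_DIRECT read makes the kernel pin the mapped pages (hence the
> pin_user_pages_fast cost noted earlier) and DMA the file data into
> them with no page-cache copy:
>
>   #define _GNU_SOURCE                  /* O_DIRECT, memfd_create */
>   #include <fcntl.h>
>   #include <sys/mman.h>
>   #include <unistd.h>
>
>   size_t buf_size = 32UL << 20;        /* 32MB, matching the test */
>   int memfd = memfd_create("ubuf", MFD_ALLOW_SEALING); /* backs the udmabuf */
>   ftruncate(memfd, buf_size);
>
>   int file_fd = open("/data/file", O_RDONLY | O_DIRECT);
>   void *buf = mmap(NULL, buf_size, PROT_READ | PROT_WRITE,
>                    MAP_SHARED, memfd, 0);
>   /* zero-copy: the disk DMAs straight into the pinned pages */
>   ssize_t n = read(file_fd, buf, buf_size);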
Then please work on fixing this.
Regards,
Christian.
>
> If udmabuf sendfile uses buffered I/O (the file page cache), its read
> latency matches a dmabuf buffered read, but the allocation time is much
> longer. With direct I/O, sendfile's default 16-page internal pipe makes
> it even slower than buffered I/O.
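>
> For reference: sendfile() offers no way to grow the internal pipe it
> splices through, but an explicit splice() loop over a user-created pipe
> can, via F_SETPIPE_SZ. A rough, untested sketch only to show the knob
> exists (file path and sizes are assumptions, error handling omitted):
>
>   #define _GNU_SOURCE                  /* splice, F_SETPIPE_SZ */
>   #include <fcntl.h>
>   #include <sys/mman.h>
>   #include <unistd.h>
>
>   int back_fd = open("/data/file", O_RDONLY);
>   int memfd = memfd_create("ubuf", 0);
>   size_t left = 32UL << 20;
>   ftruncate(memfd, left);
>
>   int pfd[2];
>   pipe(pfd);
>   fcntl(pfd[0], F_SETPIPE_SZ, 1 << 20); /* 1MiB instead of 16 pages */
>
>   off_t off = 0;
>   while (left > 0) {
>           /* file -> pipe; one side of splice() must be a pipe */
>           ssize_t in = splice(back_fd, &off, pfd[1], NULL, left, 0);
>           if (in <= 0)
>                   break;
>           for (ssize_t done = 0; done < in; ) {
>                   /* pipe -> memfd at memfd's current file offset */
>                   ssize_t out = splice(pfd[0], NULL, memfd, NULL,
>                                        in - done, 0);
>                   if (out <= 0)
>                           break;
>                   done += out;
>           }
>           left -= in;
>   }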
>
> Test data shows:
> udmabuf direct read is much faster than udmabuf sendfile.
> dmabuf direct read outperforms udmabuf direct read by a large margin.
>
> Issue: after a udmabuf has been mapped via map_dma_buf, applications
> that use the memfd or the udmabuf for direct I/O may cause errors, but
> there are no safeguards to prevent this.
>
> Test: allocate 32x32MB buffers and read a 1024MB file:
> Metric | alloc (ms) | read (ms) | total (ms)
> -----------------------|------------|-----------|-----------
> udmabuf buffer read | 539 | 2017 | 2555
> udmabuf direct read | 522 | 658 | 1179
> udmabuf buffer sendfile| 505 | 1040 | 1546
> udmabuf direct sendfile| 510 | 2269 | 2780
> dmabuf buffer read | 51 | 1068 | 1118
> dmabuf direct read | 52 | 297 | 349
>
> udmabuf sendfile test steps (sketched in code below):
> 1. Open the data file (1024MB), getting back_fd
> 2. Create a memfd (32MB)            # loop steps 2-6
> 3. Allocate a udmabuf from the memfd
> 4. Call sendfile(memfd, back_fd)
> 5. Close the memfd after sendfile
> 6. Close the udmabuf
> 7. Close back_fd
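>
> A compressed sketch of those steps (the udmabuf API from
> <linux/udmabuf.h> is real; the file path and loop bounds are test
> assumptions, error handling omitted):
>
>   #define _GNU_SOURCE                            /* memfd_create */
>   #include <fcntl.h>
>   #include <linux/udmabuf.h>
>   #include <sys/ioctl.h>
>   #include <sys/mman.h>
>   #include <sys/sendfile.h>
>   #include <unistd.h>
>
>   #define CHUNK (32UL << 20)                     /* 32MB per buffer */
>
>   int back_fd = open("/data/file", O_RDONLY);    /* step 1 */
>   int dev_fd  = open("/dev/udmabuf", O_RDWR);
>   off_t off = 0;
>
>   for (int i = 0; i < 32; i++) {                 /* loop steps 2-6 */
>           int memfd = memfd_create("ubuf", MFD_ALLOW_SEALING);
>           ftruncate(memfd, CHUNK);
>           fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK); /* udmabuf requires it */
>
>           struct udmabuf_create create = {
>                   .memfd = memfd, .offset = 0, .size = CHUNK,
>           };
>           int buf_fd = ioctl(dev_fd, UDMABUF_CREATE, &create); /* step 3 */
>
>           sendfile(memfd, back_fd, &off, CHUNK); /* step 4 */
>           close(memfd);                          /* step 5 */
>           close(buf_fd);                         /* step 6 */
>   }
>   close(back_fd);                                /* step 7 */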
>
>>
>> Regards,
>> Christian.
>