Message-ID: <4a53b6bf-9273-4e77-9882-644faafa200a@amd.com>
Date: Thu, 22 May 2025 13:57:45 +0200
From: Christian König <christian.koenig@....com>
To: wangtao <tao.wangtao@...or.com>, "T.J. Mercier" <tjmercier@...gle.com>
Cc: "sumit.semwal@...aro.org" <sumit.semwal@...aro.org>,
 "benjamin.gaignard@...labora.com" <benjamin.gaignard@...labora.com>,
 "Brian.Starkey@....com" <Brian.Starkey@....com>,
 "jstultz@...gle.com" <jstultz@...gle.com>,
 "linux-media@...r.kernel.org" <linux-media@...r.kernel.org>,
 "dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
 "linaro-mm-sig@...ts.linaro.org" <linaro-mm-sig@...ts.linaro.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "wangbintian(BintianWang)" <bintian.wang@...or.com>,
 yipengxiang <yipengxiang@...or.com>, liulu 00013167 <liulu.liu@...or.com>,
 hanfeng 00012985 <feng.han@...or.com>,
 "amir73il@...il.com" <amir73il@...il.com>
Subject: Re: [PATCH 2/2] dmabuf/heaps: implement DMA_BUF_IOCTL_RW_FILE for
 system_heap

On 5/22/25 10:02, wangtao wrote:
>> -----Original Message-----
>> From: Christian König <christian.koenig@....com>
>> Sent: Wednesday, May 21, 2025 7:57 PM
>> To: wangtao <tao.wangtao@...or.com>; T.J. Mercier
>> <tjmercier@...gle.com>
>> Cc: sumit.semwal@...aro.org; benjamin.gaignard@...labora.com;
>> Brian.Starkey@....com; jstultz@...gle.com; linux-media@...r.kernel.org;
>> dri-devel@...ts.freedesktop.org; linaro-mm-sig@...ts.linaro.org; linux-
>> kernel@...r.kernel.org; wangbintian(BintianWang)
>> <bintian.wang@...or.com>; yipengxiang <yipengxiang@...or.com>; liulu
>> 00013167 <liulu.liu@...or.com>; hanfeng 00012985 <feng.han@...or.com>;
>> amir73il@...il.com
>> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
>> DMA_BUF_IOCTL_RW_FILE for system_heap
>>
>> On 5/21/25 12:25, wangtao wrote:
>>> [wangtao] I previously explained that
>>> read/sendfile/splice/copy_file_range
>>> syscalls can't achieve dmabuf direct IO zero-copy.
>>
>> And why can't you work on improving those syscalls instead of creating a new
>> IOCTL?
>>
> [wangtao] As I mentioned in previous emails, these syscalls cannot
> achieve dmabuf zero-copy due to technical constraints.

Yeah, and why can't you work on removing those technical constraints?

What is blocking you from improving the sendfile system call or proposing a patch to remove the copy_file_range restrictions?

Regards,
Christian.
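
For reference, the generic file-to-file path being discussed can be exercised from userspace with sendfile(2). The following is only a minimal sketch, assuming Linux (where the destination may be a regular file) and Python's os.sendfile wrapper. It does not involve dmabuf at all, and the helper names sendfile_copy and demo are hypothetical, chosen for illustration:

```python
import os
import tempfile


def sendfile_copy(src_path, dst_path, chunk=1 << 20):
    """Copy src_path to dst_path with sendfile(2); return bytes copied."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        while copied < size:
            # sendfile may transfer fewer bytes than requested, so loop.
            n = os.sendfile(dst.fileno(), src.fileno(), copied,
                            min(chunk, size - copied))
            if n == 0:  # unexpected end of input
                break
            copied += n
    return copied


def demo():
    """Round-trip a small random payload through sendfile_copy."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        payload = os.urandom(3 * 1024 * 1024 + 123)
        with open(src, "wb") as f:
            f.write(payload)
        copied = sendfile_copy(src, dst)
        with open(dst, "rb") as f:
            return copied == len(payload) and f.read() == payload
```

Whether the kernel performs this transfer without an intermediate CPU copy depends on the file types at both ends, which is the point under debate in this thread.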

> Could you
> specify the technical points, code, or principles that need
> optimization? 
> 
> Let me explain again why these syscalls can't work:
> 1. read() syscall
>    - dmabuf fops lacks read callback implementation. Even if implemented,
>      file_fd info cannot be transferred
>    - read(file_fd, dmabuf_ptr, len) with remap_pfn_range-based mmap
>      cannot access dmabuf_buf pages, forcing buffer-mode reads
> 
> 2. sendfile() syscall
>    - Requires a CPU copy from the page cache to a memory file (tmpfs/shmem):
>      [DISK] --DMA--> [page cache] --CPU copy--> [MEMORY file]
>    - CPU overhead (both buffer/direct modes involve copies):
>      55.08% do_sendfile
>     |- 55.08% do_splice_direct
>     |-|- 55.08% splice_direct_to_actor
>     |-|-|- 22.51% copy_splice_read
>     |-|-|-|- 16.57% f2fs_file_read_iter
>     |-|-|-|-|- 15.12% __iomap_dio_rw
>     |-|-|- 32.33% direct_splice_actor
>     |-|-|-|- 32.11% iter_file_splice_write
>     |-|-|-|-|- 28.42% vfs_iter_write
>     |-|-|-|-|-|- 28.42% do_iter_write
>     |-|-|-|-|-|-|- 28.39% shmem_file_write_iter
>     |-|-|-|-|-|-|-|- 24.62% generic_perform_write
>     |-|-|-|-|-|-|-|-|- 18.75% __pi_memmove
> 
> 3. splice() requires one end to be a pipe, incompatible with regular files or dmabuf.
> 
> 4. copy_file_range()
>    - Blocked by cross-FS restrictions (Amir's commit 868f9f2f8e00)
>    - Even without this restriction, implementing the copy_file_range
>      callback in dmabuf fops would only allow dmabuf reads from regular
>      files. This is because copy_file_range relies on
>      file_out->f_op->copy_file_range, which cannot support dmabuf write
>      operations to regular files.
> 
> Test results confirm these limitations:
> T.J. Mercier's 1G from ext4 on 6.12.20 | read/sendfile (ms) w/ 3 > drop_caches
> ------------------------|-------------------
> udmabuf buffer read     | 1210
> udmabuf direct read     | 671
> udmabuf buffer sendfile | 1096
> udmabuf direct sendfile | 2340
> 
> My 3GHz CPU tests (cache cleared):
> Method                | alloc | read  | vs. (%)
> -----------------------------------------------
> udmabuf buffer read   | 135   | 546   | 180%
> udmabuf direct read   | 159   | 300   | 99%
> udmabuf buffer sendfile | 134 | 303   | 100%
> udmabuf direct sendfile | 141 | 912   | 301%
> dmabuf buffer read    | 22    | 362   | 119%
> my patch direct read  | 29    | 265   | 87%
> 
> My 1GHz CPU tests (cache cleared):
> Method                  | alloc | read  | vs. (%)
> -------------------------------------------------
> udmabuf buffer read     | 552   | 2067  | 198%
> udmabuf direct read     | 540   | 627   | 60%
> udmabuf buffer sendfile | 497   | 1045  | 100%
> udmabuf direct sendfile | 527   | 2330  | 223%
> dmabuf buffer read      | 40    | 1111  | 106%
> patch direct read       | 44    | 310   | 30%
> 
> Test observations align with expectations:
> 1. udmabuf buffer read requires slow CPU copies
> 2. udmabuf direct read achieves zero-copy but has page retrieval
>    latency from vaddr
> 3. udmabuf buffer sendfile suffers CPU copy overhead
> 4. udmabuf direct sendfile combines CPU copies with frequent DMA
>    operations due to small pipe buffers
> 5. dmabuf buffer read also requires CPU copies
> 6. My direct read patch enables zero-copy with better performance
>    on low-power CPUs
> 7. udmabuf creation time remains problematic (as you’ve noted).
> 
>>> My focus is enabling dmabuf direct I/O for [regular file] <--DMA-->
>>> [dmabuf] zero-copy.
>>
>> Yeah and that focus is wrong. You need to work on a general solution to the
>> issue and not specific to your problem.
>>
>>> Any API achieving this would work. Are there other uAPIs you think
>>> could help? Could you recommend experts who might offer suggestions?
>>
>> Well once more: Either work on sendfile or copy_file_range or eventually
>> splice to make it what you want to do.
>>
>> When that is done we can discuss with the VFS people if that approach is
>> feasible.
>>
>> But just bypassing the VFS review by implementing a DMA-buf specific IOCTL
>> is a NO-GO. That is clearly not something you can do in any way.
> [wangtao] The issue is that only dmabuf lacks Direct I/O zero-copy support. Tmpfs/shmem
> already work with Direct I/O zero-copy. As explained, existing syscalls or
> generic methods can't enable dmabuf direct I/O zero-copy, which is why I
> propose adding an IOCTL command.
> 
> I respect your perspective. Could you clarify specific technical aspects,
> code requirements, or implementation principles for modifying sendfile()
> or copy_file_range()? This would help advance our discussion.
> 
> Thank you for engaging in this dialogue.
> 
>>
>> Regards,
>> Christian.
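
The "vs. (%)" column in the two quoted CPU tables appears to be each read time divided by the "udmabuf buffer sendfile" read time, the row listed as 100%. A minimal Python sketch that recomputes the column from the quoted numbers, assuming that baseline and assuming the times are in ms (those two tables do not state a unit). The names READ_TIMES and relative_percent are illustrative only:

```python
# Read times copied from the quoted tables; the unit (ms) is an assumption.
READ_TIMES = {
    "3GHz": {
        "udmabuf buffer read": 546,
        "udmabuf direct read": 300,
        "udmabuf buffer sendfile": 303,
        "udmabuf direct sendfile": 912,
        "dmabuf buffer read": 362,
        "patch direct read": 265,
    },
    "1GHz": {
        "udmabuf buffer read": 2067,
        "udmabuf direct read": 627,
        "udmabuf buffer sendfile": 1045,
        "udmabuf direct sendfile": 2330,
        "dmabuf buffer read": 1111,
        "patch direct read": 310,
    },
}

# Assumption: the row shown as 100% in the tables is the baseline.
BASELINE = "udmabuf buffer sendfile"


def relative_percent(times, baseline=BASELINE):
    """Return each read time as a rounded percentage of the baseline."""
    base = times[baseline]
    return {name: round(100 * t / base) for name, t in times.items()}


if __name__ == "__main__":
    for cpu, times in READ_TIMES.items():
        print(cpu)
        for name, pct in relative_percent(times).items():
            print(f"  {name:<24} {pct}%")
```

Under these assumptions the recomputed values match the quoted column, for example 87% and 30% for the patched direct read on the 3GHz and 1GHz CPUs respectively.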

