Message-ID: <c8478a67-b14f-485c-a239-8967f1e40600@vivo.com>
Date: Mon, 15 Jul 2024 17:07:37 +0800
From: Lei Liu <liulei.rjpt@...o.com>
To: Christian König <christian.koenig@....com>,
"T.J. Mercier" <tjmercier@...gle.com>
Cc: Sumit Semwal <sumit.semwal@...aro.org>,
Benjamin Gaignard <benjamin.gaignard@...labora.com>,
Brian Starkey <Brian.Starkey@....com>, John Stultz <jstultz@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>, Matthew Wilcox <willy@...radead.org>,
Muhammad Usama Anjum <usama.anjum@...labora.com>,
Andrei Vagin <avagin@...gle.com>, Ryan Roberts <ryan.roberts@....com>,
Kefeng Wang <wangkefeng.wang@...wei.com>, linux-media@...r.kernel.org,
dri-devel@...ts.freedesktop.org, linaro-mm-sig@...ts.linaro.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, Daniel Vetter <daniel@...ll.ch>,
"Vetter, Daniel" <daniel.vetter@...el.com>, opensource.kernel@...o.com,
quic_sukadev@...cinc.com, quic_cgoldswo@...cinc.com,
Akilesh Kailash <akailash@...gle.com>
Subject: Re: [PATCH 0/2] Support direct I/O read and write for memory
allocated by dmabuf
On 2024/7/11 22:25, Christian König wrote:
> Am 10.07.24 um 18:34 schrieb T.J. Mercier:
>> On Wed, Jul 10, 2024 at 8:08 AM Lei Liu <liulei.rjpt@...o.com> wrote:
>>> on 2024/7/10 22:48, Christian König wrote:
>>>> Am 10.07.24 um 16:35 schrieb Lei Liu:
>>>>> on 2024/7/10 22:14, Christian König wrote:
>>>>>> Am 10.07.24 um 15:57 schrieb Lei Liu:
>>>>>>> Use vm_insert_page to establish a mapping for the memory allocated
>>>>>>> by dmabuf, thus supporting direct I/O read and write; and fix the
>>>>>>> issue of incorrect memory statistics after mapping dmabuf memory.
>>>>>> Well big NAK to that! Direct I/O is intentionally disabled on
>>>>>> DMA-bufs.
>>>>> Hello! Could you explain why direct_io is disabled on DMABUF? Is
>>>>> there any historical reason for this?
>>>> It's basically one of the most fundamental design decisions of DMA-Buf.
>>>> The attachment/map/fence model DMA-buf uses is not really compatible
>>>> with direct I/O on the underlying pages.
>>> Thank you! Is there any related documentation on this? I would like to
>>> understand and learn more about the fundamental reasons for the lack of
>>> support.
>> Hi Lei and Christian,
>>
>> This is now the third request I've seen from three different companies
>> who are interested in this,
>
> Yeah, completely agree. This is a recurring pattern :)
>
> Maybe we should document the preferred solution for that.
>
>> but the others are not for reasons of read
>> performance that you mention in the commit message on your first
>> patch. Someone else at Google ran a comparison between a normal read()
>> and a direct I/O read() into a preallocated user buffer and found that
>> with large readahead (16 MB) the throughput can actually be slightly
>> higher than direct I/O. If you have concerns about read performance,
>> have you tried increasing the readahead size?
>>
>> The other motivation is to load a gajillion byte file from disk into a
>> dmabuf without evicting the entire contents of pagecache while doing
>> so. Something like this (which does not currently work because read()
>> tries to GUP on the dmabuf memory as you mention):
>>
>> #include <cerrno>
>> #include <cstdio>
>> #include <cstring>
>> #include <fcntl.h>
>> #include <linux/dma-buf.h>
>> #include <linux/dma-heap.h>
>> #include <sys/ioctl.h>
>> #include <sys/mman.h>
>> #include <sys/stat.h>
>> #include <unistd.h>
>>
>> /* Allocate a dmabuf of len bytes from the given heap; returns its fd. */
>> static int dmabuf_heap_alloc(int heap_fd, size_t len)
>> {
>>     struct dma_heap_allocation_data data = {
>>         .len = len,
>>         .fd = 0,
>>         .fd_flags = O_RDWR | O_CLOEXEC,
>>         .heap_flags = 0,
>>     };
>>     int ret = ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &data);
>>     if (ret < 0)
>>         return ret;
>>     return data.fd;
>> }
>>
>> int main(int, char **argv)
>> {
>>     const char *file_path = argv[1];
>>     printf("File: %s\n", file_path);
>>     int file_fd = open(file_path, O_RDONLY | O_DIRECT);
>>
>>     struct stat st;
>>     stat(file_path, &st);
>>     ssize_t file_size = st.st_size;
>>     /* Round up to a multiple of 4096 bytes. */
>>     ssize_t aligned_size = (file_size + 4095) & ~4095;
>>
>>     printf("File size: %zd Aligned size: %zd\n", file_size,
>>            aligned_size);
>>     int heap_fd = open("/dev/dma_heap/system", O_RDONLY);
>>     int dmabuf_fd = dmabuf_heap_alloc(heap_fd, aligned_size);
>>
>>     void *vm = mmap(nullptr, aligned_size, PROT_READ | PROT_WRITE,
>>                     MAP_SHARED, dmabuf_fd, 0);
>>     printf("VM at 0x%lx\n", (unsigned long)vm);
>>
>>     /* Begin CPU access to the buffer. */
>>     dma_buf_sync sync_flags { DMA_BUF_SYNC_START |
>>                               DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE };
>>     ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync_flags);
>>
>>     /* Direct I/O read into the mapping; this is the step that fails. */
>>     ssize_t rc = read(file_fd, vm, file_size);
>>     printf("Read: %zd %s\n", rc, rc < 0 ? strerror(errno) : "");
>>
>>     /* End CPU access to the buffer. */
>>     sync_flags.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ |
>>                        DMA_BUF_SYNC_WRITE;
>>     ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync_flags);
>> }
>>
>> Or replace the mmap() + read() with sendfile().
>
> Or copy_file_range(). That's pretty much exactly what I suggested on
> the other mail thread around that topic as well.
Thank you for your suggestion. I will study the method you suggested
with Yang. Using copy_file_range() might be a good approach.
Regards,
Lei Liu.
>
>> So I would also like to see the above code (or something else similar)
>> be able to work and I understand some of the reasons why it currently
>> does not, but I don't understand why we should actively prevent this
>> type of behavior entirely.
>
> +1
>
> Regards,
> Christian.
>
>>
>> Best,
>> T.J.
>>
>>>>>> We already discussed enforcing that in the DMA-buf framework and
>>>>>> this patch probably means that we should really do that.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>> Thank you for your response. As large AI models are increasingly
>>>>> deployed on edge devices, we urgently need direct_io support on
>>>>> DMABUF to read some very large files. Do you have any new solutions
>>>>> or plans for this?
>>>> We have seen similar projects over the years and all of those turned
>>>> out to be complete shipwrecks.
>>>>
>>>> There is currently a patch set under discussion to give the network
>>>> subsystem DMA-buf support. If you are interested in network direct
>>>> I/O, that could help.
>>> Is there a link to that patch set?
>>>
>>>> In addition to that, a lot of GPU drivers support userptr usage, e.g.
>>>> to import malloced memory into the GPU driver. You can then also do
>>>> direct I/O on that malloced memory and the kernel will enforce correct
>>>> handling with the GPU driver through MMU notifiers.
>>>>
>>>> But as far as I know a general DMA-buf based solution isn't possible.
>>> 1. The reason we need to use DMABUF memory here is that we need to
>>> share memory between the CPU and APU. Currently, only DMABUF memory is
>>> suitable for this purpose. Additionally, we need to read very large
>>> files.
>>>
>>> 2. Are there any other solutions for this? Also, do you have any plans
>>> to support direct_io for DMABUF memory in the future?
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Regards,
>>>>> Lei Liu.
>>>>>
>>>>>>> Lei Liu (2):
>>>>>>>   mm: dmabuf_direct_io: Support direct_io for memory allocated by
>>>>>>>     dmabuf
>>>>>>>   mm: dmabuf_direct_io: Fix memory statistics error for dmabuf
>>>>>>>     allocated memory with direct_io support
>>>>>>>
>>>>>>>  drivers/dma-buf/heaps/system_heap.c |  5 +++--
>>>>>>>  fs/proc/task_mmu.c                  |  8 +++++++-
>>>>>>>  include/linux/mm.h                  |  1 +
>>>>>>>  mm/memory.c                         | 15 ++++++++++-----
>>>>>>>  mm/rmap.c                           |  9 +++++----
>>>>>>>  5 files changed, 26 insertions(+), 12 deletions(-)
>>>>>>>
>