Message-ID: <10607b3c-03e7-4a78-ad43-05e11408ef00@vivo.com>
Date: Fri, 12 Jul 2024 19:12:02 +0800
From: Huan Yang <link@...o.com>
To: Christian König <christian.koenig@....com>,
Sumit Semwal <sumit.semwal@...aro.org>,
Benjamin Gaignard <benjamin.gaignard@...labora.com>,
Brian Starkey <Brian.Starkey@....com>, John Stultz <jstultz@...gle.com>,
"T.J. Mercier" <tjmercier@...gle.com>, linux-media@...r.kernel.org,
dri-devel@...ts.freedesktop.org, linaro-mm-sig@...ts.linaro.org,
linux-kernel@...r.kernel.org
Cc: opensource.kernel@...o.com
Subject: Re: [PATCH 1/2] dma-buf: heaps: DMA_HEAP_IOCTL_ALLOC_READ_FILE
framework
On 2024/7/12 18:59, Christian König wrote:
> On 12.07.24 09:52, Huan Yang wrote:
>>
>> On 2024/7/12 15:41, Christian König wrote:
>>> On 12.07.24 09:29, Huan Yang wrote:
>>>> Hi Christian,
>>>>
>>>> On 2024/7/12 15:10, Christian König wrote:
>>>>> On 12.07.24 04:14, Huan Yang wrote:
>>>>>> On 2024/7/12 9:59, Huan Yang wrote:
>>>>>>> Hi Christian,
>>>>>>>
>>>>>>> On 2024/7/11 19:39, Christian König wrote:
>>>>>>>> On 11.07.24 11:18, Huan Yang wrote:
>>>>>>>>> Hi Christian,
>>>>>>>>>
>>>>>>>>> Thanks for your reply.
>>>>>>>>>
>>>>>>>>> On 2024/7/11 17:00, Christian König wrote:
>>>>>>>>>> On 11.07.24 09:42, Huan Yang wrote:
>>>>>>>>>>> Some users need to load a file into a dma-buf; the current
>>>>>>>>>>> way is:
>>>>>>>>>>> 1. allocate a dma-buf, get dma-buf fd
>>>>>>>>>>> 2. mmap dma-buf fd into vaddr
>>>>>>>>>>> 3. read(file_fd, vaddr, fsz)
>>>>>>>>>>> This is too heavy once fsz reaches the GB range.
>>>>>>>>>>
>>>>>>>>>> You need to describe a bit more why that is too heavy. I can
>>>>>>>>>> only assume you need to save memory bandwidth and avoid the
>>>>>>>>>> extra copy with the CPU.
>>>>>>>>>
>>>>>>>>> Sorry for the oversimplified explanation. But, yes, you're
>>>>>>>>> right, we want to avoid this.
>>>>>>>>>
>>>>>>>>> As we are dealing with embedded devices, the available memory
>>>>>>>>> and computing power for users are usually limited. (The maximum
>>>>>>>>> available memory is currently 24GB, typically ranging from
>>>>>>>>> 8-12GB.)
>>>>>>>>>
>>>>>>>>> Also, CPU computing power is usually in short supply,
>>>>>>>>> due to limited battery capacity and limited heat dissipation
>>>>>>>>> capabilities.
>>>>>>>>>
>>>>>>>>> So, we hope to avoid ineffective paths as much as possible.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> This patch implements a feature called
>>>>>>>>>>> DMA_HEAP_IOCTL_ALLOC_READ_FILE.
>>>>>>>>>>> The user offers a file_fd for the file to load into the
>>>>>>>>>>> dma-buf; the ioctl then promises that once you get the dma-buf
>>>>>>>>>>> fd back, it already contains the file content.
>>>>>>>>>>
>>>>>>>>>> Interesting idea, that has at least more potential than
>>>>>>>>>> trying to enable direct I/O on mmap()ed DMA-bufs.
>>>>>>>>>>
>>>>>>>>>> The approach with the new IOCTL might not work because it is
>>>>>>>>>> a very specialized use case.
>>>>>>>>>
>>>>>>>>> Thank you for your advice. Maybe the "read file" behavior can
>>>>>>>>> be attached to an existing allocation?
>>>>>>>>
>>>>>>>> The point is there are already system calls to do something
>>>>>>>> like that.
>>>>>>>>
>>>>>>>> See copy_file_range()
>>>>>>>> (https://man7.org/linux/man-pages/man2/copy_file_range.2.html)
>>>>>>>> and send_file()
>>>>>>>> (https://man7.org/linux/man-pages/man2/sendfile.2.html).
>>>>>>>
>>>>>>> That's helpful to learn, thanks.
>>>>>>>
>>>>>>> If the goal is only direct I/O into a DMA-BUF,
>>>>>>> copy_file_range/sendfile may be enough to achieve that.
>>>>>>>
>>>>>>> However, my patchset also aims to achieve parallel copying of
>>>>>>> file contents while allocating the DMA-BUF, which is something
>>>>>>> that the current set of calls may not be able to accomplish.
>>>>>
>>>>> And exactly that is a no-go. Use the existing IOCTLs and system
>>>>> calls instead; they should have similar performance when done right.
>>>>
>>>> Get it, but in my testing, even without memory pressure, it
>>>> takes about 60ms to allocate a 3GB DMA-BUF. When there is
>>>> significant memory pressure, the allocation time for a 3GB
>>>
>>> Well exactly that doesn't make sense. Even if you read the content
>>> of the DMA-buf from a file you still need to allocate it first.
>>
>> Yes, it needs to be allocated first, but in kernel space there is no
>> need to wait until all memory is allocated before triggering the file
>> load.
>
> That doesn't really make sense. Allocating a large bunch of memory is
> more efficient than allocating less multiple times because of cache
> locality for example.
No, this patchset does not change the allocation behavior; the heap keeps
allocating as before. But each time the allocated pages add up to a batch,
we map that batch of already-allocated pages into the vmalloc area and
then trigger the IO for it.
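To make the flow concrete, a heap's allocate_read_file() would drive the
helpers in this patch roughly as below. This is only a sketch:
alloc_one_block() and export_the_buffer() are placeholders for the heap's
own allocator and exporter, and error handling is trimmed.

```c
/*
 * Sketch only: alloc_one_block() and export_the_buffer() stand in for the
 * heap's own page allocator and dma-buf exporter.
 */
static struct dma_buf *example_allocate_read_file(struct dma_heap *heap,
					struct dma_heap_file *heap_file,
					u32 fd_flags, u64 heap_flags)
{
	size_t size = PAGE_ALIGN(dma_heap_file_size(heap_file));
	struct dma_heap_file_task *ftask;
	struct dma_buf *dmabuf;
	size_t done = 0;

	ftask = dma_heap_declare_file_read(heap_file);
	if (!ftask)
		return ERR_PTR(-ENOMEM);

	while (done < size) {
		/* heap-specific page allocation, unchanged from today */
		struct page *page = alloc_one_block(heap, &done);

		/* cache the page; once a full batch is collected, queue IO */
		if (dma_heap_prepare_file_read(ftask, page))
			dma_heap_submit_file_read(ftask);
	}

	/* heap-specific dma-buf export, unchanged from today */
	dmabuf = export_the_buffer(heap, fd_flags, heap_flags);

	/* wait for the kthread to drain the queued reads, check the result */
	if (dma_heap_destroy_file_read(ftask)) {
		dma_buf_put(dmabuf);
		return ERR_PTR(-EIO);
	}

	return dmabuf;
}
```

The point is that the file read runs on the kthread while the heap keeps
allocating the next batches.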
>
> You could of course hide latency caused by operations to reduce memory
> pressure when you have a specific use case, but you don't need to use
> an in kernel implementation for that.
>
> Question is do you have clear on allocation or clear on free enabled?
We clear on free, so allocating and loading the file at the same time is OK.
>
>> This patchset works in batches (default 128MB): every time 128MB is
>> allocated, it is vmap'ed to get a vaddr, and then the corresponding
>> part of the file is read into that vaddr.
>
> Again that sounds really not ideal to me. Creating the vmap alone is
> completely unnecessary overhead.
Hmm, maybe you can give it a try? I also included the test program in the
cover letter.
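For reference, the core of that test program is roughly the following (a
sketch only; the heap name and file are just the ones used in the test
commands quoted further down, and error handling is dropped):

```c
/*
 * Userspace side of the test: one ioctl allocates the dma-buf and reads
 * the file into it. Needs the uapi header added by this patch.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dma-heap.h>

static int alloc_and_read(const char *path, int use_direct)
{
	struct dma_heap_allocation_file_data data;
	int file_fd, heap_fd;

	file_fd = open(path, O_RDONLY | (use_direct ? O_DIRECT : 0));
	heap_fd = open("/dev/dma_heap/mtk_mm-uncached", O_RDWR);

	memset(&data, 0, sizeof(data));
	data.file_fd = file_fd;
	data.fd_flags = O_RDWR | O_CLOEXEC;
	data.batch = 0;		/* 0 means the default 128MB batch */

	if (ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC_AND_READ, &data) < 0)
		return -1;

	/* data.fd is now a dma-buf fd whose content matches the file */
	return data.fd;
}
```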
>
>>> So the question is why should reading and allocating it at the same
>>> time be better in any way?
>>
>> Memory pressure will trigger reclaim, which has to be waited for (ms).
>> Assume I have already allocated 512MB (of the 3GB needed) without
>> entering the slowpath:
>>
>> even if I have to enter the slowpath to allocate the remaining memory,
>> the already-allocated memory is busy loading the file content. (This
>> saves time compared to finishing the allocation first and only then
>> reading.)
>>
>> The time difference between them can be expressed as:
>>
>> 1. allocate dmabuf time + file load time -- original
>>
>> 2. first batch prepare time + max(file load time, remaining dma-buf
>> allocate time) + last batch prepare time -- new
>>
>> When the file reaches the gigabyte level, the difference between the
>> two is clearly observable.
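(To put illustrative numbers on this - rough figures in the spirit of the
3GB case above, not new measurements: if the 3GB allocation costs around
1s under pressure and the file read around 2.5s, path 1 needs about 3.5s,
while path 2 needs the first-batch time + max(2.5s, ~1s) + the last-batch
time, i.e. roughly 2.6s.)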
>
> I have strong doubts about that. The method you describe above is
> actually really inefficient.
Also, maybe you can test it yourself? dd a large file, then compare.
I tested all of this on my phone and on an Arch Linux PC; both show some
improvement.
>
> First of all you create a memory mapping just to load data, that is
> superfluous and TLB flushes are usually extremely costly. Both for
> userspace as well as kernel.
>
> I strongly suggest to try to use copy_file_range() instead. But could
> be that copy_file_range() doesn't even work right now because of some
> restrictions, never tried that on a DMA-buf.
I agree, I'm starting to look into this.
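Roughly, the first thing I plan to try is the plain userspace route below.
Whether copy_file_range() accepts a dma-buf fd at all today is exactly the
open question, so this is only the experiment, not a claim that it works:

```c
/*
 * Experiment sketch: allocate via the existing ioctl, then ask the kernel
 * to copy the file into the dma-buf fd. This may well fail (e.g. EINVAL)
 * until dma-buf implements the copy_file_range callback.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/dma-heap.h>

static int fill_dmabuf_from_file(int heap_fd, const char *path)
{
	struct dma_heap_allocation_data data = { 0 };
	struct stat st;
	off_t off_in = 0, off_out = 0;
	int file_fd = open(path, O_RDONLY | O_DIRECT);

	fstat(file_fd, &st);
	data.len = st.st_size;
	data.fd_flags = O_RDWR | O_CLOEXEC;
	if (ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &data) < 0)
		return -1;

	while (off_in < st.st_size) {
		ssize_t n = copy_file_range(file_fd, &off_in, data.fd,
					    &off_out, st.st_size - off_in, 0);
		if (n <= 0)
			return -1;
	}
	return data.fd;
}
```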
>
> When that works as far as I can see what could still be saved on
> overhead is the following:
>
> 1. Clearing of memory on allocation. That could potentially be done
> with delayed allocation or clear on free instead.
>
> 2. CPU copy between the I/O target buffer and the DMA-buf backing
> pages. In theory it should be possible to avoid that by implementing
> the copy_file_range() callback, but I'm not 100% sure.
Everything you mentioned above makes sense. :)
>
> Regards,
> Christian.
>
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>>
>>>> DMA-BUF can increase to 300ms-1s. (The above test times can also
>>>> demonstrate the difference.)
>>>>
>>>> But talk is cheap; I agree to research implementing this with the
>>>> existing interfaces and to run a test.
>>>>
>>>> I'll share the results once I'm done.
>>>>
>>>> Thanks for your suggestions.
>>>>
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>>
>>>>>> See the cover letter; below is the comparison between the normal
>>>>>> path and this IOCTL under memory pressure. Even with buffered I/O,
>>>>>> this ioctl gets about a 50% improvement from the parallelism.
>>>>>>
>>>>>> dd a 3GB file for test, 12G RAM phone, UFS4.0, stressapptest 4G
>>>>>> memory pressure.
>>>>>>
>>>>>> 1. original
>>>>>> ```shell
>>>>>> # create a model file
>>>>>> dd if=/dev/zero of=./model.txt bs=1M count=3072
>>>>>> # drop page cache
>>>>>> echo 3 > /proc/sys/vm/drop_caches
>>>>>> ./dmabuf-heap-file-read mtk_mm-uncached normal
>>>>>>
>>>>>>> result is total cost 13087213847ns
>>>>>>
>>>>>> ```
>>>>>>
>>>>>> 2.DMA_HEAP_IOCTL_ALLOC_AND_READ O_DIRECT
>>>>>> ```shell
>>>>>> # create a model file
>>>>>> dd if=/dev/zero of=./model.txt bs=1M count=3072
>>>>>> # drop page cache
>>>>>> echo 3 > /proc/sys/vm/drop_caches
>>>>>> ./dmabuf-heap-file-read mtk_mm-uncached direct_io
>>>>>>
>>>>>>> result is total cost 2902386846ns
>>>>>>
>>>>>> # use direct_io_check to verify the content matches the file.
>>>>>> ```
>>>>>>
>>>>>> 3. DMA_HEAP_IOCTL_ALLOC_AND_READ BUFFER I/O
>>>>>> ```shell
>>>>>> # create a model file
>>>>>> dd if=/dev/zero of=./model.txt bs=1M count=3072
>>>>>> # drop page cache
>>>>>> echo 3 > /proc/sys/vm/drop_caches
>>>>>> ./dmabuf-heap-file-read mtk_mm-uncached normal_io
>>>>>>
>>>>>>> result is total cost 5735579385ns
>>>>>>
>>>>>> ```
>>>>>>
>>>>>>>
>>>>>>> Perhaps simply returning the DMA-BUF file descriptor and then
>>>>>>> implementing copy_file_range, while populating the memory and
>>>>>>> content during the copy process, could achieve this? At present,
>>>>>>> it seems that it will be quite complex - we need to ensure that
>>>>>>> operations on the returned DMA-BUF file descriptor (mmap, vmap,
>>>>>>> attach, and so on) fail while the memory is not yet filled.
>>>>>>>
>>>>>>>>
>>>>>>>> What we probably could do is to internally optimize those.
>>>>>>>>
>>>>>>>>> I am currently creating a new ioctl to make it explicit to the
>>>>>>>>> user that memory is being allocated and read, and I am also
>>>>>>>>> unsure whether it is appropriate to add additional parameters to
>>>>>>>>> the existing allocate behavior.
>>>>>>>>>
>>>>>>>>> Please give me more suggestions. Thanks.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> But IIRC there was a copy_file_range callback in the
>>>>>>>>>> file_operations structure you could use for that. I'm just
>>>>>>>>>> not sure when and how that's used with the copy_file_range()
>>>>>>>>>> system call.
>>>>>>>>>
>>>>>>>>> Sorry, I'm not familiar with this, but I will look into it.
>>>>>>>>> However, this callback is not currently implemented when the
>>>>>>>>> dma_buf file is exported, which means that I would need to
>>>>>>>>> implement the callback for it?
>>>>>>>>
>>>>>>>> If I'm not completely mistaken, the copy_file_range, splice_read
>>>>>>>> and splice_write callbacks on struct file_operations
>>>>>>>> (https://elixir.bootlin.com/linux/v6.10-rc7/source/include/linux/fs.h#L1999)
>>>>>>>> can be used to implement what you want to do.
>>>>>>> Yes.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Note that file_fd is opened however the user chooses, so both
>>>>>>>>>>> buffered
>>>>>>>>>>> I/O and direct I/O are supported.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Huan Yang <link@...o.com>
>>>>>>>>>>> ---
>>>>>>>>>>> drivers/dma-buf/dma-heap.c | 525
>>>>>>>>>>> +++++++++++++++++++++++++++++++++-
>>>>>>>>>>> include/linux/dma-heap.h | 57 +++-
>>>>>>>>>>> include/uapi/linux/dma-heap.h | 32 +++
>>>>>>>>>>> 3 files changed, 611 insertions(+), 3 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/dma-buf/dma-heap.c
>>>>>>>>>>> b/drivers/dma-buf/dma-heap.c
>>>>>>>>>>> index 2298ca5e112e..abe17281adb8 100644
>>>>>>>>>>> --- a/drivers/dma-buf/dma-heap.c
>>>>>>>>>>> +++ b/drivers/dma-buf/dma-heap.c
>>>>>>>>>>> @@ -15,9 +15,11 @@
>>>>>>>>>>> #include <linux/list.h>
>>>>>>>>>>> #include <linux/slab.h>
>>>>>>>>>>> #include <linux/nospec.h>
>>>>>>>>>>> +#include <linux/highmem.h>
>>>>>>>>>>> #include <linux/uaccess.h>
>>>>>>>>>>> #include <linux/syscalls.h>
>>>>>>>>>>> #include <linux/dma-heap.h>
>>>>>>>>>>> +#include <linux/vmalloc.h>
>>>>>>>>>>> #include <uapi/linux/dma-heap.h>
>>>>>>>>>>> #define DEVNAME "dma_heap"
>>>>>>>>>>> @@ -43,12 +45,462 @@ struct dma_heap {
>>>>>>>>>>> struct cdev heap_cdev;
>>>>>>>>>>> };
>>>>>>>>>>> +/**
>>>>>>>>>>> + * struct dma_heap_file - wrap the file, read task for
>>>>>>>>>>> dma_heap allocate use.
>>>>>>>>>>> + * @file: file to read from.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @cred: copy of the caller's cred, used by the kthread
>>>>>>>>>>> for the read.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @max_batch: maximum batch size to read; once a batch
>>>>>>>>>>> is collected, a read is
>>>>>>>>>>> + * triggered. Default 128MB, must be below the
>>>>>>>>>>> file size.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @fsz: file size.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @direct: use direct IO?
>>>>>>>>>>> + */
>>>>>>>>>>> +struct dma_heap_file {
>>>>>>>>>>> + struct file *file;
>>>>>>>>>>> + struct cred *cred;
>>>>>>>>>>> + size_t max_batch;
>>>>>>>>>>> + size_t fsz;
>>>>>>>>>>> + bool direct;
>>>>>>>>>>> +};
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * struct dma_heap_file_work - represents a dma_heap file
>>>>>>>>>>> read real work.
>>>>>>>>>>> + * @vaddr: contiguous virtual address allocated by vmap,
>>>>>>>>>>> needed for the file read.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @start_size: file read start offset, same to
>>>>>>>>>>> @dma_heap_file_task->roffset.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @need_size: file read need size, same to
>>>>>>>>>>> @dma_heap_file_task->rsize.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @heap_file: file wrapper.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @list: child node of @dma_heap_file_control->works.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @refp: same @dma_heap_file_task->ref, if end of
>>>>>>>>>>> read, put ref.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @failp: set to true if any work's IO failed;
>>>>>>>>>>> points to @dma_heap_file_task->fail.
>>>>>>>>>>> + */
>>>>>>>>>>> +struct dma_heap_file_work {
>>>>>>>>>>> + void *vaddr;
>>>>>>>>>>> + ssize_t start_size;
>>>>>>>>>>> + ssize_t need_size;
>>>>>>>>>>> + struct dma_heap_file *heap_file;
>>>>>>>>>>> + struct list_head list;
>>>>>>>>>>> + atomic_t *refp;
>>>>>>>>>>> + bool *failp;
>>>>>>>>>>> +};
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * struct dma_heap_file_task - represents a dma_heap file
>>>>>>>>>>> read process
>>>>>>>>>>> + * @ref: current file work counter, if zero,
>>>>>>>>>>> allocate and read
>>>>>>>>>>> + * done.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @roffset: last read offset, current prepared
>>>>>>>>>>> work' begin file
>>>>>>>>>>> + * start offset.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @rsize: current allocated page size use to read,
>>>>>>>>>>> if reach rbatch,
>>>>>>>>>>> + * trigger commit.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @rbatch: current prepared work's batch, below
>>>>>>>>>>> @dma_heap_file's
>>>>>>>>>>> + * batch.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @heap_file: current dma_heap_file
>>>>>>>>>>> + *
>>>>>>>>>>> + * @parray: used for vmap, size is @dma_heap_file's
>>>>>>>>>>> batch's number
>>>>>>>>>>> + * pages.(this is maximum). Due to single thread
>>>>>>>>>>> file read,
>>>>>>>>>>> + * one page array reuse each work prepare is OK.
>>>>>>>>>>> + * Each index in parray is PAGE_SIZE.(vmap need)
>>>>>>>>>>> + *
>>>>>>>>>>> + * @pindex: current allocated page filled in
>>>>>>>>>>> @parray's index.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @fail: any work failed when file read?
>>>>>>>>>>> + *
>>>>>>>>>>> + * dma_heap_file_task is the production of file read, will
>>>>>>>>>>> prepare each work
>>>>>>>>>>> + * during allocate dma_buf pages, if match current batch,
>>>>>>>>>>> then trigger commit
>>>>>>>>>>> + * and prepare next work. After all batch queued, user
>>>>>>>>>>> going on prepare dma_buf
>>>>>>>>>>> + * and so on, but before return dma_buf fd, need to wait
>>>>>>>>>>> file read end and
>>>>>>>>>>> + * check read result.
>>>>>>>>>>> + */
>>>>>>>>>>> +struct dma_heap_file_task {
>>>>>>>>>>> + atomic_t ref;
>>>>>>>>>>> + size_t roffset;
>>>>>>>>>>> + size_t rsize;
>>>>>>>>>>> + size_t rbatch;
>>>>>>>>>>> + struct dma_heap_file *heap_file;
>>>>>>>>>>> + struct page **parray;
>>>>>>>>>>> + unsigned int pindex;
>>>>>>>>>>> + bool fail;
>>>>>>>>>>> +};
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * struct dma_heap_file_control - global control of
>>>>>>>>>>> dma_heap file read.
>>>>>>>>>>> + * @works: @dma_heap_file_work's list head.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @lock: only lock for @works.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @threadwq: wait queue for @work_thread, if commit
>>>>>>>>>>> work, @work_thread
>>>>>>>>>>> + * wakes up and reads this work's file contents.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @workwq: used for main thread wait for file read
>>>>>>>>>>> end, if allocation
>>>>>>>>>>> + * end before file read. @dma_heap_file_task ref
>>>>>>>>>>> effect this.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @work_thread: file read kthread. the
>>>>>>>>>>> dma_heap_file_task work's consumer.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @heap_fwork_cachep: @dma_heap_file_work's cachep, it's
>>>>>>>>>>> alloc/free frequently.
>>>>>>>>>>> + *
>>>>>>>>>>> + * @nr_work: global number of how many work committed.
>>>>>>>>>>> + */
>>>>>>>>>>> +struct dma_heap_file_control {
>>>>>>>>>>> + struct list_head works;
>>>>>>>>>>> + spinlock_t lock;
>>>>>>>>>>> + wait_queue_head_t threadwq;
>>>>>>>>>>> + wait_queue_head_t workwq;
>>>>>>>>>>> + struct task_struct *work_thread;
>>>>>>>>>>> + struct kmem_cache *heap_fwork_cachep;
>>>>>>>>>>> + atomic_t nr_work;
>>>>>>>>>>> +};
>>>>>>>>>>> +
>>>>>>>>>>> +static struct dma_heap_file_control *heap_fctl;
>>>>>>>>>>> static LIST_HEAD(heap_list);
>>>>>>>>>>> static DEFINE_MUTEX(heap_list_lock);
>>>>>>>>>>> static dev_t dma_heap_devt;
>>>>>>>>>>> static struct class *dma_heap_class;
>>>>>>>>>>> static DEFINE_XARRAY_ALLOC(dma_heap_minors);
>>>>>>>>>>> +/**
>>>>>>>>>>> + * map_pages_to_vaddr - map each scatter page into
>>>>>>>>>>> contiguous virtual address.
>>>>>>>>>>> + * @heap_ftask: prepared and need to commit's work.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Cached pages need to trigger a file read; this function
>>>>>>>>>>> maps each scatter page
>>>>>>>>>>> + * into a contiguous virtual address so that the file read
>>>>>>>>>>> can use it easily.
>>>>>>>>>>> + * Once we have the vaddr, the cached pages can be returned
>>>>>>>>>>> to the original user, so
>>>>>>>>>>> + * we will not affect dma-buf export even if the file read
>>>>>>>>>>> has not finished.
>>>>>>>>>>> + */
>>>>>>>>>>> +static void *map_pages_to_vaddr(struct dma_heap_file_task
>>>>>>>>>>> *heap_ftask)
>>>>>>>>>>> +{
>>>>>>>>>>> + return vmap(heap_ftask->parray, heap_ftask->pindex,
>>>>>>>>>>> VM_MAP,
>>>>>>>>>>> + PAGE_KERNEL);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +bool dma_heap_prepare_file_read(struct dma_heap_file_task
>>>>>>>>>>> *heap_ftask,
>>>>>>>>>>> + struct page *page)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct page **array = heap_ftask->parray;
>>>>>>>>>>> + int index = heap_ftask->pindex;
>>>>>>>>>>> + int num = compound_nr(page), i;
>>>>>>>>>>> + unsigned long sz = page_size(page);
>>>>>>>>>>> +
>>>>>>>>>>> + heap_ftask->rsize += sz;
>>>>>>>>>>> + for (i = 0; i < num; ++i)
>>>>>>>>>>> + array[index++] = &page[i];
>>>>>>>>>>> + heap_ftask->pindex = index;
>>>>>>>>>>> +
>>>>>>>>>>> + return heap_ftask->rsize >= heap_ftask->rbatch;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static struct dma_heap_file_work *
>>>>>>>>>>> +init_file_work(struct dma_heap_file_task *heap_ftask)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct dma_heap_file_work *heap_fwork;
>>>>>>>>>>> + struct dma_heap_file *heap_file = heap_ftask->heap_file;
>>>>>>>>>>> +
>>>>>>>>>>> + if (READ_ONCE(heap_ftask->fail))
>>>>>>>>>>> + return NULL;
>>>>>>>>>>> +
>>>>>>>>>>> + heap_fwork =
>>>>>>>>>>> kmem_cache_alloc(heap_fctl->heap_fwork_cachep, GFP_KERNEL);
>>>>>>>>>>> + if (unlikely(!heap_fwork))
>>>>>>>>>>> + return NULL;
>>>>>>>>>>> +
>>>>>>>>>>> + heap_fwork->vaddr = map_pages_to_vaddr(heap_ftask);
>>>>>>>>>>> + if (unlikely(!heap_fwork->vaddr)) {
>>>>>>>>>>> + kmem_cache_free(heap_fctl->heap_fwork_cachep, heap_fwork);
>>>>>>>>>>> + return NULL;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + heap_fwork->heap_file = heap_file;
>>>>>>>>>>> + heap_fwork->start_size = heap_ftask->roffset;
>>>>>>>>>>> + heap_fwork->need_size = heap_ftask->rsize;
>>>>>>>>>>> + heap_fwork->refp = &heap_ftask->ref;
>>>>>>>>>>> + heap_fwork->failp = &heap_ftask->fail;
>>>>>>>>>>> + atomic_inc(&heap_ftask->ref);
>>>>>>>>>>> + return heap_fwork;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static void destroy_file_work(struct dma_heap_file_work
>>>>>>>>>>> *heap_fwork)
>>>>>>>>>>> +{
>>>>>>>>>>> + vunmap(heap_fwork->vaddr);
>>>>>>>>>>> + atomic_dec(heap_fwork->refp);
>>>>>>>>>>> + wake_up(&heap_fctl->workwq);
>>>>>>>>>>> +
>>>>>>>>>>> + kmem_cache_free(heap_fctl->heap_fwork_cachep, heap_fwork);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +int dma_heap_submit_file_read(struct dma_heap_file_task
>>>>>>>>>>> *heap_ftask)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct dma_heap_file_work *heap_fwork =
>>>>>>>>>>> init_file_work(heap_ftask);
>>>>>>>>>>> + struct page *last = NULL;
>>>>>>>>>>> + struct dma_heap_file *heap_file = heap_ftask->heap_file;
>>>>>>>>>>> + size_t start = heap_ftask->roffset;
>>>>>>>>>>> + struct file *file = heap_file->file;
>>>>>>>>>>> + size_t fsz = heap_file->fsz;
>>>>>>>>>>> +
>>>>>>>>>>> + if (unlikely(!heap_fwork))
>>>>>>>>>>> + return -ENOMEM;
>>>>>>>>>>> +
>>>>>>>>>>> + /**
>>>>>>>>>>> + * If file size is not page aligned, direct io can't
>>>>>>>>>>> process the tail.
>>>>>>>>>>> + * So, when reaching the tail, handle the last page with a
>>>>>>>>>>> buffered read.
>>>>>>>>>>> + */
>>>>>>>>>>> + if (heap_file->direct && start + heap_ftask->rsize >
>>>>>>>>>>> fsz) {
>>>>>>>>>>> + heap_fwork->need_size -= PAGE_SIZE;
>>>>>>>>>>> + last = heap_ftask->parray[heap_ftask->pindex - 1];
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + spin_lock(&heap_fctl->lock);
>>>>>>>>>>> + list_add_tail(&heap_fwork->list, &heap_fctl->works);
>>>>>>>>>>> + spin_unlock(&heap_fctl->lock);
>>>>>>>>>>> + atomic_inc(&heap_fctl->nr_work);
>>>>>>>>>>> +
>>>>>>>>>>> + wake_up(&heap_fctl->threadwq);
>>>>>>>>>>> +
>>>>>>>>>>> + if (last) {
>>>>>>>>>>> + char *buf, *pathp;
>>>>>>>>>>> + ssize_t err;
>>>>>>>>>>> + void *buffer;
>>>>>>>>>>> +
>>>>>>>>>>> + buf = kmalloc(PATH_MAX, GFP_KERNEL);
>>>>>>>>>>> + if (unlikely(!buf))
>>>>>>>>>>> + return -ENOMEM;
>>>>>>>>>>> +
>>>>>>>>>>> + start = PAGE_ALIGN_DOWN(fsz);
>>>>>>>>>>> +
>>>>>>>>>>> + pathp = file_path(file, buf, PATH_MAX);
>>>>>>>>>>> + if (IS_ERR(pathp)) {
>>>>>>>>>>> + kfree(buf);
>>>>>>>>>>> + return PTR_ERR(pathp);
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + buffer = kmap_local_page(last); // use page's kaddr.
>>>>>>>>>>> + err = kernel_read_file_from_path(pathp, start,
>>>>>>>>>>> &buffer,
>>>>>>>>>>> + fsz - start, &fsz,
>>>>>>>>>>> + READING_POLICY);
>>>>>>>>>>> + kunmap_local(buffer);
>>>>>>>>>>> + kfree(buf);
>>>>>>>>>>> + if (err < 0) {
>>>>>>>>>>> + pr_err("failed to use buffer kernel_read_file
>>>>>>>>>>> %s, err=%ld, [%ld, %ld], f_sz=%ld\n",
>>>>>>>>>>> + pathp, err, start, fsz, fsz);
>>>>>>>>>>> +
>>>>>>>>>>> + return err;
>>>>>>>>>>> + }
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + heap_ftask->roffset += heap_ftask->rsize;
>>>>>>>>>>> + heap_ftask->rsize = 0;
>>>>>>>>>>> + heap_ftask->pindex = 0;
>>>>>>>>>>> + heap_ftask->rbatch = min_t(size_t,
>>>>>>>>>>> + PAGE_ALIGN(fsz) - heap_ftask->roffset,
>>>>>>>>>>> + heap_ftask->rbatch);
>>>>>>>>>>> + return 0;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +bool dma_heap_wait_for_file_read(struct dma_heap_file_task
>>>>>>>>>>> *heap_ftask)
>>>>>>>>>>> +{
>>>>>>>>>>> + wait_event_freezable(heap_fctl->workwq,
>>>>>>>>>>> + atomic_read(&heap_ftask->ref) == 0);
>>>>>>>>>>> + return heap_ftask->fail;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +bool dma_heap_destroy_file_read(struct dma_heap_file_task
>>>>>>>>>>> *heap_ftask)
>>>>>>>>>>> +{
>>>>>>>>>>> + bool fail;
>>>>>>>>>>> +
>>>>>>>>>>> + dma_heap_wait_for_file_read(heap_ftask);
>>>>>>>>>>> + fail = heap_ftask->fail;
>>>>>>>>>>> + kvfree(heap_ftask->parray);
>>>>>>>>>>> + kfree(heap_ftask);
>>>>>>>>>>> + return fail;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +struct dma_heap_file_task *
>>>>>>>>>>> +dma_heap_declare_file_read(struct dma_heap_file *heap_file)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct dma_heap_file_task *heap_ftask =
>>>>>>>>>>> + kzalloc(sizeof(*heap_ftask), GFP_KERNEL);
>>>>>>>>>>> + if (unlikely(!heap_ftask))
>>>>>>>>>>> + return NULL;
>>>>>>>>>>> +
>>>>>>>>>>> + /**
>>>>>>>>>>> + * Batch is the maximum size any prepared work will
>>>>>>>>>>> reach.
>>>>>>>>>>> + * So, directly allocating a page array of that size is OK.
>>>>>>>>>>> + */
>>>>>>>>>>> + heap_ftask->parray =
>>>>>>>>>>> kvmalloc_array(heap_file->max_batch >> PAGE_SHIFT,
>>>>>>>>>>> + sizeof(struct page *), GFP_KERNEL);
>>>>>>>>>>> + if (unlikely(!heap_ftask->parray))
>>>>>>>>>>> + goto put;
>>>>>>>>>>> +
>>>>>>>>>>> + heap_ftask->heap_file = heap_file;
>>>>>>>>>>> + heap_ftask->rbatch = heap_file->max_batch;
>>>>>>>>>>> + return heap_ftask;
>>>>>>>>>>> +put:
>>>>>>>>>>> + kfree(heap_ftask);
>>>>>>>>>>> + return NULL;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static void __work_this_io(struct dma_heap_file_work
>>>>>>>>>>> *heap_fwork)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct dma_heap_file *heap_file = heap_fwork->heap_file;
>>>>>>>>>>> + struct file *file = heap_file->file;
>>>>>>>>>>> + ssize_t start = heap_fwork->start_size;
>>>>>>>>>>> + ssize_t size = heap_fwork->need_size;
>>>>>>>>>>> + void *buffer = heap_fwork->vaddr;
>>>>>>>>>>> + const struct cred *old_cred;
>>>>>>>>>>> + ssize_t err;
>>>>>>>>>>> +
>>>>>>>>>>> + // use real task's cred to read this file.
>>>>>>>>>>> + old_cred = override_creds(heap_file->cred);
>>>>>>>>>>> + err = kernel_read_file(file, start, &buffer, size,
>>>>>>>>>>> &heap_file->fsz,
>>>>>>>>>>> + READING_POLICY);
>>>>>>>>>>> + if (err < 0) {
>>>>>>>>>>> + pr_err("use kernel_read_file, err=%ld, [%ld, %ld],
>>>>>>>>>>> f_sz=%ld\n",
>>>>>>>>>>> + err, start, (start + size), heap_file->fsz);
>>>>>>>>>>> + WRITE_ONCE(*heap_fwork->failp, true);
>>>>>>>>>>> + }
>>>>>>>>>>> + // revert to our own cred.
>>>>>>>>>>> + revert_creds(old_cred);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static int dma_heap_file_control_thread(void *data)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct dma_heap_file_control *heap_fctl =
>>>>>>>>>>> + (struct dma_heap_file_control *)data;
>>>>>>>>>>> + struct dma_heap_file_work *worker, *tmp;
>>>>>>>>>>> + int nr_work;
>>>>>>>>>>> +
>>>>>>>>>>> + LIST_HEAD(pages);
>>>>>>>>>>> + LIST_HEAD(workers);
>>>>>>>>>>> +
>>>>>>>>>>> + while (true) {
>>>>>>>>>>> + wait_event_freezable(heap_fctl->threadwq,
>>>>>>>>>>> + atomic_read(&heap_fctl->nr_work) > 0);
>>>>>>>>>>> +recheck:
>>>>>>>>>>> + spin_lock(&heap_fctl->lock);
>>>>>>>>>>> + list_splice_init(&heap_fctl->works, &workers);
>>>>>>>>>>> + spin_unlock(&heap_fctl->lock);
>>>>>>>>>>> +
>>>>>>>>>>> + if (unlikely(kthread_should_stop())) {
>>>>>>>>>>> + list_for_each_entry_safe(worker, tmp, &workers,
>>>>>>>>>>> list) {
>>>>>>>>>>> + list_del(&worker->list);
>>>>>>>>>>> + destroy_file_work(worker);
>>>>>>>>>>> + }
>>>>>>>>>>> + break;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + nr_work = 0;
>>>>>>>>>>> + list_for_each_entry_safe(worker, tmp, &workers,
>>>>>>>>>>> list) {
>>>>>>>>>>> + ++nr_work;
>>>>>>>>>>> + list_del(&worker->list);
>>>>>>>>>>> + __work_this_io(worker);
>>>>>>>>>>> +
>>>>>>>>>>> + destroy_file_work(worker);
>>>>>>>>>>> + }
>>>>>>>>>>> + atomic_sub(nr_work, &heap_fctl->nr_work);
>>>>>>>>>>> +
>>>>>>>>>>> + if (atomic_read(&heap_fctl->nr_work) > 0)
>>>>>>>>>>> + goto recheck;
>>>>>>>>>>> + }
>>>>>>>>>>> + return 0;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +size_t dma_heap_file_size(struct dma_heap_file *heap_file)
>>>>>>>>>>> +{
>>>>>>>>>>> + return heap_file->fsz;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static int prepare_dma_heap_file(struct dma_heap_file
>>>>>>>>>>> *heap_file, int file_fd,
>>>>>>>>>>> + size_t batch)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct file *file;
>>>>>>>>>>> + size_t fsz;
>>>>>>>>>>> + int ret;
>>>>>>>>>>> +
>>>>>>>>>>> + file = fget(file_fd);
>>>>>>>>>>> + if (!file)
>>>>>>>>>>> + return -EINVAL;
>>>>>>>>>>> +
>>>>>>>>>>> + fsz = i_size_read(file_inode(file));
>>>>>>>>>>> + if (fsz < batch) {
>>>>>>>>>>> + ret = -EINVAL;
>>>>>>>>>>> + goto err;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + /**
>>>>>>>>>>> + * SELinux may block the kthread's own read, although we are
>>>>>>>>>>> actually reading
>>>>>>>>>>> + * on behalf of the caller.
>>>>>>>>>>> + * So save the caller's cred here; the read kthread overrides
>>>>>>>>>>> its cred before
>>>>>>>>>>> + * the read and reverts it when the read ends.
>>>>>>>>>>> + */
>>>>>>>>>>> + heap_file->cred = prepare_kernel_cred(current);
>>>>>>>>>>> + if (unlikely(!heap_file->cred)) {
>>>>>>>>>>> + ret = -ENOMEM;
>>>>>>>>>>> + goto err;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + heap_file->file = file;
>>>>>>>>>>> + heap_file->max_batch = batch;
>>>>>>>>>>> + heap_file->fsz = fsz;
>>>>>>>>>>> +
>>>>>>>>>>> + heap_file->direct = file->f_flags & O_DIRECT;
>>>>>>>>>>> +
>>>>>>>>>>> +#define DMA_HEAP_SUGGEST_DIRECT_IO_SIZE (1UL << 30)
>>>>>>>>>>> + if (!heap_file->direct && fsz >=
>>>>>>>>>>> DMA_HEAP_SUGGEST_DIRECT_IO_SIZE)
>>>>>>>>>>> + pr_warn("alloc read file: better to use O_DIRECT to
>>>>>>>>>>> read a large file\n");
>>>>>>>>>>> +
>>>>>>>>>>> + return 0;
>>>>>>>>>>> +
>>>>>>>>>>> +err:
>>>>>>>>>>> + fput(file);
>>>>>>>>>>> + return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static void destroy_dma_heap_file(struct dma_heap_file
>>>>>>>>>>> *heap_file)
>>>>>>>>>>> +{
>>>>>>>>>>> + fput(heap_file->file);
>>>>>>>>>>> + put_cred(heap_file->cred);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static int dma_heap_buffer_alloc_read_file(struct dma_heap
>>>>>>>>>>> *heap, int file_fd,
>>>>>>>>>>> + size_t batch, unsigned int fd_flags,
>>>>>>>>>>> + unsigned int heap_flags)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct dma_buf *dmabuf;
>>>>>>>>>>> + int fd;
>>>>>>>>>>> + struct dma_heap_file heap_file;
>>>>>>>>>>> +
>>>>>>>>>>> + fd = prepare_dma_heap_file(&heap_file, file_fd, batch);
>>>>>>>>>>> + if (fd)
>>>>>>>>>>> + goto error_file;
>>>>>>>>>>> +
>>>>>>>>>>> + dmabuf = heap->ops->allocate_read_file(heap,
>>>>>>>>>>> &heap_file, fd_flags,
>>>>>>>>>>> + heap_flags);
>>>>>>>>>>> + if (IS_ERR(dmabuf)) {
>>>>>>>>>>> + fd = PTR_ERR(dmabuf);
>>>>>>>>>>> + goto error;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + fd = dma_buf_fd(dmabuf, fd_flags);
>>>>>>>>>>> + if (fd < 0) {
>>>>>>>>>>> + dma_buf_put(dmabuf);
>>>>>>>>>>> + /* just return, as put will call release and that
>>>>>>>>>>> will free */
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> +error:
>>>>>>>>>>> + destroy_dma_heap_file(&heap_file);
>>>>>>>>>>> +error_file:
>>>>>>>>>>> + return fd;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> static int dma_heap_buffer_alloc(struct dma_heap *heap,
>>>>>>>>>>> size_t len,
>>>>>>>>>>> u32 fd_flags,
>>>>>>>>>>> u64 heap_flags)
>>>>>>>>>>> @@ -93,6 +545,38 @@ static int dma_heap_open(struct inode
>>>>>>>>>>> *inode, struct file *file)
>>>>>>>>>>> return 0;
>>>>>>>>>>> }
>>>>>>>>>>> +static long dma_heap_ioctl_allocate_read_file(struct file
>>>>>>>>>>> *file, void *data)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct dma_heap_allocation_file_data
>>>>>>>>>>> *heap_allocation_file = data;
>>>>>>>>>>> + struct dma_heap *heap = file->private_data;
>>>>>>>>>>> + int fd;
>>>>>>>>>>> +
>>>>>>>>>>> + if (heap_allocation_file->fd ||
>>>>>>>>>>> !heap_allocation_file->file_fd)
>>>>>>>>>>> + return -EINVAL;
>>>>>>>>>>> +
>>>>>>>>>>> + if (heap_allocation_file->fd_flags &
>>>>>>>>>>> ~DMA_HEAP_VALID_FD_FLAGS)
>>>>>>>>>>> + return -EINVAL;
>>>>>>>>>>> +
>>>>>>>>>>> + if (heap_allocation_file->heap_flags &
>>>>>>>>>>> ~DMA_HEAP_VALID_HEAP_FLAGS)
>>>>>>>>>>> + return -EINVAL;
>>>>>>>>>>> +
>>>>>>>>>>> + if (!heap->ops->allocate_read_file)
>>>>>>>>>>> + return -EINVAL;
>>>>>>>>>>> +
>>>>>>>>>>> + fd = dma_heap_buffer_alloc_read_file(
>>>>>>>>>>> + heap, heap_allocation_file->file_fd,
>>>>>>>>>>> + heap_allocation_file->batch ?
>>>>>>>>>>> + PAGE_ALIGN(heap_allocation_file->batch) :
>>>>>>>>>>> + DEFAULT_ADI_BATCH,
>>>>>>>>>>> + heap_allocation_file->fd_flags,
>>>>>>>>>>> + heap_allocation_file->heap_flags);
>>>>>>>>>>> + if (fd < 0)
>>>>>>>>>>> + return fd;
>>>>>>>>>>> +
>>>>>>>>>>> + heap_allocation_file->fd = fd;
>>>>>>>>>>> + return 0;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> static long dma_heap_ioctl_allocate(struct file *file,
>>>>>>>>>>> void *data)
>>>>>>>>>>> {
>>>>>>>>>>> struct dma_heap_allocation_data *heap_allocation = data;
>>>>>>>>>>> @@ -121,6 +605,7 @@ static long
>>>>>>>>>>> dma_heap_ioctl_allocate(struct file *file, void *data)
>>>>>>>>>>> static unsigned int dma_heap_ioctl_cmds[] = {
>>>>>>>>>>> DMA_HEAP_IOCTL_ALLOC,
>>>>>>>>>>> + DMA_HEAP_IOCTL_ALLOC_AND_READ,
>>>>>>>>>>> };
>>>>>>>>>>> static long dma_heap_ioctl(struct file *file, unsigned
>>>>>>>>>>> int ucmd,
>>>>>>>>>>> @@ -170,6 +655,9 @@ static long dma_heap_ioctl(struct file
>>>>>>>>>>> *file, unsigned int ucmd,
>>>>>>>>>>> case DMA_HEAP_IOCTL_ALLOC:
>>>>>>>>>>> ret = dma_heap_ioctl_allocate(file, kdata);
>>>>>>>>>>> break;
>>>>>>>>>>> + case DMA_HEAP_IOCTL_ALLOC_AND_READ:
>>>>>>>>>>> + ret = dma_heap_ioctl_allocate_read_file(file, kdata);
>>>>>>>>>>> + break;
>>>>>>>>>>> default:
>>>>>>>>>>> ret = -ENOTTY;
>>>>>>>>>>> goto err;
>>>>>>>>>>> @@ -316,11 +804,44 @@ static int dma_heap_init(void)
>>>>>>>>>>> dma_heap_class = class_create(DEVNAME);
>>>>>>>>>>> if (IS_ERR(dma_heap_class)) {
>>>>>>>>>>> - unregister_chrdev_region(dma_heap_devt, NUM_HEAP_MINORS);
>>>>>>>>>>> - return PTR_ERR(dma_heap_class);
>>>>>>>>>>> + ret = PTR_ERR(dma_heap_class);
>>>>>>>>>>> + goto fail_class;
>>>>>>>>>>> }
>>>>>>>>>>> dma_heap_class->devnode = dma_heap_devnode;
>>>>>>>>>>> + heap_fctl = kzalloc(sizeof(*heap_fctl), GFP_KERNEL);
>>>>>>>>>>> + if (unlikely(!heap_fctl)) {
>>>>>>>>>>> + ret = -ENOMEM;
>>>>>>>>>>> + goto fail_alloc;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + INIT_LIST_HEAD(&heap_fctl->works);
>>>>>>>>>>> + init_waitqueue_head(&heap_fctl->threadwq);
>>>>>>>>>>> + init_waitqueue_head(&heap_fctl->workwq);
>>>>>>>>>>> +
>>>>>>>>>>> + heap_fctl->work_thread =
>>>>>>>>>>> kthread_run(dma_heap_file_control_thread,
>>>>>>>>>>> + heap_fctl, "heap_fwork_t");
>>>>>>>>>>> + if (IS_ERR(heap_fctl->work_thread)) {
>>>>>>>>>>> + ret = -ENOMEM;
>>>>>>>>>>> + goto fail_thread;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + heap_fctl->heap_fwork_cachep =
>>>>>>>>>>> KMEM_CACHE(dma_heap_file_work, 0);
>>>>>>>>>>> + if (unlikely(!heap_fctl->heap_fwork_cachep)) {
>>>>>>>>>>> + ret = -ENOMEM;
>>>>>>>>>>> + goto fail_cache;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> return 0;
>>>>>>>>>>> +
>>>>>>>>>>> +fail_cache:
>>>>>>>>>>> + kthread_stop(heap_fctl->work_thread);
>>>>>>>>>>> +fail_thread:
>>>>>>>>>>> + kfree(heap_fctl);
>>>>>>>>>>> +fail_alloc:
>>>>>>>>>>> + class_destroy(dma_heap_class);
>>>>>>>>>>> +fail_class:
>>>>>>>>>>> + unregister_chrdev_region(dma_heap_devt, NUM_HEAP_MINORS);
>>>>>>>>>>> + return ret;
>>>>>>>>>>> }
>>>>>>>>>>> subsys_initcall(dma_heap_init);
>>>>>>>>>>> diff --git a/include/linux/dma-heap.h
>>>>>>>>>>> b/include/linux/dma-heap.h
>>>>>>>>>>> index 064bad725061..9c25383f816c 100644
>>>>>>>>>>> --- a/include/linux/dma-heap.h
>>>>>>>>>>> +++ b/include/linux/dma-heap.h
>>>>>>>>>>> @@ -12,12 +12,17 @@
>>>>>>>>>>> #include <linux/cdev.h>
>>>>>>>>>>> #include <linux/types.h>
>>>>>>>>>>> +#define DEFAULT_ADI_BATCH (128 << 20)
>>>>>>>>>>> +
>>>>>>>>>>> struct dma_heap;
>>>>>>>>>>> +struct dma_heap_file_task;
>>>>>>>>>>> +struct dma_heap_file;
>>>>>>>>>>> /**
>>>>>>>>>>> * struct dma_heap_ops - ops to operate on a given heap
>>>>>>>>>>> * @allocate: allocate dmabuf and return struct
>>>>>>>>>>> dma_buf ptr
>>>>>>>>>>> - *
>>>>>>>>>>> + * @allocate_read_file: allocate dmabuf and read file, then
>>>>>>>>>>> return struct
>>>>>>>>>>> + * dma_buf ptr.
>>>>>>>>>>> * allocate returns dmabuf on success, ERR_PTR(-errno) on
>>>>>>>>>>> error.
>>>>>>>>>>> */
>>>>>>>>>>> struct dma_heap_ops {
>>>>>>>>>>> @@ -25,6 +30,11 @@ struct dma_heap_ops {
>>>>>>>>>>> unsigned long len,
>>>>>>>>>>> u32 fd_flags,
>>>>>>>>>>> u64 heap_flags);
>>>>>>>>>>> +
>>>>>>>>>>> + struct dma_buf *(*allocate_read_file)(struct dma_heap
>>>>>>>>>>> *heap,
>>>>>>>>>>> + struct dma_heap_file *heap_file,
>>>>>>>>>>> + u32 fd_flags,
>>>>>>>>>>> + u64 heap_flags);
>>>>>>>>>>> };
>>>>>>>>>>> /**
>>>>>>>>>>> @@ -65,4 +75,49 @@ const char *dma_heap_get_name(struct
>>>>>>>>>>> dma_heap *heap);
>>>>>>>>>>> */
>>>>>>>>>>> struct dma_heap *dma_heap_add(const struct
>>>>>>>>>>> dma_heap_export_info *exp_info);
>>>>>>>>>>> +/**
>>>>>>>>>>> + * dma_heap_destroy_file_read - waits for a file read to
>>>>>>>>>>> complete then destroy it
>>>>>>>>>>> + * Returns: true if the file read failed, false otherwise
>>>>>>>>>>> + */
>>>>>>>>>>> +bool dma_heap_destroy_file_read(struct dma_heap_file_task
>>>>>>>>>>> *heap_ftask);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * dma_heap_wait_for_file_read - waits for a file read to
>>>>>>>>>>> complete
>>>>>>>>>>> + * Returns: true if the file read failed, false otherwise
>>>>>>>>>>> + */
>>>>>>>>>>> +bool dma_heap_wait_for_file_read(struct dma_heap_file_task
>>>>>>>>>>> *heap_ftask);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * dma_heap_declare_file_read - declare a task to read a file
>>>>>>>>>>> while allocating pages.
>>>>>>>>>>> + * @heap_file: target file to read
>>>>>>>>>>> + *
>>>>>>>>>>> + * Return NULL if failed, otherwise return a struct pointer.
>>>>>>>>>>> + */
>>>>>>>>>>> +struct dma_heap_file_task *
>>>>>>>>>>> +dma_heap_declare_file_read(struct dma_heap_file *heap_file);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * dma_heap_prepare_file_read - cache each allocated page
>>>>>>>>>>> until we meet this batch.
>>>>>>>>>>> + * @heap_ftask: prepared and need to commit's work.
>>>>>>>>>>> + * @page: current allocated page. don't care which
>>>>>>>>>>> order.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Returns true once the batch is reached, false to keep preparing.
>>>>>>>>>>> + */
>>>>>>>>>>> +bool dma_heap_prepare_file_read(struct dma_heap_file_task
>>>>>>>>>>> *heap_ftask,
>>>>>>>>>>> + struct page *page);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * dma_heap_submit_file_read - enough memory has been
>>>>>>>>>>> collected, trigger the IO
>>>>>>>>>>> + * @heap_ftask: info that current IO needs
>>>>>>>>>>> + *
>>>>>>>>>>> + * This submit also checks whether the tail of the read has been reached.
>>>>>>>>>>> + * For direct I/O submissions, it is necessary to pay
>>>>>>>>>>> attention to file reads
>>>>>>>>>>> + * that are not page-aligned. For the unaligned portion of
>>>>>>>>>>> the read, buffer IO
>>>>>>>>>>> + * needs to be triggered.
>>>>>>>>>>> + * Returns:
>>>>>>>>>>> + * 0 if all right, -errno if something wrong
>>>>>>>>>>> + */
>>>>>>>>>>> +int dma_heap_submit_file_read(struct dma_heap_file_task
>>>>>>>>>>> *heap_ftask);
>>>>>>>>>>> +size_t dma_heap_file_size(struct dma_heap_file *heap_file);
>>>>>>>>>>> +
>>>>>>>>>>> #endif /* _DMA_HEAPS_H */
>>>>>>>>>>> diff --git a/include/uapi/linux/dma-heap.h
>>>>>>>>>>> b/include/uapi/linux/dma-heap.h
>>>>>>>>>>> index a4cf716a49fa..8c20e8b74eed 100644
>>>>>>>>>>> --- a/include/uapi/linux/dma-heap.h
>>>>>>>>>>> +++ b/include/uapi/linux/dma-heap.h
>>>>>>>>>>> @@ -39,6 +39,27 @@ struct dma_heap_allocation_data {
>>>>>>>>>>> __u64 heap_flags;
>>>>>>>>>>> };
>>>>>>>>>>> +/**
>>>>>>>>>>> + * struct dma_heap_allocation_file_data - metadata passed
>>>>>>>>>>> from userspace for
>>>>>>>>>>> + * allocations and read file
>>>>>>>>>>> + * @fd: will be populated with a fd which
>>>>>>>>>>> provides the
>>>>>>>>>>> + * handle to the allocated dma-buf
>>>>>>>>>>> + * @file_fd: file descriptor to read from(suggested
>>>>>>>>>>> to use O_DIRECT open file)
>>>>>>>>>>> + * @batch: how much memory to allocate before each file
>>>>>>>>>>> read (bytes), default 128MB;
>>>>>>>>>>> + * will be auto-aligned to PAGE_SIZE
>>>>>>>>>>> + * @fd_flags: file descriptor flags used when
>>>>>>>>>>> allocating
>>>>>>>>>>> + * @heap_flags: flags passed to heap
>>>>>>>>>>> + *
>>>>>>>>>>> + * Provided by userspace as an argument to the ioctl
>>>>>>>>>>> + */
>>>>>>>>>>> +struct dma_heap_allocation_file_data {
>>>>>>>>>>> + __u32 fd;
>>>>>>>>>>> + __u32 file_fd;
>>>>>>>>>>> + __u32 batch;
>>>>>>>>>>> + __u32 fd_flags;
>>>>>>>>>>> + __u64 heap_flags;
>>>>>>>>>>> +};
>>>>>>>>>>> +
>>>>>>>>>>> #define DMA_HEAP_IOC_MAGIC 'H'
>>>>>>>>>>> /**
>>>>>>>>>>> @@ -50,4 +71,15 @@ struct dma_heap_allocation_data {
>>>>>>>>>>> #define DMA_HEAP_IOCTL_ALLOC _IOWR(DMA_HEAP_IOC_MAGIC, 0x0,\
>>>>>>>>>>> struct dma_heap_allocation_data)
>>>>>>>>>>> +/**
>>>>>>>>>>> + * DOC: DMA_HEAP_IOCTL_ALLOC_AND_READ - allocate memory
>>>>>>>>>>> from the pool and
>>>>>>>>>>> + * read the file while allocating.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Takes a dma_heap_allocation_file_data struct and returns
>>>>>>>>>>> it with the fd field
>>>>>>>>>>> populated with the dmabuf handle of the allocation. On
>>>>>>>>>>> return, the dma-buf
>>>>>>>>>>> + * content has already been read from the file.
>>>>>>>>>>> + */
>>>>>>>>>>> +#define DMA_HEAP_IOCTL_ALLOC_AND_READ \
>>>>>>>>>>> + _IOWR(DMA_HEAP_IOC_MAGIC, 0x1, struct
>>>>>>>>>>> dma_heap_allocation_file_data)
>>>>>>>>>>> +
>>>>>>>>>>> #endif /* _UAPI_LINUX_DMABUF_POOL_H */
>>>>>>>>>>
>>>>>>>>
>>>>>
>>>
>