Message-ID: <1d03de75-26c4-4a58-af46-dafb319bed89@kernel.org>
Date: Fri, 25 Jul 2025 10:37:42 +0800
From: Chao Yu <chao@...nel.org>
To: hanqi <hanqi@...o.com>, Jens Axboe <axboe@...nel.dk>, jaegeuk@...nel.org
Cc: chao@...nel.org, linux-f2fs-devel@...ts.sourceforge.net,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] f2fs: f2fs supports uncached buffered I/O
On 7/25/2025 9:44 AM, hanqi wrote:
>
>
> On 2025/7/24 21:09, Chao Yu wrote:
>> On 2025/7/16 16:27, hanqi wrote:
>>>
>>>
>>> On 2025/7/16 11:43, Jens Axboe wrote:
>>>> On 7/15/25 9:34 PM, hanqi wrote:
>>>>>
>>>>> On 2025/7/15 22:28, Jens Axboe wrote:
>>>>>> On 7/14/25 9:10 PM, Qi Han wrote:
>>>>>>> Jens has already completed the development of uncached buffered I/O
>>>>>>> in git [1], and in f2fs, the feature can be enabled simply by setting
>>>>>>> the FOP_DONTCACHE flag in f2fs_file_operations.
>>>>>> You need to ensure that for any DONTCACHE IO that the completion is
>>>>>> routed via non-irq context, if applicable. I didn't verify that this is
>>>>>> the case for f2fs. Generally you can deduce this as well through
>>>>>> testing, I'd say the following cases would be interesting to test:
>>>>>>
>>>>>> 1) Normal DONTCACHE buffered read
>>>>>> 2) Overwrite DONTCACHE buffered write
>>>>>> 3) Append DONTCACHE buffered write
>>>>>>
>>>>>> Test those with DEBUG_ATOMIC_SLEEP set in your config, and if that
>>>>>> doesn't complain, that's a great start.
>>>>>>
>>>>>> For the above test cases as well, verify that page cache doesn't grow as
>>>>>> IO is performed. A bit is fine for things like meta data, but generally
>>>>>> you want to see it remain basically flat in terms of page cache usage.
>>>>>>
>>>>>> Maybe this is all fine, like I said I didn't verify. Just mentioning it
>>>>>> for completeness sake.
>>>>> Hi, Jens
>>>>> Thanks for your suggestion. As I mentioned earlier in [1], in f2fs,
>>>>> the regular buffered write path invokes folio_end_writeback from a
>>>>> softirq context. Therefore, it seems that f2fs may not be suitable
>>>>> for DONTCACHE I/O writes.
>>>>>
>>>>> I'd like to ask a question: why is DONTCACHE I/O write restricted to
>>>>> non-interrupt context only? Is it because dropping the page might be
>>>>> too time-consuming to be done safely in interrupt context? This might
>>>>> be a naive question, but I'd really appreciate your clarification.
>>>>> Thanks in advance.
>>>> Because (as of right now, at least) the code doing the invalidation
>>>> needs process context. There are various reasons for this, which you'll
>>>> see if you follow the path off folio_end_writeback() ->
>>>> filemap_end_dropbehind_write() -> filemap_end_dropbehind() ->
>>>> folio_unmap_invalidate(). unmap_mapping_folio() is one case, and while
>>>> that may be doable, the inode i_lock is not IRQ safe.
>>>>
>>>> Most file systems have a need to punt some writeback completions to
>>>> non-irq context, eg for file extending etc. Hence for most file systems,
>>>> the dontcache case just becomes another case that needs to go through
>>>> that path.
>>>>
>>>> It'd certainly be possible to improve upon this, for example by having
>>>> an opportunistic dontcache unmap from IRQ/soft-irq context, and then
>>>> punting to a workqueue if that doesn't pan out. But this doesn't exist
>>>> as of yet, hence the need for the workqueue punt.
>>
>> Thanks Jens for the detailed explanation.
>>
>>>
>>> Hi, Jens
>>> Thank you for your response. I tested uncached buffered I/O reads with
>>> a 50GB dataset on a local F2FS filesystem, and the page cache size
>>> only increased slightly, which I believe aligns with expectations.
>>> After clearing the page cache, the page cache size returned to its
>>> initial state. The test results are as follows:
>>>
>>> stat 50G.txt
>>> File: 50G.txt
>>> Size: 53687091200 Blocks: 104960712 IO Blocks: 512 regular file
>>>
>>> [read before]:
>>> echo 3 > /proc/sys/vm/drop_caches
>>> 01:48:17 kbmemfree kbavail kbmemused %memused kbbuffers kbcached  kbcommit %commit kbactive kbinact kbdirty
>>> 01:50:59   6404648 8149508   2719384    23.40       512  1898092 199384760  823.75  1846756  466832      44
>>>
>>> ./uncached_io_test 8192 1 1 50G.txt
>>> Starting 1 threads
>>> reading bs 8192, uncached 1
>>> 1s: 754MB/sec, MB=754
>>> ...
>>> 64s: 844MB/sec, MB=262144
>>>
>>> [read after]:
>>> 01:52:33   6326664 8121240   2747968    23.65       728  1947656 199384788  823.75  1887896  502004      68
>>> echo 3 > /proc/sys/vm/drop_caches
>>> 01:53:11   6351136 8096936   2772400    23.86       512  1900500 199385216  823.75  1847252  533768     104
>>>
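
To put the sar figures above in proportion, the following sketch computes how much kbcached changed across the uncached read, relative to the file size reported by stat. The rows and the size are copied from the quoted output; the helper name parse_row is hypothetical and the column order is assumed to match the quoted sar header.

```python
# Column names from the sar header quoted in the thread (timestamp excluded).
FIELDS = ["kbmemfree", "kbavail", "kbmemused", "%memused", "kbbuffers",
          "kbcached", "kbcommit", "%commit", "kbactive", "kbinact", "kbdirty"]


def parse_row(row):
    """Map one sar data row (timestamp first) to {column: value}."""
    parts = row.split()
    return dict(zip(FIELDS, (float(v) for v in parts[1:])))


# Rows copied from the quoted test output.
before = parse_row("01:50:59 6404648 8149508 2719384 23.40 512 1898092 "
                   "199384760 823.75 1846756 466832 44")
after = parse_row("01:52:33 6326664 8121240 2747968 23.65 728 1947656 "
                  "199384788 823.75 1887896 502004 68")
dropped = parse_row("01:53:11 6351136 8096936 2772400 23.86 512 1900500 "
                    "199385216 823.75 1847252 533768 104")

file_kb = 53687091200 // 1024  # Size from the stat output, in kB

# Page cache growth during the read, and what remains after drop_caches.
growth_kb = int(after["kbcached"] - before["kbcached"])
residual_kb = int(dropped["kbcached"] - before["kbcached"])
pct = 100.0 * growth_kb / file_kb

print(f"growth during read: {growth_kb} kB ({pct:.2f}% of file size)")
print(f"residual after drop_caches: {residual_kb} kB")
```

With these numbers, kbcached grew by 49564 kB (about 48 MiB) during the read, which is under 0.1% of the 50 GiB file, and it sits 2408 kB above the starting value after drop_caches.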
>>> Hi Chao,
>>> Given that F2FS currently calls folio_end_writeback in the softirq
>>> context for normal write scenarios, could we first support uncached
>>> buffered I/O reads? For normal uncached buffered I/O writes, would it be
>>> feasible for F2FS to introduce an asynchronous workqueue to handle the
>>> page drop operation in the future? What are your thoughts on this?
>>
>> Qi,
>>
>> Sorry for the delay.
>>
>> I think it will be good to support uncached buffered I/O in the read path
>> first, and then take a look at what we can do for the write path. Anyway,
>> let's do this step by step.
>>
>> Can you please update the patch?
>> - support read path only
>> - include test data in commit message
> Chao
>
> I will re-submit a patch to first enable F2FS support for uncached
> buffered I/O reads. Following that, I will work on implementing
> asynchronous page dropping in F2FS.
Qi, sure, please go ahead, thanks for the work. :)
Thanks,
>
> Thank you!
>>
>>> Thank you!
>>>
>>>
>