linux-kernel - Re: [PATCH V3 4/5] ext4: get discard out of jbd2 commit kthread contex

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <6621227f-e710-2fa2-bc09-c4c66c34eb14@gmail.com>
Date:   Thu, 26 Aug 2021 15:15:55 +0800
From:   Wang Jianchao <jianchao.wan9@...il.com>
To:     Jan Kara <jack@...e.cz>
Cc:     linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
        tytso@....edu, adilger.kernel@...ger.ca
Subject: Re: [PATCH V3 4/5] ext4: get discard out of jbd2 commit kthread
 contex



On 2021/8/4 11:45 PM, Jan Kara wrote:
> On Sat 24-07-21 15:41:23, Wang Jianchao wrote:
>> From: Wang Jianchao <wangjianchao@...ishou.com>
>>
>> Right now, discard is issued and waited to be completed in jbd2
>> commit kthread context after the logs are committed. When large
>> amount of files are deleted and discard is flooding, jbd2 commit
>> kthread can be blocked for long time. Then all of the metadata
>> operations can be blocked to wait the log space.
>>
>> One case is the page fault path with read mm->mmap_sem held, which
>> wants to update the file time but has to wait for the log space.
>> When other threads in the task wants to do mmap, then write mmap_sem
>> is blocked. Finally all of the following read mmap_sem requirements
>> are blocked, even the ps command which need to read the /proc/pid/
>> -cmdline. Our monitor service which needs to read /proc/pid/cmdline
>> used to be blocked for 5 mins.
>>
>> This patch frees the blocks back to buddy after commit and then do
>> discard in a async kworker context in fstrim fashion, namely,
>>  - mark blocks to be discarded as used if they have not been allocated
>>  - do discard
>>  - mark them free
>> After this, jbd2 commit kthread won't be blocked any more by discard
>> and we won't get NOSPC even if the discard is slow or throttled.
>>
>> Link: https://marc.info/?l=linux-kernel&m=162143690731901&w=2
>> Suggested-by: Theodore Ts'o <tytso@....edu>
>> Signed-off-by: Wang Jianchao <wangjianchao@...ishou.com>
> 
> Looks good to me. Just one small comment below. With that addressed feel
> free to add:
> 
> Reviewed-by: Jan Kara <jack@...e.cz>
> 
> 
>> @@ -3474,6 +3530,14 @@ int ext4_mb_release(struct super_block *sb)
>>  	struct kmem_cache *cachep = get_groupinfo_cache(sb->s_blocksize_bits);
>>  	int count;
>>  
>> +	if (test_opt(sb, DISCARD)) {
>> +		/*
>> +		 * wait the discard work to drain all of ext4_free_data
>> +		 */
>> +		queue_work(ext4_discard_wq, &sbi->s_discard_work);
> 
> Do we really need to queue the work here? The filesystem should be
> quiescent by now, we take care to queue the work whenever we add item to
> empty list. So it should be enough to have flush_work() here and then
> possibly
> 
> 	WARN_ON_ONCE(!list_empty(&sbi->s_discard_list))
> 
> Or am I missing something?

queue_work here is indeed redundant.

Thanks so much for you point out this.
Jianchao

> 
> 								Honza
> 
>> +		flush_work(&sbi->s_discard_work);
>> +	}
>> +