linux-kernel - Re: [RFC PATCH] jffs2: fix recursive fs

Message-ID: <ff19df82-3fd4-9098-667c-0502b2452334@huawei.com>
Date: Fri, 15 Mar 2024 19:19:32 +0800
From: Zhihao Cheng <chengzhihao1@...wei.com>
To: Qingfang Deng <dqfext@...il.com>, David Woodhouse <dwmw2@...radead.org>,
	Richard Weinberger <richard@....at>, <linux-mtd@...ts.infradead.org>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] jffs2: fix recursive fs_reclaim deadlock

On 2024/3/15 15:59, Qingfang Deng wrote:
> When testing jffs2 on a memory-constrained system, lockdep detected a
> possible circular locking dependency.
> 
> kswapd0/266 is trying to acquire lock:
> ffffff802865e508 (&f->sem){+.+.}-{3:3}, at: jffs2_do_clear_inode+0x44/0x200
> 
> but task is already holding lock:
> ffffffd010e843c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x40
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (fs_reclaim){+.+.}-{0:0}:
>         lock_acquire+0x6c/0x90
>         fs_reclaim_acquire+0x7c/0xa0
>         kmem_cache_alloc+0x5c/0x400
>         jffs2_alloc_inode_cache+0x18/0x20
>         jffs2_do_read_inode+0x1e0/0x310
>         jffs2_iget+0x154/0x540
>         jffs2_do_fill_super+0x214/0x3f0
>         jffs2_fill_super+0x138/0x180
>         mtd_get_sb+0xcc/0x120
>         get_tree_mtd+0x168/0x400
>         jffs2_get_tree+0x14/0x20
>         vfs_get_tree+0x48/0x130
>         path_mount+0xa64/0x12d0
>         __arm64_sys_mount+0x368/0x3e0
>         do_el0_svc+0xa0/0x140
>         el0_svc+0x1c/0x30
>         el0_sync_handler+0x9c/0x120
>         el0_sync+0x148/0x180
> 
> -> #0 (&f->sem){+.+.}-{3:3}:
>         __lock_acquire+0x18cc/0x2bb0
>         lock_acquire.part.0+0x170/0x2e0
>         lock_acquire+0x6c/0x90
>         __mutex_lock+0x10c/0xaa0
>         mutex_lock_nested+0x54/0x80
>         jffs2_do_clear_inode+0x44/0x200
>         jffs2_evict_inode+0x44/0x50
>         evict+0x120/0x290
>         dispose_list+0x88/0xd0
>         prune_icache_sb+0xa8/0xd0
>         super_cache_scan+0x1c4/0x240
>         shrink_slab.constprop.0+0x2a0/0x7f0
>         shrink_node+0x398/0x8e0
>         balance_pgdat+0x268/0x550
>         kswapd+0x154/0x7c0
>         kthread+0x1f0/0x200
>         ret_from_fork+0x10/0x20
> 
I think this is a false-positive warning. In chain '#1', jffs2 is
reading the root inode, which means that the filesystem is not mounted
yet. Because d_make_root() is called after jffs2_iget(sb, 1), there is
no way to access any other inode at that point, so it is impossible for
a jffs2 inode to be evicted in '#0' at the same time.
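
To illustrate why lockdep can still complain in this situation, here is a
minimal userspace sketch of class-based dependency tracking. It is a toy
model and not kernel code: the names lock_class, dep, reachable() and
record_dependency() are made up for illustration. The only point it shows is
that the check works on lock classes, so it cannot tell that the inode locked
during mount is not one that reclaim could evict.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: each lock class is a small integer. */
enum lock_class { FS_RECLAIM, F_SEM, NR_CLASSES };

/* dep[a][b] is true when class b was once acquired while class a was held. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++)
        if (dep[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/*
 * Record "acquired `inner` while holding `outer`".
 * Returns false, without recording the edge, when the reverse order
 * inner -> ... -> outer is already known (a possible circular dependency).
 */
static bool record_dependency(int outer, int inner)
{
    bool seen[NR_CLASSES] = { false };

    assert(outer >= 0 && outer < NR_CLASSES);
    assert(inner >= 0 && inner < NR_CLASSES);

    if (reachable(inner, outer, seen))
        return false;
    dep[outer][inner] = true;
    return true;
}
```

In this model, record_dependency(F_SEM, FS_RECLAIM) stands for the mount path
in '#1' and succeeds. A later record_dependency(FS_RECLAIM, F_SEM), standing
for the kswapd path in '#0', is rejected, even though the two paths may refer
to different inodes.
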
> other info that might help us debug this:
> 
>   Possible unsafe locking scenario:
> 
>         CPU0                    CPU1
>         ----                    ----
>    lock(fs_reclaim);
>                                 lock(&f->sem);
>                                 lock(fs_reclaim);
>    lock(&f->sem);
> 
>   *** DEADLOCK ***
> 
> 3 locks held by kswapd0/266:
>   #0: ffffffd010e843c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x40
>   #1: ffffffd010e62eb0 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab.constprop.0+0x78/0x7f0
>   #2: ffffff80225340e0 (&type->s_umount_key#40){.+.+}-{3:3}, at: super_cache_scan+0x3c/0x240
> 
> It turns out jffs2 uses GFP_KERNEL as the memory allocation flags
> throughout the code, and commonly, inside the critical section of
> jffs2_inode_info::sem. When running low on memory, any allocation within
> the critical section may trigger a direct reclaim, which recurses back
> to jffs2_do_clear_inode().
> 
> Replace GFP_KERNEL with GFP_NOFS to avoid the recursion.
> 
> Signed-off-by: Qingfang Deng <dqfext@...il.com>
> ---
> XXX: Posting this as RFC, as I don't know if all GFP_KERNEL occurrences
> should be replaced, or if this is just a false positive.
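
For reference, the difference between the two flags that the patch
description relies on can be sketched in userspace as follows. This is a toy
model under stated assumptions, not the kernel allocator: the TOY_GFP_* bits,
struct toy_inode_info and toy_alloc_may_recurse() are hypothetical names. It
models the worst case assumed by the patch description, where reclaim reaches
an inode whose sem the allocating task already holds. Whether that case is
actually reachable in jffs2 is the open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's __GFP_RECLAIM, __GFP_IO and __GFP_FS bits. */
#define TOY_GFP_RECLAIM 0x1u
#define TOY_GFP_IO      0x2u
#define TOY_GFP_FS      0x4u

/* As in the kernel, GFP_NOFS is GFP_KERNEL without the FS bit. */
#define TOY_GFP_KERNEL (TOY_GFP_RECLAIM | TOY_GFP_IO | TOY_GFP_FS)
#define TOY_GFP_NOFS   (TOY_GFP_RECLAIM | TOY_GFP_IO)

/* Hypothetical stand-in for jffs2_inode_info: only tracks whether sem is held. */
struct toy_inode_info {
    bool sem_held;
};

/*
 * Returns true when an allocation made with `gfp` could recurse into
 * filesystem reclaim while the caller already holds the inode's sem.
 */
static bool toy_alloc_may_recurse(const struct toy_inode_info *f,
                                  unsigned int gfp, bool low_memory)
{
    assert(f != NULL);

    /* No memory pressure, or reclaim not allowed: nothing to recurse into. */
    if (!low_memory || !(gfp & TOY_GFP_RECLAIM))
        return false;
    /* Without the FS bit, reclaim must not call back into the filesystem. */
    if (!(gfp & TOY_GFP_FS))
        return false;
    /* Filesystem reclaim may evict an inode and try to take its sem. */
    return f->sem_held;
}
```

With sem_held set and low_memory true, the model reports a possible recursion
for TOY_GFP_KERNEL and none for TOY_GFP_NOFS, which is the effect the patch
aims for.
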