linux-kernel - Re: [PATCH] mm: memcontrol: fix GFP_NOFS recursion in memory.high enforcement

Message-ID: <CALvZod7dzQh9O0CE9Xfu0n6m0dNjH33QQt9sCiQ8scDR1G4CKw@mail.gmail.com>
Date:   Thu, 14 Sep 2023 08:59:05 -0700
From:   Shakeel Butt <shakeelb@...gle.com>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Breno Leitao <leitao@...ian.org>,
        Josef Bacik <josef@...icpanda.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: memcontrol: fix GFP_NOFS recursion in memory.high enforcement

On Thu, Sep 14, 2023 at 8:21 AM Johannes Weiner <hannes@...xchg.org> wrote:
>
> Breno and Josef report a deadlock scenario from cgroup reclaim
> re-entering the filesystem:
>
> [  361.546690] ======================================================
> [  361.559210] WARNING: possible circular locking dependency detected
> [  361.571703] 6.5.0-0_fbk700_debug_rc0_kbuilder_13159_gbf787a128001 #1 Tainted: G S          E
> [  361.589704] ------------------------------------------------------
> [  361.602277] find/9315 is trying to acquire lock:
> [  361.611625] ffff88837ba140c0 (&delayed_node->mutex){+.+.}-{4:4}, at: __btrfs_release_delayed_node+0x68/0x4f0
> [  361.631437]
> [  361.631437] but task is already holding lock:
> [  361.643243] ffff8881765b8678 (btrfs-tree-01){++++}-{4:4}, at: btrfs_tree_read_lock+0x1e/0x40
>
> [  362.904457]  mutex_lock_nested+0x1c/0x30
> [  362.912414]  __btrfs_release_delayed_node+0x68/0x4f0
> [  362.922460]  btrfs_evict_inode+0x301/0x770
> [  362.982726]  evict+0x17c/0x380
> [  362.988944]  prune_icache_sb+0x100/0x1d0
> [  363.005559]  super_cache_scan+0x1f8/0x260
> [  363.013695]  do_shrink_slab+0x2a2/0x540
> [  363.021489]  shrink_slab_memcg+0x237/0x3d0
> [  363.050606]  shrink_slab+0xa7/0x240
> [  363.083382]  shrink_node_memcgs+0x262/0x3b0
> [  363.091870]  shrink_node+0x1a4/0x720
> [  363.099150]  shrink_zones+0x1f6/0x5d0
> [  363.148798]  do_try_to_free_pages+0x19b/0x5e0
> [  363.157633]  try_to_free_mem_cgroup_pages+0x266/0x370
> [  363.190575]  reclaim_high+0x16f/0x1f0
> [  363.208409]  mem_cgroup_handle_over_high+0x10b/0x270
> [  363.246678]  try_charge_memcg+0xaf2/0xc70
> [  363.304151]  charge_memcg+0xf0/0x350
> [  363.320070]  __mem_cgroup_charge+0x28/0x40
> [  363.328371]  __filemap_add_folio+0x870/0xd50
> [  363.371303]  filemap_add_folio+0xdd/0x310
> [  363.399696]  __filemap_get_folio+0x2fc/0x7d0
> [  363.419086]  pagecache_get_page+0xe/0x30
> [  363.427048]  alloc_extent_buffer+0x1cd/0x6a0
> [  363.435704]  read_tree_block+0x43/0xc0
> [  363.443316]  read_block_for_search+0x361/0x510
> [  363.466690]  btrfs_search_slot+0xc8c/0x1520
>
> This is caused by mem_cgroup_handle_over_high() not respecting the
> gfp_mask of the allocation context. We used to only call this function
> on resume to userspace, where no locks were held. But c9afe31ec443
> ("memcg: synchronously enforce memory.high for large overcharges")
> added a call from the allocation context without considering the gfp
> mask.
>
> Reported-by: Breno Leitao <leitao@...ian.org>
> Reported-by: Josef Bacik <josef@...icpanda.com>
> Fixes: c9afe31ec443 ("memcg: synchronously enforce memory.high for large overcharges")
> Cc: stable@...r.kernel.org # 5.17+
> Signed-off-by: Johannes Weiner <hannes@...xchg.org>

Acked-by: Shakeel Butt <shakeelb@...gle.com>
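
For readers following the thread, here is a minimal userspace sketch of the problem described in the quoted changelog. It is not the patch itself and not kernel code: every name prefixed with sketch_ or SKETCH_ is hypothetical, and the flag values are made up for illustration. The only assumption it models is that a filesystem shrinker refuses to run when the reclaim context lacks the "may enter the filesystem" bit, so reclaim triggered from a GFP_NOFS allocation has to carry the caller's mask rather than a fixed permissive one.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for gfp bits; the values are illustrative only. */
#define SKETCH_GFP_DIRECT_RECLAIM 0x1u
#define SKETCH_GFP_IO             0x2u
#define SKETCH_GFP_FS             0x4u
#define SKETCH_GFP_KERNEL \
	(SKETCH_GFP_DIRECT_RECLAIM | SKETCH_GFP_IO | SKETCH_GFP_FS)
#define SKETCH_GFP_NOFS \
	(SKETCH_GFP_DIRECT_RECLAIM | SKETCH_GFP_IO)

/* Hypothetical reclaim context: only the mask matters for this sketch. */
struct sketch_scan_control {
	unsigned int gfp_mask;
};

/*
 * Models a filesystem shrinker's entry check: it may only evict inodes
 * (and take filesystem locks) when the reclaim context allows FS entry.
 */
static bool sketch_fs_shrinker_may_run(const struct sketch_scan_control *sc)
{
	return (sc->gfp_mask & SKETCH_GFP_FS) != 0;
}

/*
 * Problematic shape: reclaim always uses a permissive mask, so a caller
 * that allocated with NOFS (because it holds FS locks) still ends up
 * running the filesystem shrinker.
 */
static bool sketch_reclaim_high_ignoring_gfp(unsigned int caller_gfp)
{
	struct sketch_scan_control sc = { .gfp_mask = SKETCH_GFP_KERNEL };

	(void)caller_gfp; /* the caller's constraints are dropped here */
	return sketch_fs_shrinker_may_run(&sc);
}

/*
 * Safer shape: the caller's mask is propagated into the reclaim context,
 * so the filesystem shrinker is skipped for NOFS allocations.
 */
static bool sketch_reclaim_high_with_gfp(unsigned int caller_gfp)
{
	struct sketch_scan_control sc = { .gfp_mask = caller_gfp };

	return sketch_fs_shrinker_may_run(&sc);
}
```

In this model, sketch_reclaim_high_ignoring_gfp(SKETCH_GFP_NOFS) reports that the filesystem shrinker would run, which corresponds to the btrfs_evict_inode() path in the trace being reached while btrfs-tree-01 is held. sketch_reclaim_high_with_gfp(SKETCH_GFP_NOFS) reports that it would be skipped.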