linux-ext4 - Re: Potential Linux Crash: WARNING in ext4_dirty

Message-ID: <Z-6cS9Cg1eN0w6XL@tiehlicka>
Date: Thu, 3 Apr 2025 16:33:47 +0200
From: Michal Hocko <mhocko@...e.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Matt Fleming <matt@...dmodwrite.com>, willy@...radead.org,
	adilger.kernel@...ger.ca, akpm@...ux-foundation.org,
	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	luka.2016.cs@...il.com, tytso@....edu,
	Barry Song <baohua@...nel.org>, kernel-team@...udflare.com,
	Miklos Szeredi <miklos@...redi.hu>,
	Amir Goldstein <amir73il@...il.com>,
	Dave Chinner <david@...morbit.com>,
	Qi Zheng <zhengqi.arch@...edance.com>,
	Roman Gushchin <roman.gushchin@...ux.dev>,
	Muchun Song <muchun.song@...ux.dev>
Subject: Re: Potential Linux Crash: WARNING in ext4_dirty_folio in Linux
 kernel v6.13-rc5

On Thu 03-04-25 14:58:25, Vlastimil Babka wrote:
> On 4/3/25 14:29, Matt Fleming wrote:
> > On Wed, Mar 26, 2025 at 10:59 AM Matt Fleming <matt@...dmodwrite.com> wrote:
> >>
> >> Hi there,
> 
> + Cc also Michal
> 
> >> I'm also seeing this PF_MEMALLOC WARN triggered from kswapd in 6.12.19.
> 
> We're talking about __alloc_pages_slowpath() doing
> WARN_ON_ONCE(current->flags & PF_MEMALLOC); for __GFP_NOFAIL allocations.
> 
> kswapd() sets:
> 
> tsk->flags |= PF_MEMALLOC | PF_KSWAPD;
> 
> so any __GFP_NOFAIL allocation done in the kswapd context risks this
> warning. It's also objectively bad IMHO because for direct reclaim we can
> loop and hope kswapd rescues us, but kswapd would then have to rely on
> direct reclaimers to get unstuck. I don't see an easy generic solution?

Right. I do not think a NOFAIL request from the reclaim context is really
something we can commit to supporting. This really needs to be addressed on
the shrinker side.
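
For readers following along, the condition under discussion can be modelled
in userspace roughly as below. This is only a sketch: every sketch_* and
SKETCH_* name is hypothetical, the bit values are illustrative rather than
the kernel's, and the real check sits inside __alloc_pages_slowpath() with
more context around it.

```c
#include <stdbool.h>

/* Illustrative bit values only (assumption); the real PF_* and __GFP_*
 * definitions live in the kernel headers and differ from these. */
#define SKETCH_PF_MEMALLOC  (1u << 0)
#define SKETCH_PF_KSWAPD    (1u << 1)
#define SKETCH_GFP_NOFAIL   (1u << 0)

/* Hypothetical stand-in for the task flags consulted by the allocator. */
struct sketch_task {
	unsigned int flags;
};

/* Models what kswapd() does on entry: tsk->flags |= PF_MEMALLOC | PF_KSWAPD. */
static void sketch_enter_kswapd(struct sketch_task *tsk)
{
	tsk->flags |= SKETCH_PF_MEMALLOC | SKETCH_PF_KSWAPD;
}

/* Returns true when the modelled WARN_ON_ONCE(current->flags & PF_MEMALLOC)
 * would fire: a NOFAIL request issued by a task that has PF_MEMALLOC set. */
static bool sketch_nofail_would_warn(const struct sketch_task *tsk,
				     unsigned int gfp_mask)
{
	if (!(gfp_mask & SKETCH_GFP_NOFAIL))
		return false;	/* the check only concerns NOFAIL requests */
	return (tsk->flags & SKETCH_PF_MEMALLOC) != 0;
}
```

In this model a task that has gone through sketch_enter_kswapd() trips the
check on any request carrying SKETCH_GFP_NOFAIL, which mirrors the trace
below where the allocation is reached from kswapd via the shrinker.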

> >> Does overlayfs need some kind of background inode reclaim support?
> > 
> > Hey everyone, I know there was some off-list discussion last week at
> > LSFMM, but I don't think a definite solution has been proposed for the
> > below stacktrace.
> > 
> > What is the shrinker API policy wrt memory allocation and I/O? Should
> > overlayfs do something more like XFS and background reclaim to avoid
> > GFP_NOFAIL allocations when kswapd is shrinking caches?
> > 
> >>   Call Trace:
> >>    <TASK>
> >>    __alloc_pages_noprof+0x31c/0x330
> >>    alloc_pages_mpol_noprof+0xe3/0x1d0
> >>    folio_alloc_noprof+0x5b/0xa0
> >>    __filemap_get_folio+0x1f3/0x380
> >>    __getblk_slow+0xa3/0x1e0
> >>    __ext4_get_inode_loc+0x121/0x4b0
> >>    ext4_get_inode_loc+0x40/0xa0
> >>    ext4_reserve_inode_write+0x39/0xc0
> >>    __ext4_mark_inode_dirty+0x5b/0x220
> >>    ext4_evict_inode+0x26d/0x690
> >>    evict+0x112/0x2a0
> >>    __dentry_kill+0x71/0x180
> >>    dput+0xeb/0x1b0
> >>    ovl_stack_put+0x2e/0x50 [overlay]
> >>    ovl_destroy_inode+0x3a/0x60 [overlay]
> >>    destroy_inode+0x3b/0x70
> >>    __dentry_kill+0x71/0x180
> >>    shrink_dentry_list+0x6b/0xe0
> >>    prune_dcache_sb+0x56/0x80
> >>    super_cache_scan+0x12c/0x1e0
> >>    do_shrink_slab+0x13b/0x350
> >>    shrink_slab+0x278/0x3a0
> >>    shrink_node+0x328/0x880
> >>    balance_pgdat+0x36d/0x740
> >>    kswapd+0x1f0/0x380
> >>    kthread+0xd2/0x100
> >>    ret_from_fork+0x34/0x50
> >>    ret_from_fork_asm+0x1a/0x30
> >>    </TASK>
> > 

-- 
Michal Hocko
SUSE Labs