linux-kernel - Re: [PATCH] gfs2: Fix memory leak in gfs2_trans

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHc6FU5E2UNyhLuwJVdv3xnHEcM_Si2oX06S+5mz_3uz3-QqnA@mail.gmail.com>
Date: Sat, 8 Nov 2025 21:00:36 +0100
From: Andreas Gruenbacher <agruenba@...hat.com>
To: Edward Adam Davis <eadavis@...com>
Cc: syzbot+63ba84f14f62e61a5fd0@...kaller.appspotmail.com, 
	gfs2@...ts.linux.dev, linux-kernel@...r.kernel.org, 
	syzkaller-bugs@...glegroups.com
Subject: Re: [PATCH] gfs2: Fix memory leak in gfs2_trans_begin

Hello,

On Sat, Nov 8, 2025 at 10:13 AM Edward Adam Davis <eadavis@...com> wrote:
> According to log [1], a "bad magic number" was found when checking the
> metatype, which caused gfs2 withdraw.
>
> The root cause of the problem is: log flush treats non-delayed withdraw
> as withdraw, resulting in no one reclaiming the memory of transaction.
> See the call stack below for details.
>
>         CPU1                                    CPU2
>         ====                                    ====
> gfs2_meta_buffer()
> gfs2_metatype_check()
> gfs2_metatype_check_i()
> gfs2_metatype_check_ii()                gfs2_log_flush()
> gfs2_withdraw()                         tr = sdp->sd_log_tr
> signal_our_withdraw()                   sdp->sd_log_tr = NULL
> gfs2_ail_drain()                        goto out_withdraw
> spin_unlock(&sdp->sd_ail_lock)          trans_drain()
>                                         spin_lock(&sdp->sd_ail_lock)
>                                         list_add(&tr->tr_list, &sdp->sd_ail1_list)
>                                         tr = NULL
>                                         goto out_end
>

this bug report is against upstream commit c2c2ccfd4ba7, which
precedes the withdraw rework on gfs2's for-next branch. With those
patches, the race you are describing is no longer possible because
do_withdraw() now uses sdp->sd_log_flush_lock and the SDF_JOURNAL_LIVE
flag to synchronize with gfs2_log_flush().

I don't know why Bob chose to push the transaction onto the ail1 list
instead of freeing it in gfs2_log_flush(); that's something to clean
up. I've pushed an untested patch doing that to for-later.

Related commits:
58e08e8d83ab ("gfs2: fix trans slab error when withdraw occurs inside
log_flush")
f5456b5d67cf ("gfs2: Clean up revokes on normal withdraws")

Thanks,
Andreas

> The original text suggests adding a delayed withdraw check to handle
> transaction cases to avoid similar memory leaks.
>
> syzbot reported:
> [1]
> gfs2: fsid=syz:syz.0: fatal: invalid metadata block - bh = 9381 (bad magic number), function = gfs2_meta_buffer, file = fs/gfs2/meta_io.c, line = 499
>
> [2]
> BUG: memory leak
> unreferenced object 0xffff888126cf1000 (size 144):
>   backtrace (crc f56b339f):
>     gfs2_trans_begin+0x29/0xa0 fs/gfs2/trans.c:115
>     alloc_dinode fs/gfs2/inode.c:418 [inline]
>     gfs2_create_inode+0xca0/0x1890 fs/gfs2/inode.c:807
>
>
> Fixes: f5456b5d67cf ("gfs2: Clean up revokes on normal withdraws")
> Reported-by: syzbot+63ba84f14f62e61a5fd0@...kaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=63ba84f14f62e61a5fd0
> Tested-by: syzbot+63ba84f14f62e61a5fd0@...kaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis@...com>
> ---
>  fs/gfs2/log.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
> index 115c4ac457e9..7bba7951dbdb 100644
> --- a/fs/gfs2/log.c
> +++ b/fs/gfs2/log.c
> @@ -1169,11 +1169,13 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock *gl, u32 flags)
>          * never queued onto any of the ail lists. Here we add it to
>          * ail1 just so that ail_drain() will find and free it.
>          */
> -       spin_lock(&sdp->sd_ail_lock);
> -       if (tr && list_empty(&tr->tr_list))
> -               list_add(&tr->tr_list, &sdp->sd_ail1_list);
> -       spin_unlock(&sdp->sd_ail_lock);
> -       tr = NULL;
> +       if (gfs2_withdrawing(sdp)) {
> +               spin_lock(&sdp->sd_ail_lock);
> +               if (tr && list_empty(&tr->tr_list))
> +                       list_add(&tr->tr_list, &sdp->sd_ail1_list);
> +               spin_unlock(&sdp->sd_ail_lock);
> +               tr = NULL;
> +       }
>         goto out_end;
>  }
>
> --
> 2.43.0
>