Message-ID: <20240811132411.GB13736@breakpoint.cc>
Date: Sun, 11 Aug 2024 15:24:11 +0200
From: Florian Westphal <fw@...len.de>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: syzbot+8ea26396ff85d23a8929@...kaller.appspotmail.com,
davem@...emloft.net, dsahern@...nel.org, edumazet@...gle.com,
kuba@...nel.org, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, pabeni@...hat.com,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [net?] WARNING: refcount bug in inet_twsk_kill
Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
> From: syzbot <syzbot+8ea26396ff85d23a8929@...kaller.appspotmail.com>
> Date: Sat, 10 Aug 2024 18:29:20 -0700
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 33e02dc69afb Merge tag 'sound-6.10-rc1' of git://git.kerne..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=117f3182980000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=25544a2faf4bae65
> > dashboard link: https://syzkaller.appspot.com/bug?extid=8ea26396ff85d23a8929
> > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > Downloadable assets:
> > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-33e02dc6.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/573c88ac3233/vmlinux-33e02dc6.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/760a52b9a00a/bzImage-33e02dc6.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+8ea26396ff85d23a8929@...kaller.appspotmail.com
> >
> > ------------[ cut here ]------------
> > refcount_t: decrement hit 0; leaking memory.
> > WARNING: CPU: 3 PID: 1396 at lib/refcount.c:31 refcount_warn_saturate+0x1ed/0x210 lib/refcount.c:31
>
> Eric, this is the weird report I was talking about at netdevconf :)
>
> It seems refcount_dec(&tw->tw_dr->tw_refcount) is somehow done earlier
> than refcount_inc().
>
> I started to see the same splat at a very low rate after consuming
> commit b334b924c9b7 ("net: tcp/dccp: prepare for tw_timer un-pinning").

I think I see why.

The reported splat is from a kernel without the above commit.
But from the backtrace, we entered here:

        if (net->ipv4.tcp_death_row.hashinfo->pernet) {
                /* Even if tw_refcount == 1, we must clean up kernel reqsk */
                inet_twsk_purge(net->ipv4.tcp_death_row.hashinfo);
        } else if (!purged_once) {
--------------> inet_twsk_purge(&tcp_hashinfo); // THIS
                purged_once = true;
        }

> The commit a bit deferred refcount_inc(tw_refcount) after the hash dance,
> so twsk is now visible before tw_dr->tw_refcount is incremented.
>
> I came up with the diff below but was suspecting a bug in another place,
> possibly QEMU, so I haven't posted the diff officially.
>
> refcount_inc() was actually deferred, but it's still under an ehash lock,

but of a different struct inet_hashinfo, so the locks don't help :/
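
To make the ordering concrete, here is a toy userspace sketch, not kernel code. Every toy_* name is made up for illustration, and the counter only assumes what the splat itself tells us: the death row refcount starts at 1 and a plain decrement must never reach 0. It replays three steps (hash the twsk, bump tw_refcount, purge) in a given order and reports whether a decrement hit 0. It does not model locking or the two hashinfo instances at all; it only shows why a purge that sees the twsk before the increment gives this kind of warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for refcount_t; it only records whether a
 * decrement reached 0, which is what the splat reports. */
struct toy_refcount {
	int val;
	bool hit_zero;
};

/* Hypothetical stand-in for a timewait socket and its death row counter. */
struct toy_twsk {
	struct toy_refcount *dr_refcount;
	bool hashed;
};

enum toy_step { TOY_HASH, TOY_INC, TOY_PURGE };

static void toy_refcount_dec(struct toy_refcount *r)
{
	r->val--;
	if (r->val == 0)
		r->hit_zero = true;	/* models "decrement hit 0" */
}

/* Replays the steps in order; returns true if a decrement hit 0. */
bool toy_replay(const enum toy_step *steps, size_t n)
{
	/* Assumption: the death row counter holds one base reference. */
	struct toy_refcount dr = { .val = 1, .hit_zero = false };
	struct toy_twsk tw = { .dr_refcount = &dr, .hashed = false };

	assert(steps != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		switch (steps[i]) {
		case TOY_HASH:
			/* twsk becomes visible to anyone walking the hash */
			tw.hashed = true;
			break;
		case TOY_INC:
			tw.dr_refcount->val++;
			break;
		case TOY_PURGE:
			/* the purge only drops a reference for a twsk it can see */
			if (tw.hashed) {
				tw.hashed = false;
				toy_refcount_dec(tw.dr_refcount);
			}
			break;
		}
	}
	return dr.hit_zero;
}
```

With the order TOY_HASH, TOY_INC, TOY_PURGE the counter goes 1, 2, 1 and toy_replay() returns false. With TOY_HASH, TOY_PURGE, TOY_INC the purge drops the base reference first, the counter reaches 0, and toy_replay() returns true.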