Message-ID: <20240811132411.GB13736@breakpoint.cc>
Date: Sun, 11 Aug 2024 15:24:11 +0200
From: Florian Westphal <fw@...len.de>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: syzbot+8ea26396ff85d23a8929@...kaller.appspotmail.com,
	davem@...emloft.net, dsahern@...nel.org, edumazet@...gle.com,
	kuba@...nel.org, linux-kernel@...r.kernel.org,
	netdev@...r.kernel.org, pabeni@...hat.com,
	syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [net?] WARNING: refcount bug in inet_twsk_kill
Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
> From: syzbot <syzbot+8ea26396ff85d23a8929@...kaller.appspotmail.com>
> Date: Sat, 10 Aug 2024 18:29:20 -0700
> > Hello,
> > 
> > syzbot found the following issue on:
> > 
> > HEAD commit:    33e02dc69afb Merge tag 'sound-6.10-rc1' of git://git.kerne..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=117f3182980000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=25544a2faf4bae65
> > dashboard link: https://syzkaller.appspot.com/bug?extid=8ea26396ff85d23a8929
> > compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > 
> > Unfortunately, I don't have any reproducer for this issue yet.
> > 
> > Downloadable assets:
> > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-33e02dc6.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/573c88ac3233/vmlinux-33e02dc6.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/760a52b9a00a/bzImage-33e02dc6.xz
> > 
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+8ea26396ff85d23a8929@...kaller.appspotmail.com
> > 
> > ------------[ cut here ]------------
> > refcount_t: decrement hit 0; leaking memory.
> > WARNING: CPU: 3 PID: 1396 at lib/refcount.c:31 refcount_warn_saturate+0x1ed/0x210 lib/refcount.c:31
> 
> Eric, this is the weird report I was talking about at netdevconf :)
> 
> It seems refcount_dec(&tw->tw_dr->tw_refcount) is somehow done earlier
> than refcount_inc().
> 
> I started to see the same splat at a very low rate after consuming
> commit b334b924c9b7 ("net: tcp/dccp: prepare for tw_timer un-pinning").

I think I see why.

The reported splat comes from a kernel without the above commit.
But according to the backtrace, we entered here:

                if (net->ipv4.tcp_death_row.hashinfo->pernet) {
                        /* Even if tw_refcount == 1, we must clean up kernel reqsk */
                        inet_twsk_purge(net->ipv4.tcp_death_row.hashinfo);
                } else if (!purged_once) {
-------------->         inet_twsk_purge(&tcp_hashinfo);  // THIS
                        purged_once = true;
> The commit a bit deferred refcount_inc(tw_refcount) after the hash dance,
> so twsk is now visible before tw_dr->tw_refcount is incremented.
> 
> I came up with the diff below but was suspecting a bug in another place,
> possibly QEMU, so I haven't posted the diff officially.
> 
> refcount_inc() was actually deferred, but it's still under an ehash lock,

but on a different struct inet_hashinfo, so the locks don't help :/
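
To make the ordering problem concrete, here is a toy userspace model of it.
This is only a sketch, not kernel code: struct toy_refcount, toy_inc(),
toy_dec() and simulate() are hypothetical names, and the model assumes that
the death-row counter starts at 1 and that a purge can run in the window
between the twsk becoming visible in the hash and the deferred increment
(which is what happens if the lock held by the inserter does not serialize
against the purging side).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for refcount_t; records the two conditions of interest. */
struct toy_refcount {
	int value;
	bool dec_hit_zero;	/* a decrement brought the counter to 0 */
	bool inc_on_zero;	/* an increment was applied to a 0 counter */
};

static void toy_inc(struct toy_refcount *r)
{
	if (r->value == 0)
		r->inc_on_zero = true;
	r->value++;
}

static void toy_dec(struct toy_refcount *r)
{
	assert(r->value > 0);
	r->value--;
	if (r->value == 0)
		r->dec_hit_zero = true;
}

enum inc_order { INC_BEFORE_VISIBLE, INC_AFTER_VISIBLE };

/*
 * Replay one timewait socket lifetime as a fixed sequence of steps.
 * purge_in_window == true models a purge that finds the socket right
 * after it became visible and kills it (one decrement).
 */
static struct toy_refcount simulate(enum inc_order order, bool purge_in_window)
{
	struct toy_refcount dr = { .value = 1 };	/* assumed base reference */
	bool visible = false;
	bool killed = false;

	if (order == INC_BEFORE_VISIBLE)
		toy_inc(&dr);

	visible = true;				/* the "hash dance" */

	if (purge_in_window && visible) {
		toy_dec(&dr);			/* kill path drops the count */
		killed = true;
	}

	if (order == INC_AFTER_VISIBLE)
		toy_inc(&dr);			/* deferred increment */

	if (!killed)
		toy_dec(&dr);			/* ordinary kill later on */

	return dr;
}
```

In this model, simulate(INC_AFTER_VISIBLE, true) is the only combination that
sets dec_hit_zero: the purge decrements the base reference to 0 before the
deferred increment arrives. Incrementing before the socket becomes visible, or
having no purge in the window, leaves both flags clear.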