Message-ID: <20241104181413.GG35848@ziepe.ca>
Date: Mon, 4 Nov 2024 14:14:13 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: Aleksandr Nogikh <nogikh@...gle.com>
Cc: syzbot <syzbot+6dee15fdb0606ef7b6ba@...kaller.appspotmail.com>,
	leon@...nel.org, linux-kernel@...r.kernel.org,
	linux-rdma@...r.kernel.org, netdev@...r.kernel.org,
	syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [rdma?] INFO: task hung in add_one_compat_dev (3)

On Fri, Nov 01, 2024 at 04:55:32PM +0100, Aleksandr Nogikh wrote:
> Hi Jason,
> 
> On Tue, Oct 22, 2024 at 4:29 PM Jason Gunthorpe <jgg@...pe.ca> wrote:
> >
> > On Tue, Oct 22, 2024 at 12:39:27AM -0700, syzbot wrote:
> >
> > > 1 lock held by syz-executor/27959:
> > >  #0: ffffffff8fcbffc8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
> > >  #0: ffffffff8fcbffc8 (rtnl_mutex){+.+.}-{3:3}, at: __rtnl_newlink net/core/rtnetlink.c:3749 [inline]
> > >  #0: ffffffff8fcbffc8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_newlink+0xab7/0x20a0 net/core/rtnetlink.c:3772
> >
> > There is really something wrong with the new syzkaller reporting, can
> > someone fix it?
> >
> > The kernel log that shows the programs:
> >
> > https://syzkaller.appspot.com/x/log.txt?x=10d72727980000
> >
> > Doesn't have the word "newlink"/"new"/"link" etc, and yet there is an
> > executor clearly sitting in a newlink netlink callback when we
> > crashed.
> 
> These are likely coming from the network devices initialization code.
> When syzbot spins up a new syz-executor, it creates a lot of
> networking devices as one of the first steps.
> https://github.com/google/syzkaller/blob/f00eed24f2a1332b07fef1a353a439133978d97b/executor/common_linux.h#L1482

When does that syz-executor setup actually run? Only near the start of
the VM lifetime?

Or each time it does:

last executing test programs:

3m14.839622334s ago: executing program 3 (id=3291):
r0 = socket$nl_netfilter(0x10, 0x3, 0xc)
sendmsg$NFT_MSG_GETRULE(r0, &(0x7f0000000240)={0x0, 0x0, &(0x7f00000001c0)={&(0x7f0000000380)={0x20, 0x19, 0xa, 0x3, 0x0, 0x0, {}, [@NFTA_RULE_TABLE={0x9, 0x1, 'syz0\x00'}]}, 0x20}}, 0x0)

?

> So those syz-executors might have just been unable to start and then
> they were abandoned (?)

It seems unlikely. The crash happened like this:

[  709.737594][   T30] INFO: task syz-executor:27961 blocked for more than 143 seconds.

So whatever killed it happened at approx 566 seconds into the test,
not when it booted.
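
For anyone who wants to redo that subtraction on other reports, here is a
rough Python sketch. It assumes the hung-task line always has the shape
shown above (printk timestamp, then "[ Tnn]", then the INFO text); the
regex and the helper name blocked_since are made up for illustration and
are not part of syzkaller or the kernel.

```python
import re

# Assumed shape of the hung-task report line. The first bracketed number
# is the printk timestamp in seconds since boot.
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\[\s*T\d+\]\s+INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def blocked_since(line):
    """Return (comm, pid, latest_block_time) or None if the line does not match.

    latest_block_time is the report timestamp minus the reported duration,
    so it is an upper bound: the task blocked at or before that time.
    """
    m = HUNG_RE.search(line)
    if m is None:
        return None
    ts = float(m.group("ts"))
    secs = int(m.group("secs"))
    return m.group("comm"), int(m.group("pid")), ts - secs


line = ("[  709.737594][   T30] INFO: task syz-executor:27961 "
        "blocked for more than 143 seconds.")
comm, pid, since = blocked_since(line)
print(f"{comm}:{pid} blocked no later than {since:.1f}s after boot")
```

Since the report says "more than 143 seconds", the result is only a bound
on when the task got stuck, not the exact moment.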

Since the start of the "last executing test programs:" is only 3min
back, and the above is 9 min back, it probably helps explain why there
is no record.

> > We need to see the syzkaller programs that are triggering these issues
> > to get ideas, and for some reason they are missing now.
> 
> Once syzbot manages to find a reproducer, hopefully things will become
> more clear.

It never seems to find one for these kinds of things... The dashboard
says it happens almost daily.

Jason
