Message-ID: <20190703154543.GA21629@sol.localdomain>
Date: Wed, 3 Jul 2019 08:45:43 -0700
From: Eric Biggers <ebiggers@...nel.org>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Hillf Danton <hdanton@...a.com>,
syzbot <syzbot+d88a977731a9888db7ba@...kaller.appspotmail.com>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"syzkaller-bugs@...glegroups.com" <syzkaller-bugs@...glegroups.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
bpf@...r.kernel.org, Boris Pismenny <borisp@...lanox.com>,
Aviad Yehezkel <aviadye@...lanox.com>,
Dave Watson <davejwatson@...com>,
John Fastabend <john.fastabend@...il.com>
Subject: Re: kernel panic: corrupted stack end in dput

[+bpf and tls maintainers]

On Wed, Jul 03, 2019 at 04:23:34PM +0100, Al Viro wrote:
> On Wed, Jul 03, 2019 at 03:40:00PM +0100, Al Viro wrote:
> > On Wed, Jul 03, 2019 at 02:43:07PM +0800, Hillf Danton wrote:
> >
> > > > This is very much *NOT* fine.
> > > > 1) trylock can fail from any number of reasons, starting
> > > > with "somebody is going through the hash chain doing a lookup on
> > > > something completely unrelated"
> > >
> > > They are also a red light that we need to bail out of spiraling up
> > > the directory hierarchy imho.
> >
> > Translation: "let's leak the reference to parent, shall we?"
> >
> > > > 2) whoever had been holding the lock and whatever they'd
> > > > been doing might be over right after we get the return value from
> > > > spin_trylock().
> > >
> > > Or after we send a mail using git. I don't know.
> > >
> > > > 3) even had that been really somebody adding children in
> > > > the same parent *AND* even if they really kept doing that, rather
> > > > than unlocking and buggering off, would you care to explain why
> > > > dentry_unlist() called by __dentry_kill() and removing the victim
> > > > from the list of children would be safe to do in parallel with that?
> > > >
> > > My bad. I have to walk around that unsafety.
> >
> > WHAT unsafety? Can you explain what you are seeing and how to
> > reproduce it, whatever it is?
>
> BTW, what makes you think that it's something inside dput() itself?
> All I see is that at some point in the beginning of the loop body
> in dput() we observe a buggered stack.
>
> Is that the first iteration through the loop? IOW, is that just
> the place where we first notice preexisting corruption, or is
> that something the code called from that loop does? If it's
> a stack overflow, I would be very surprised to see it here -
> dput() is iterative and it's called on a very shallow stack in
> those traces.
>
> What happens if you e.g. turn that
> 	dput(dentry);
> in __fput() into
> 	rcu_read_lock(); rcu_read_unlock(); // trigger the check
> 	dput(dentry);
>
> and run your reproducer?
>

Please don't waste your time on this; it looks like just another report from the
massive memory corruption in BPF and/or TLS. Look at the reproducer:

bpf$MAP_CREATE(0x0, &(0x7f0000000280)={0xf, 0x4, 0x4, 0x400, 0x0, 0x1}, 0x3c)
socket$rxrpc(0x21, 0x2, 0x800000000a)
r0 = socket$inet6_tcp(0xa, 0x1, 0x0)
setsockopt$inet6_tcp_int(r0, 0x6, 0x13, &(0x7f00000000c0)=0x100000001, 0x1d4)
connect$inet6(r0, &(0x7f0000000140), 0x1c)
bpf$MAP_CREATE(0x0, &(0x7f0000000000)={0x5}, 0xfffffffffffffdcb)
bpf$MAP_CREATE(0x2, &(0x7f0000003000)={0x3, 0x0, 0x77fffb, 0x0, 0x10020000000, 0x0}, 0x2c)
setsockopt$inet6_tcp_TCP_ULP(r0, 0x6, 0x1f, &(0x7f0000000040)='tls\x00', 0x4)

It's the same as like 20 other syzbot reports.

- Eric
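
Editorial note: the reproducer above is written in syzkaller notation, with raw numbers in place of symbolic constants. The sketch below is one way to decode the constants that matter for the BPF/TLS reading. The numeric values are copied by hand from the Linux UAPI headers (linux/bpf.h and linux/tcp.h) as an assumption; they are not stated anywhere in this thread. The `REPRO_` names and both helper functions are hypothetical, made up for this illustration.

```c
#include <stddef.h>

/*
 * Hand-copied constants (assumed to match linux/bpf.h and linux/tcp.h).
 * The REPRO_ prefix is hypothetical and only avoids clashing with the
 * real kernel headers.
 */
enum {
    REPRO_BPF_MAP_TYPE_PROG_ARRAY  = 3,
    REPRO_BPF_MAP_TYPE_PERCPU_HASH = 5,
    REPRO_BPF_MAP_TYPE_SOCKMAP     = 15,
};

enum {
    REPRO_TCP_REPAIR = 19,
    REPRO_TCP_ULP    = 31,
};

/* Map a raw BPF map type from the reproducer to a readable name. */
const char *repro_bpf_map_type_name(unsigned int type)
{
    switch (type) {
    case REPRO_BPF_MAP_TYPE_PROG_ARRAY:  return "BPF_MAP_TYPE_PROG_ARRAY";
    case REPRO_BPF_MAP_TYPE_PERCPU_HASH: return "BPF_MAP_TYPE_PERCPU_HASH";
    case REPRO_BPF_MAP_TYPE_SOCKMAP:     return "BPF_MAP_TYPE_SOCKMAP";
    default:                             return "unknown";
    }
}

/* Map a raw IPPROTO_TCP-level socket option number to a readable name. */
const char *repro_tcp_opt_name(int opt)
{
    switch (opt) {
    case REPRO_TCP_REPAIR: return "TCP_REPAIR";
    case REPRO_TCP_ULP:    return "TCP_ULP";
    default:               return "unknown";
    }
}
```

Under these assumptions, `repro_bpf_map_type_name(0xf)` decodes the first map as a sockmap, and `repro_tcp_opt_name(0x13)` and `repro_tcp_opt_name(0x1f)` decode the two setsockopt() calls as TCP_REPAIR followed by attaching the 'tls' ULP. That matches the BPF and/or TLS attribution above.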