[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191230083224.sbk2jspqmup43obs@yavin.dot.cyphar.com>
Date: Mon, 30 Dec 2019 19:32:24 +1100
From: Aleksa Sarai <cyphar@...har.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Al Viro <viro@...iv.linux.org.uk>,
David Howells <dhowells@...hat.com>,
Eric Biederman <ebiederm@...ssion.com>,
stable <stable@...r.kernel.org>,
Christian Brauner <christian.brauner@...ntu.com>,
Serge Hallyn <serge@...lyn.com>, dev@...ncontainers.org,
Linux Containers <containers@...ts.linux-foundation.org>,
Linux API <linux-api@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC 0/1] mount: universally disallow mounting over
symlinks
On 2019-12-29, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Sun, Dec 29, 2019 at 11:30 PM Aleksa Sarai <cyphar@...har.com> wrote:
> >
> > BUG: kernel NULL pointer dereference, address: 0000000000000000
>
> Would you mind building with debug info, and then running the oops through
>
> scripts/decode_stacktrace.sh
>
> which makes those addresses much more legible.
Will do.
> > #PF: supervisor instruction fetch in kernel mode
> > #PF: error_code(0x0010) - not-present page
>
> Somebody jumped through a NULL pointer.
>
> > RAX: 0000000000000000 RBX: ffff906d0cc3bb40 RCX: 0000000000000abc
> > RDX: 0000000000000089 RSI: ffff906d74623cc0 RDI: ffff906d74475df0
> > RBP: ffff906d74475df0 R08: ffffd70b7fb24c20 R09: ffff906d066a5000
> > R10: 0000000000000000 R11: 8080807fffffffff R12: ffff906d74623cc0
> > R13: 0000000000000089 R14: ffffb70b82963dc0 R15: 0000000000000080
> > FS: 00007fbc2a8f0540(0000) GS:ffff906dcf500000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: ffffffffffffffd6 CR3: 00000003c68f8001 CR4: 00000000003606e0
> > Call Trace:
> > __lookup_slow+0x94/0x160
>
> And "__lookup_slow()" has two indirect calls (they aren't obvious with
> retpoline, but look for something like
>
> call __x86_indirect_thunk_rax
>
> which is the modern sad way of doing "call *%rax"). One is for
> revalidatinging an old dentry, but the one I _suspect_ you trigger is
> this one:
>
> old = inode->i_op->lookup(inode, dentry, flags);
>
> but I thought we only could get here if we know it's a directory.
>
> How did we miss the "d_can_lookup()", which is what should check that
> yes, we can call that ->lookup() routine.
I'll try applying a trivial patch to add d_can_lookup() to see if it
fixes the immediate issue.
> This is why I have that suspicion that it's somehow that O_PATH fd
> opened in another process without O_PATH causes confusion...
>
> So what I think has happened is that because of the O_PATH thing,
> we've ended up with an inode that has never been truly opened (because
> O_PATH skips that part), but then with the /proc/<pid>/fd/xyz open, we
> now have a file descriptor that _looks_ like it is valid, and we're
> treating that inode as if it can be used.
I'm not sure I agree -- as I mentioned in my other mail, re-opening
through /proc/self/fd/$n works *very* well and has for a long time (in
fact, both LXC and runc depend on this working).
--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
Download attachment "signature.asc" of type "application/pgp-signature" (229 bytes)
Powered by blists - more mailing lists