[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFyQdHt0qYQ0uBSK+9wfPOyFfx8mJE4r9FfWoxa3Gyei9g@mail.gmail.com>
Date: Tue, 1 Aug 2017 10:20:31 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "davej@...emonkey.org.uk" <davej@...emonkey.org.uk>,
Trond Myklebust <trondmy@...marydata.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"bfields@...ldses.org" <bfields@...ldses.org>,
"linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>,
"schumaker.anna@...il.com" <schumaker.anna@...il.com>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>
Subject: Re: [GIT PULL] Please pull NFS client changes for Linux 4.13
On Tue, Aug 1, 2017 at 8:51 AM, davej@...emonkey.org.uk
<davej@...emonkey.org.uk> wrote:
> On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
> > Any chance of getting the output from
> >
> > ./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0
>
>
> Hm, that points to this..
>
> 7463 /* Save the EXCHANGE_ID verifier session trunk tests */
> 7464 memcpy(clp->cl_confirm.data, cdata->args.verifier->data,
> 7465 sizeof(clp->cl_confirm.data));
Ok, that certainly made no sense to me, because the KASAN report made
it look like a stale pathname access (allocated in getname, freed in
putname), but I think the issue is more fundamental than that.
That cdata->args.verifier seems to be entirely broken. AT least for
the "xprt == NULL" case, it does the following:
- use the address of a local variable ("&verifier")
- wait for the rpc completion using rpc_wait_for_completion_task().
That's unacceptably buggy crap. rpc_wait_for_completion_task() will
happily exit on a deadly signal even if the rpc hasn't been completed,
so now you'll have a stale pointer to a stack that has been freed.
So I think the 'pathname' part may actually be entirely a red herring,
and it's the underlying access itself that just picks up a random
pointer from a stack that now contains something different. And KASAN
didn't notice the stale stack access itself, because the stack slot is
still valid - it's just no longer the original 'verifier' allocation.
Or *something* like that.
None of this looks even remotely new, though - the code seems to go
back to 2009. Have you just changed what you're testing to trigger
these things?
I'm not even sure why it does that stupid stack allocation. It does a
*real* allocation just a few lines later:
struct nfs41_exchange_id_data *calldata
...
calldata = kzalloc(sizeof(*calldata), GFP_NOFS);
and the whole verifier structure could easily have been part of that
same allocation as far as I can tell.
And that really might seem to be the right thing to do.
TOTALLY UNTESTED PROBABLY COMPLETE CRAP patch attatched.
That patch compiles for me. It *might* even work. Or it might just be
the ramblings of a diseased mind.
Anna? Trond?
So caveat probatorem,
Linus
View attachment "patch.diff" of type "text/plain" (1904 bytes)
Powered by blists - more mailing lists