linux-kernel - Re: Kernel crash in free_pipe

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CA+55aFzJWXUbS8v8bJADyZdVDw4XH4faCSTAydxfoW+zgmZ_JQ@mail.gmail.com>
Date:   Tue, 31 Oct 2017 12:00:21 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     Cong Wang <xiyou.wangcong@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: Kernel crash in free_pipe_info()

On Mon, Oct 30, 2017 at 9:44 PM, Al Viro <viro@...iv.linux.org.uk> wrote:
> On Mon, Oct 30, 2017 at 07:08:46PM -0700, Linus Torvalds wrote:
>>
>> Well, they're at 8(%rax), except for that last case.
>
> 0x10(%rax)?

Duh, yes.

>> Except the offset is that %r12*0x28+0x10, so we're talking a byte
>> offset of 330 bytes into the allocation, and apparently the eight
>> previous (0-7) iterations were fine.
>>
>> Which is really odd.
>
> I wonder what pipe->buffers is equal to here...

Sadly, we never even bother loading it so it doesn't show up in the
register state, we just iterate over the whole table.

The (one) ppc oops that looks like it might be the same issue has a
totally different pattern. Instead of having that "error number"
looking thing as the invalid pointer base, it has the magic number
from a spinlock. And rather than being about "pipe->bufs[]" array,
it's the pipe pointer itself that seems corrupted, and thus the oops
happens in the account_pipe_buffers() code instead of in the loop over
the buffers.

Of course, both are consistent with that "pipe_inode_info" simply
having been overwritten by something else (possibly, but not
necessarily, due to a use-after-free).

> FWIW, I would try to slap
>         if (buf->ops && (unsigned long)buf->ops <= 0xffffffff)
>                 dump the living hell out of that thing
> and see what it catches...

Actually, I'm looking at *another* error path - the one in named pipes.

Named pipes are why we do that nasty "inode->i_pipe" thing - if it was
for just the regular pipes, we'd be able to just do
file->f_private_data and be done with it. But named pipes have to have
the pipe data associated with a particular inode.

And that code actually does look wrong.

Look at fifo_open(): it increments the pipe->files as it sets
filp->private_data to point to the pipe_inode_info. Good so far.

But look at the error case. It does that put_pipe_info() to release it
again, but filp->private_data still contains the pipe_inode_info
pointer.

So what happens on a failed open of a named pipe? The *normal* code
all is very careful to _not_ use "fput()", but instead use
"put_filp(f)", which will just free the file pointer.

But what if somebody does "vfs_open()" on one of those things, and
then does "fput()" in the failure case?

In "do_last()" we have that FILE_OPENED protection:

        if (unlikely(error) && (*opened & FILE_OPENED))
                fput(file);

and path_openat() is again very careful to then use put_filp(file); if
FILE_OPENED was never set. And do_o_path() does the same.

I'm not seeing anybody who does the wrong thing, but there's a number
of ways to get this entirely wrong, and I worry some path does.

I would be a *lot* happier if we didn't have that very subtle
fput()-vs-put_filp() issue going on.

Again, I cannot see anything wrong, but this feels very very fragile to me.

               Linus