lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wiijJMTw=nTj_ED+YWMCrvHzh3ezfVYRxvzcW6+trgyPA@mail.gmail.com>
Date:   Sun, 24 Mar 2019 18:23:24 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     syzbot <syzbot+7a8ba368b47fdefca61e@...kaller.appspotmail.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>
Cc:     linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Al Viro <viro@...iv.linux.org.uk>
Subject: Re: KASAN: use-after-free Read in path_lookupat

Hmm.

Al, this one seems real and also seems pretty nasty from a vfs
interface standpoint.

On Wed, Nov 28, 2018 at 9:40 AM syzbot
<syzbot+7a8ba368b47fdefca61e@...kaller.appspotmail.com> wrote:
>
> BUG: KASAN: use-after-free in lookup_last fs/namei.c:2269 [inline]
> BUG: KASAN: use-after-free in path_lookupat.isra.43+0x9f8/0xc00  fs/namei.c:2318
> ...
>   lookup_last fs/namei.c:2269 [inline]
>   path_lookupat.isra.43+0x9f8/0xc00 fs/namei.c:2318

Ok, path lookup using RCU.

> Allocated by task 9424:
>   kstrdup+0x39/0x70 mm/util.c:49
>   bpf_symlink+0x26/0x140 kernel/bpf/inode.c:356

It's the symlink data for the bpf filesystem. Fair enough.

> Freed by task 9425:
>   kfree+0xcf/0x230 mm/slab.c:3817
>   bpf_evict_inode+0x11f/0x150 kernel/bpf/inode.c:565
>   evict+0x4b9/0x980 fs/inode.c:558
>   iput_final fs/inode.c:1550 [inline]
>   iput+0x674/0xa90 fs/inode.c:1576
>   do_unlinkat+0x733/0xa30 fs/namei.c:4069

.. and the path lookup is racing with the final unlink.

We actually RCU-delay the inode freeing itself, but when we do the
final iput(), the "evict()" function is called synchronously.

Now, the simple fix would seem to just RCU-delay the kfree() of the
symlink data in bpf_evict_inode(). Maybe that's the right thing to do.
I'd call it trivial, except you'd need to have that rcu head to attach
it to, making it slightly less trivial. I guess the kmalloc could just
malloc that too.

But it does strike me that this might be a generic issue. I wonder how
many other filesystems do this?

Alexei, Daniel - the "proper" fix right is probably one of four cases:

 (a) rcu-delay the freeing, and use simple_symlink_inode_operations().
No set_delayed_call() needed, because it stays around for the inode
lifetime, but the RCU-delaying is still needed for lookup.

or

 (b) put the symlink in a page, and use page_symlink_inode_operations
for that inode, where we have a "struct delayed_call" and get a page
ref

or

 (c) put the symlink in the inode itself, and then use that
simple_symlink_inode_operations (and now the RCU-delaying of the inode
protects it).

of

 (d) make a special getlink() that does "copy symlink,
set_delayed_call(done, kfree_link, copy)"

but I do wonder if we should perhaps just make
simple_symlink_inode_operations do that (d) case for people.

Al, comments? At the very least, if we don't make
simple_symlink_inode_operations() do that, we should have a *big*
comment that if it's not part of the inode data, it needs to be
RCU-delayed.

Or maybe we could add a final inode callback function for "rcu_free"
that is called as the RCU-delayed freeing of the inode itself happens?
And then people could hook into that for freeing the inode->i_link
data.

So many choices.. But the current situation seems unnecessarily
complex for the filesystem, and isn't really documented.

Our documentation currently says for get_link(): "If the body won't go
away until the inode is gone, nothing else is needed", which is wrong
(or at least very misleading, since the last "inode is gone" callback
we have is that evict() function).

               Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ