linux-kernel - Re: KASAN: use-after-free Read in path

Message-ID: <20190325230211.GR2217@ZenIV.linux.org.uk>
Date:   Mon, 25 Mar 2019 23:02:11 +0000
From:   Al Viro <viro@...iv.linux.org.uk>
To:     Dave Chinner <david@...morbit.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        syzbot <syzbot+7a8ba368b47fdefca61e@...kaller.appspotmail.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Jan Kara <jack@...e.cz>, Jaegeuk Kim <jaegeuk@...nel.org>,
        Joel Becker <jlbec@...lplan.org>, Mark Fasheh <mark@...heh.com>
Subject: Re: KASAN: use-after-free Read in path_lookupat

On Tue, Mar 26, 2019 at 09:48:23AM +1100, Dave Chinner wrote:

> And when it comes to VFS inode reclaim, XFS does not implement
> ->evict_inode because there is nothing at the VFS level to do.
> And ->destroy_inode ends up doing cleanup work (e.g. freeing on-disk
> inodes) which is non-trivial, blocking work, but then still requires
> the struct xfs_inode to be written back to disk before it can be
> freed. So it just gets marked "reclaimable" and background reclaim
> then takes care of it from there so we avoid synchronous IO in inode
> reclaim...
> 
> This works because we don't track dirty inode metadata in the VFS
> writeback code (it's tracked with much more precision in the XFS log
> infrastructure) and we don't write back inodes from the VFS
> infrastructure, either. It's all done based on internal state
> outside the VFS.
> 
> And, because of this, the VFS cannot assume that it can free
> the struct inode after calling ->destroy_inode or even use
> call_rcu() to run a filesystem destructor because the filesystem
> may need to do work that needs to block and that's not allowed in an
> RCU callback...

In Linus' patch that's what you get with non-NULL ->destroy_inode
+ NULL ->destroy_inode_rcu, so XFS won't be screwed by that.
That said, yes, XFS adds another fun twist there (AFAICS, it's
the only in-tree filesystem that pulls that off).
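
To make the dispatch above concrete, here is a minimal userspace sketch of
the rule as described in this message. It is not the patch itself: struct
super_ops, model_call_rcu(), model_destroy_inode() and the xfs_like_* names
are hypothetical stand-ins, and the "RCU" here runs the callback immediately
instead of waiting for a grace period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct inode;

/* Hypothetical, simplified stand-in for struct super_operations. */
struct super_ops {
    void (*destroy_inode)(struct inode *);     /* may block */
    void (*destroy_inode_rcu)(struct inode *); /* RCU-delayed, must not block */
};

/* Who ends up responsible for freeing the in-core inode. */
enum free_path { FS_OWNS_FREEING, RCU_DELAYED_FREE };

struct inode {
    const struct super_ops *ops;
    bool reclaimable;   /* set by the XFS-like ->destroy_inode below */
    bool rcu_freed;     /* set by the RCU-delayed destructor */
};

/* Stand-in for call_rcu(); a real kernel defers fn past a grace period. */
static void model_call_rcu(struct inode *inode, void (*fn)(struct inode *))
{
    fn(inode);
}

/* Default RCU-side free used when the filesystem supplies none. */
static void model_default_free(struct inode *inode)
{
    inode->rcu_freed = true;
}

static enum free_path model_destroy_inode(struct inode *inode)
{
    const struct super_ops *ops = inode->ops;

    assert(ops != NULL);
    if (ops->destroy_inode) {
        ops->destroy_inode(inode);
        /* Blocking-only filesystem: it keeps ownership, so the VFS
         * neither frees the inode nor queues an RCU callback. */
        if (!ops->destroy_inode_rcu)
            return FS_OWNS_FREEING;
    }
    model_call_rcu(inode, ops->destroy_inode_rcu ? ops->destroy_inode_rcu
                                                 : model_default_free);
    return RCU_DELAYED_FREE;
}

/* XFS-like case: only mark the inode for background reclaim. */
static void xfs_like_destroy(struct inode *inode)
{
    inode->reclaimable = true;
}

static const struct super_ops xfs_like_ops = {
    .destroy_inode = xfs_like_destroy,
    .destroy_inode_rcu = NULL,
};

static const struct super_ops plain_ops = { NULL, NULL };
```

With xfs_like_ops the model returns FS_OWNS_FREEING and never reaches the
RCU path, which is the non-NULL ->destroy_inode + NULL ->destroy_inode_rcu
combination mentioned above.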

I would really like some comments from f2fs and ocfs2 folks, as well
as Jan - he's had much more recent contact with writeback code than
I have...  Could somebody explain what's going on in f2fs and ocfs2
->drop_inode()?  It _should_ be just a predicate; looks like both
are playing very odd games to work around writeback problems and
I wonder if there's a cleaner solution for that.  I can try and dig
through the mailing list archives, but I would really appreciate it
if somebody could give a braindump on the issues dealt with in there...
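
For reference, "just a predicate" could look like the userspace sketch
below. The toy_* names and struct toy_inode are hypothetical; the const
pointer is an illustration choice to make the absence of side effects
explicit (the real method takes a non-const struct inode * and returns
int). The default mirrors the spirit of generic_drop_inode(), which drops
an inode that has no links left or is unhashed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down inode: only the fields the predicate reads. */
struct toy_inode {
    unsigned int nlink;  /* link count */
    bool unhashed;       /* no longer in the inode hash */
    bool (*drop_inode)(const struct toy_inode *); /* optional fs override */
};

/* Default answer: a pure yes/no with no side effects. */
static bool toy_generic_drop(const struct toy_inode *inode)
{
    return inode->nlink == 0 || inode->unhashed;
}

/* On the last reference drop, decide: evict now or keep it cached. */
static bool toy_should_evict(const struct toy_inode *inode)
{
    assert(inode != NULL);
    if (inode->drop_inode)
        return inode->drop_inode(inode);
    return toy_generic_drop(inode);
}
```

Anything beyond answering yes or no (kicking off writeback, truncating
pages, juggling the refcount) falls outside this shape.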