Message-ID: <20190328090045.GA22915@quack2.suse.cz>
Date: Thu, 28 Mar 2019 10:00:45 +0100
From: Jan Kara <jack@...e.cz>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Jan Kara <jack@...e.cz>, Mark Fasheh <mark@...heh.com>,
Dave Chinner <david@...morbit.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
syzbot <syzbot+7a8ba368b47fdefca61e@...kaller.appspotmail.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
Jaegeuk Kim <jaegeuk@...nel.org>,
Joel Becker <jlbec@...lplan.org>
Subject: Re: KASAN: use-after-free Read in path_lookupat
On Wed 27-03-19 18:59:48, Al Viro wrote:
> On Wed, Mar 27, 2019 at 05:58:31PM +0100, Jan Kara wrote:
> > On Tue 26-03-19 04:15:10, Al Viro wrote:
> > > On Mon, Mar 25, 2019 at 08:18:25PM -0700, Mark Fasheh wrote:
> > >
> > > > Hey Al,
> > > >
> > > > It's been a while since I've looked at that bit of code but it looks like
> > > > Ocfs2 is syncing the inode to disk and disposing of its memory
> > > > representation (which would include the cluster locks held) so that other
> > > > nodes get a chance to delete the potentially orphaned inode. In Ocfs2 we
> > > > won't delete an inode if it exists in another node's cache.
> > >
> > > Wait a sec - what's the reason for forcing that write_inode_now(); why
> > > doesn't the normal mechanism work? I'm afraid I still don't get it -
> > > we do wait for writeback in evict_inode(), or the local filesystems
> > > wouldn't work.
> >
> > I'm just guessing here but they don't want an inode cached once its last
> > dentry goes away (it makes cluster wide synchronization easier for them and
> > they do play tricks with cluster lock on dentries).
>
> Sure, but that's as simple as "return 1 from ->drop_inode()".

Right.

> > There is some info in
> > 513e2dae9422 "ocfs2: flush inode data to disk and free inode when i_count
> > becomes zero" which adds this ocfs2_drop_inode() implementation. So when
> > the last inode reference is dropped, they want to flush any dirty data to
> > disk and evict the inode. But AFAICT they should be fine with flushing the
> > inode from their ->evict_inode method. I_FREEING just stops the flusher
> > thread from touching the inode but explicit writeback through
> > write_inode_now(inode, 1) should go through just fine.
>
> Umm... Why is that write_inode_now() needed in either place? I agree that
> moving it to ->evict_inode() ought to be safe, but what makes it necessary
> in the first place? Put it another way, what dirties the data and/or
> metadata without marking it dirty?

Well, the inode & pages are marked dirty and they are dirty when we get to
iput_final(). But if ->drop_inode() returns 1 (which normally happens only
for unlinked files), we will not write out the inode in iput_final() and
the dirty data just gets discarded in ->evict_inode(). OCFS2 doesn't want
this so they have to write it out by hand.
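
To make the above concrete, here is a rough userspace toy model of that
decision. It is only a sketch, not the kernel code: every toy_* name is
made up for illustration, the superblock is assumed to be active (mounted),
and writeback and eviction are reduced to flipping flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct inode; all fields are illustrative only. */
struct toy_inode {
	bool dirty;     /* dirty data/metadata not yet written */
	bool cached;    /* kept in the inode cache after the last iput */
	bool written;   /* dirty state made it to "disk" */
	bool discarded; /* dirty state was thrown away at eviction */
};

/* Toy stand-in for the two super_operations discussed in the thread. */
struct toy_super_ops {
	int (*drop_inode)(struct toy_inode *inode);
	void (*evict_inode)(struct toy_inode *inode);
};

/* Models an explicit synchronous writeout of the inode. */
static void toy_write_inode_now(struct toy_inode *inode)
{
	if (inode->dirty) {
		inode->written = true;
		inode->dirty = false;
	}
}

/* Models an eviction that just drops whatever is still dirty. */
static void toy_generic_evict(struct toy_inode *inode)
{
	if (inode->dirty) {
		inode->discarded = true;
		inode->dirty = false;
	}
}

/* Models an eviction that writes the inode out by hand first. */
static void toy_flushing_evict(struct toy_inode *inode)
{
	toy_write_inode_now(inode);
	toy_generic_evict(inode);
}

/* ->drop_inode() variants: 0 keeps the inode cached, 1 evicts it. */
static int toy_keep_inode(struct toy_inode *inode)
{
	(void)inode;
	return 0;
}

static int toy_always_drop(struct toy_inode *inode)
{
	(void)inode;
	return 1;
}

/* Models the choice made when the last reference goes away. */
static void toy_iput_final(struct toy_inode *inode,
			   const struct toy_super_ops *ops)
{
	assert(inode != NULL && ops != NULL);
	assert(ops->drop_inode != NULL && ops->evict_inode != NULL);

	if (!ops->drop_inode(inode)) {
		/* Stays cached and dirty; writeback happens later. */
		inode->cached = true;
		return;
	}

	/* Dropped: no writeout here, eviction decides what happens. */
	inode->cached = false;
	ops->evict_inode(inode);
}
```

With toy_always_drop() and toy_generic_evict() a dirty inode ends up
discarded. With toy_always_drop() and toy_flushing_evict() it ends up
written, which corresponds to the "write out from ->evict_inode()"
variant discussed above.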

Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR