[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YGs4clcRhyoXX8D0@zeniv-ca.linux.org.uk>
Date: Mon, 5 Apr 2021 16:18:58 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Christian Brauner <christian.brauner@...ntu.com>
Cc: Jens Axboe <axboe@...nel.dk>,
syzbot <syzbot+c88a7030da47945a3cc3@...kaller.appspotmail.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
syzkaller-bugs@...glegroups.com, io-uring@...r.kernel.org
Subject: Re: [syzbot] WARNING in mntput_no_expire (2)
On Mon, Apr 05, 2021 at 01:44:37PM +0200, Christian Brauner wrote:
> On Sun, Apr 04, 2021 at 08:17:21PM +0000, Al Viro wrote:
> > On Sun, Apr 04, 2021 at 06:50:10PM +0000, Al Viro wrote:
> >
> > > > Yeah, I have at least namei.o
> > > >
> > > > https://drive.google.com/file/d/1AvO1St0YltIrA86DXjp1Xg3ojtS9owGh/view?usp=sharing
> > >
> > > *grumble*
> > >
> > > Is it reproducible without KASAN? Would be much easier to follow the produced
> > > asm...
> >
> > Looks like inode_permission(_, NULL, _) from may_lookup(nd). I.e.
> > nd->inode == NULL.
>
> Yeah, I already saw that.
>
> >
> > Mind slapping BUG_ON(!nd->inode) right before may_lookup() call in
> > link_path_walk() and trying to reproduce that oops?
>
> Yep, no problem. If you run the reproducer in a loop for a little while
> you eventually trigger the BUG_ON() and then you get the following splat
> (and then an endless loop) in [1] with nd->inode NULL.
>
> _But_ I managed to debug this further and was able to trigger the BUG_ON()
> directly in path_init() in the AT_FDCWD branch (after all its AT_FDCWD(./file0)
> with the patch in [3] (it's in LOOKUP_RCU) the corresponding splat is in [2].
> So the crash happens for a PF_IO_WORKER thread with a NULL nd->inode for the
> PF_IO_WORKER's pwd (The PF_IO_WORKER seems to be in async context.).
So we find current->fs->pwd.dentry negative, with current->fs->seq sampled
equal before and after that? Lovely... The only places where we assign
anything to ->pwd.dentry are
void set_fs_pwd(struct fs_struct *fs, const struct path *path)
{
struct path old_pwd;
path_get(path);
spin_lock(&fs->lock);
write_seqcount_begin(&fs->seq);
old_pwd = fs->pwd;
fs->pwd = *path;
write_seqcount_end(&fs->seq);
spin_unlock(&fs->lock);
if (old_pwd.dentry)
path_put(&old_pwd);
}
where we have ->seq bumped between dget new/assignment/ dput old,
copy_fs_struct() where we have
spin_lock(&old->lock);
fs->root = old->root;
path_get(&fs->root);
fs->pwd = old->pwd;
path_get(&fs->pwd);
spin_unlock(&old->lock);
fs being freshly allocated instance that couldn't have been observed
by anyone and chroot_fs_refs(), where we have
spin_lock(&fs->lock);
write_seqcount_begin(&fs->seq);
hits += replace_path(&fs->root, old_root, new_root);
hits += replace_path(&fs->pwd, old_root, new_root);
write_seqcount_end(&fs->seq);
while (hits--) {
count++;
path_get(new_root);
}
spin_unlock(&fs->lock);
...
static inline int replace_path(struct path *p, const struct path *old, const struct path *new)
{
if (likely(p->dentry != old->dentry || p->mnt != old->mnt))
return 0;
*p = *new;
return 1;
}
Here we have new_root->dentry pinned from the very beginning,
and assignments are wrapped into bumps of ->seq. Moreover,
we are holding ->lock through that sequence (as all writers
do), so these references can't be dropped before path_get()
bumps new_root->dentry refcount.
chroot_fs_refs() is called only by pivot_root(2):
chroot_fs_refs(&root, &new);
and there new is set by
error = user_path_at(AT_FDCWD, new_root,
LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &new);
if (error)
goto out0;
which pins new.dentry *and* verifies that it's positive and a directory,
at that. Since pinned positive dentry can't be made negative by anybody
else, we know it will remain in that state until
path_put(&new);
well downstream of chroot_fs_refs(). In copy_fs_struct() we are
copying someone's ->pwd, so it's also pinned positive. And it
won't be dropped outside of old->lock, so by the time somebody
manages to drop the reference in old, path_get() effects will be
visible (old->lock serving as a barrier).
That leaves set_fs_pwd() calls:
fs/init.c:54: set_fs_pwd(current->fs, &path);
init_chdir(), path set by LOOKUP_DIRECTORY patwalk. Pinned positive.
fs/namespace.c:4207: set_fs_pwd(current->fs, &root);
init_mount_tree(), root.dentry being ->mnt_root of rootfs. Pinned
positive (and it would've oopsed much earlier had that been it)
fs/namespace.c:4485: set_fs_pwd(fs, &root);
mntns_install(), root filled by successful LOOKUP_DOWN for "/"
from mnt_ns->root. Should be pinned positive.
fs/open.c:501: set_fs_pwd(current->fs, &path);
chdir(2), path set by LOOKUP_DIRECTORY pathwalk. Pinned positive.
fs/open.c:528: set_fs_pwd(current->fs, &f.file->f_path);
fchdir(2), file->f_path of any opened file. Pinned positive.
kernel/usermode_driver.c:130: set_fs_pwd(current->fs, &umd_info->wd);
umd_setup(), ->wd.dentry equal to ->wd.mnt->mnt_root, should be pinned positive.
kernel/nsproxy.c:509: set_fs_pwd(me->fs, &nsset->fs->pwd);
commit_nsset(). Let's see what's going on there...
if ((flags & CLONE_NEWNS) && (flags & ~CLONE_NEWNS)) {
set_fs_root(me->fs, &nsset->fs->root);
set_fs_pwd(me->fs, &nsset->fs->pwd);
}
In those conditions nsset.fs has come from copy_fs_struct() done in
prepare_nsset(). And the only thing that might've been done to it
would be those set_fs_pwd() in mntns_install() (I'm not fond of the
entire nsset->fs thing - looks like papering over bad calling
conventions, but anyway)
Now, I might've missed some insanity (direct assignments to ->pwd.dentry,
etc. - wouldn't be the first time io_uring folks went "layering? wassat?
we'll just poke in whatever we can reach"), but I don't see anything
obvious of that sort in the area...
OK, how about this: in path_init(), right after
do {
seq = read_seqcount_begin(&fs->seq);
nd->path = fs->pwd;
nd->inode = nd->path.dentry->d_inode;
nd->seq = __read_seqcount_begin(&nd->path.dentry->d_seq);
} while (read_seqcount_retry(&fs->seq, seq));
slap
if (!nd->inode) {
// should never, ever happen
struct dentry *fucked = nd->path.dentry;
printk(KERN_ERR "%pd4 %d %x %p %d %d", fucked, d_count(fucked),
fucked->d_flags, fs, fs->users, seq);
BUG_ON(1);
return ERR_PTR(-EINVAL);
}
and see what it catches?
Powered by blists - more mailing lists