Message-ID: <20171123124541.sjdkavie47wfahrs@dhcp22.suse.cz>
Date: Thu, 23 Nov 2017 13:45:41 +0100
From: Michal Hocko <mhocko@...nel.org>
To: Jan Kara <jack@...e.cz>
Cc: Al Viro <viro@...iv.linux.org.uk>,
Dave Chinner <david@...morbit.com>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
LKML <linux-kernel@...r.kernel.org>,
linux-fsdevel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] fs: handle shrinker registration failure in sget_userns
On Thu 23-11-17 13:26:16, Jan Kara wrote:
> On Thu 23-11-17 12:52:47, Michal Hocko wrote:
[...]
> > @@ -489,6 +489,7 @@ struct super_block *sget_userns(struct file_system_type *type,
> > continue;
> > if (user_ns != old->s_user_ns) {
> > spin_unlock(&sb_lock);
> > + unregister_shrinker(&s->s_shrink);
>
> This is wrong as 's' can be NULL at this point.
Ohh, I've seen destroy_unused_super(s) and thought it operates on
non-NULL s. My bad.
> I think the right fix is to
> move unregister_shrinker() into destroy_unused_super(). But for that we
> need a reliable way to detect whether the shrinker has been already
> registered - possibly by initializing sb->shrinker.list in alloc_super()
> and then checking for list_empty() in destroy_unused_super().
Yeah, that makes sense.
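For illustration, the detection scheme described above can be sketched in plain userspace C. This is only a toy model, not kernel code: struct list_head and the helpers below are simplified stand-ins for the ones in include/linux/list.h, and toy_shrinker, toy_register(), toy_destroy_unused() and the fail_alloc flag are hypothetical names made up for the example. One deliberate difference is noted in the comments: the toy delete helper re-initializes the entry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Toy global list of registered shrinkers (starts out empty). */
static struct list_head shrinker_list = { &shrinker_list, &shrinker_list };

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* An initialized head that points to itself is on no list. */
static bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail_entry(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Toy delete: unlink and re-initialize, so the entry reads as empty again. */
static void list_del_entry(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

/* Hypothetical stand-in for the shrinker embedded in a superblock. */
struct toy_shrinker {
	struct list_head list;
};

/*
 * Hypothetical registration: fail_alloc simulates an allocation failure
 * (e.g. from fault injection). On failure the entry is never linked, so
 * its list head stays empty.
 */
static int toy_register(struct toy_shrinker *s, bool fail_alloc)
{
	if (fail_alloc)
		return -1;
	list_add_tail_entry(&s->list, &shrinker_list);
	return 0;
}

/* Cleanup that is safe for NULL and for never-registered shrinkers. */
static void toy_destroy_unused(struct toy_shrinker *s)
{
	if (!s)
		return;
	if (!list_is_empty(&s->list))
		list_del_entry(&s->list);
}
```

Because the list head is initialized at allocation time, the cleanup helper can be called on every error path, whether or not registration ever succeeded.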
> Also I'd note that early shrinker registration breaks assumption of
> destroy_unused_super() that nobody could have seen the superblock -
> shrinkers could have - but since shrinker code doesn't use RCU to access
> the superblock, we are fine. But still comment before
> destroy_unused_super() should be probably updated.
Right. Thanks a lot for the review, Jan!
What about the following?
---
>From cffea62e7f8605c370c8115afae530a1831e75f3 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@...e.com>
Date: Thu, 23 Nov 2017 12:28:35 +0100
Subject: [PATCH] fs: handle shrinker registration failure in sget_userns
Syzbot has reported a NULL pointer dereference during mntput because of
the sb shrinker being NULL:
CPU: 1 PID: 13231 Comm: syz-executor1 Not tainted 4.14.0-rc8+ #82
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
task: ffff8801d1dbe5c0 task.stack: ffff8801c9e38000
RIP: 0010:__list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51
RSP: 0018:ffff8801c9e3f108 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8801c53c6f98 RDI: ffff8801c53c6fa0
RBP: ffff8801c9e3f120 R08: 1ffff100393c7d55 R09: 0000000000000004
R10: ffff8801c9e3ef70 R11: 0000000000000000 R12: 0000000000000000
R13: dffffc0000000000 R14: 1ffff100393c7e45 R15: ffff8801c53c6f98
FS: 0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 00000000dbc23000 CR3: 00000001c7269000 CR4: 00000000001406e0
DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Call Trace:
__list_del_entry include/linux/list.h:117 [inline]
list_del include/linux/list.h:125 [inline]
unregister_shrinker+0x79/0x300 mm/vmscan.c:301
deactivate_locked_super+0x64/0xd0 fs/super.c:308
deactivate_super+0x141/0x1b0 fs/super.c:340
cleanup_mnt+0xb2/0x150 fs/namespace.c:1173
mntput_no_expire+0x6e0/0xa90 fs/namespace.c:1237
mntput fs/namespace.c:1247 [inline]
kern_unmount+0x9c/0xd0 fs/namespace.c:2999
mq_put_mnt+0x37/0x50 ipc/mqueue.c:1609
put_ipc_ns+0x4d/0x150 ipc/namespace.c:163
free_nsproxy+0xc0/0x1f0 kernel/nsproxy.c:180
switch_task_namespaces+0x9d/0xc0 kernel/nsproxy.c:229
exit_task_namespaces+0x17/0x20 kernel/nsproxy.c:234
do_exit+0x9b0/0x1ad0 kernel/exit.c:864
do_group_exit+0x149/0x400 kernel/exit.c:968
Tetsuo has correctly pointed out that the real reason is that fault
injection has caused register_shrinker to fail and that this error path
is not handled in sget_userns.
Fix the issue by moving the shrinker registration up to the point where
the superblock is allocated, and fail early, before we even try to
register the superblock. This should be safe wrt. parallel shrinker
invocation as we are holding the s_umount lock, which blocks shrinker
invocation.
The issue is very unlikely to trigger in production because small
allocations usually do not fail.
Debugged-by: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Signed-off-by: Michal Hocko <mhocko@...e.com>
---
fs/super.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/fs/super.c b/fs/super.c
index d4e33e8f1e6f..a306b5fef1ea 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -155,11 +155,19 @@ static void destroy_super_rcu(struct rcu_head *head)
schedule_work(&s->destroy_work);
}
-/* Free a superblock that has never been seen by anyone */
+/*
+ * Free a superblock that has never been seen by anyone. Note that shrinkers
+ * could have been invoked already but we rely on s_umount to not actually
+ * touch it.
+ */
static void destroy_unused_super(struct super_block *s)
{
if (!s)
return;
+
+ if (!list_empty(&s->s_shrink.list))
+ unregister_shrinker(&s->s_shrink);
+
up_write(&s->s_umount);
list_lru_destroy(&s->s_dentry_lru);
list_lru_destroy(&s->s_inode_lru);
@@ -252,6 +260,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
s->s_shrink.count_objects = super_cache_count;
s->s_shrink.batch = 1024;
s->s_shrink.flags = SHRINKER_NUMA_AWARE | SHRINKER_MEMCG_AWARE;
+ INIT_LIST_HEAD(&s->s_shrink.list);
return s;
fail:
@@ -503,6 +512,11 @@ struct super_block *sget_userns(struct file_system_type *type,
s = alloc_super(type, (flags & ~SB_SUBMOUNT), user_ns);
if (!s)
return ERR_PTR(-ENOMEM);
+ if (register_shrinker(&s->s_shrink)) {
+ spin_unlock(&sb_lock);
+ destroy_unused_super(s);
+ return ERR_PTR(-ENOMEM);
+ }
goto retry;
}
@@ -518,7 +532,6 @@ struct super_block *sget_userns(struct file_system_type *type,
hlist_add_head(&s->s_instances, &type->fs_supers);
spin_unlock(&sb_lock);
get_filesystem(type);
- register_shrinker(&s->s_shrink);
return s;
}
--
2.15.0
--
Michal Hocko
SUSE Labs