[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171123122616.GC6324@quack2.suse.cz>
Date: Thu, 23 Nov 2017 13:26:16 +0100
From: Jan Kara <jack@...e.cz>
To: Michal Hocko <mhocko@...nel.org>
Cc: Al Viro <viro@...iv.linux.org.uk>,
Dave Chinner <david@...morbit.com>, Jan Kara <jack@...e.cz>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
LKML <linux-kernel@...r.kernel.org>,
linux-fsdevel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...e.com>
Subject: Re: [PATCH] fs: handle shrinker registration failure in sget_userns
On Thu 23-11-17 12:52:47, Michal Hocko wrote:
> From: Michal Hocko <mhocko@...e.com>
>
> Syzbot has reported NULL ptr dereference during mntput because of
> sb shrinker being NULL
> CPU: 1 PID: 13231 Comm: syz-executor1 Not tainted 4.14.0-rc8+ #82
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> task: ffff8801d1dbe5c0 task.stack: ffff8801c9e38000
> RIP: 0010:__list_del_entry_valid+0x7e/0x150 lib/list_debug.c:51
> RSP: 0018:ffff8801c9e3f108 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff8801c53c6f98 RDI: ffff8801c53c6fa0
> RBP: ffff8801c9e3f120 R08: 1ffff100393c7d55 R09: 0000000000000004
> R10: ffff8801c9e3ef70 R11: 0000000000000000 R12: 0000000000000000
> R13: dffffc0000000000 R14: 1ffff100393c7e45 R15: ffff8801c53c6f98
> FS: 0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
> CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
> CR2: 00000000dbc23000 CR3: 00000001c7269000 CR4: 00000000001406e0
> DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> Call Trace:
> __list_del_entry include/linux/list.h:117 [inline]
> list_del include/linux/list.h:125 [inline]
> unregister_shrinker+0x79/0x300 mm/vmscan.c:301
> deactivate_locked_super+0x64/0xd0 fs/super.c:308
> deactivate_super+0x141/0x1b0 fs/super.c:340
> cleanup_mnt+0xb2/0x150 fs/namespace.c:1173
> mntput_no_expire+0x6e0/0xa90 fs/namespace.c:1237
> mntput fs/namespace.c:1247 [inline]
> kern_unmount+0x9c/0xd0 fs/namespace.c:2999
> mq_put_mnt+0x37/0x50 ipc/mqueue.c:1609
> put_ipc_ns+0x4d/0x150 ipc/namespace.c:163
> free_nsproxy+0xc0/0x1f0 kernel/nsproxy.c:180
> switch_task_namespaces+0x9d/0xc0 kernel/nsproxy.c:229
> exit_task_namespaces+0x17/0x20 kernel/nsproxy.c:234
> do_exit+0x9b0/0x1ad0 kernel/exit.c:864
> do_group_exit+0x149/0x400 kernel/exit.c:968
>
> Tetsuo has properly pointed out that the real reason is that fault
> injection has caused register_shrinker to fail and the error path is not
> handled in sget_userns.
>
> Fix the issue by moving the shrinker registration up when the superblock
> is allocated and fail early even before we try to register the superblock.
> This should be safe wrt. parallel shrinker registration as we are holding
> s_umount lock which blocks shrinker invocation.
>
> The issue is very unlikely to trigger in the production because small
> allocations do not fail usually.
>
> Debugged-by: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
> Signed-off-by: Michal Hocko <mhocko@...e.com>
> ---
>
> Hi,
> I am not really familiar with the code but something like this should
> work. There are other shrinker users which need a similar treatment
> of course. They can be handled separately. The failure path is rather
> unlikely to lose sleep over.
>
> fs/super.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index d4e33e8f1e6f..cfeb0e12b193 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -489,6 +489,7 @@ struct super_block *sget_userns(struct file_system_type *type,
> continue;
> if (user_ns != old->s_user_ns) {
> spin_unlock(&sb_lock);
> + unregister_shrinker(&s->s_shrink);
This is wrong as 's' can be NULL at this point. I think the right fix is to
move unregister_shrinker() into destroy_unused_super(). But for that we
need a reliable way to detect whether the shrinker has been already
registered - possibly by initializing sb->shrinker.list in alloc_super()
and then checking for list_empty() in destroy_unused_super().
Also I'd note that early shrinker registration breaks assumption of
destroy_unused_super() that nobody could have seen the superblock -
shrinkers could have - but since shrinker code doesn't use RCU to access
the superblock, we are fine. But still comment before
destroy_unused_super() should be probably updated.
Honza
> destroy_unused_super(s);
> return ERR_PTR(-EBUSY);
> }
> @@ -503,12 +504,18 @@ struct super_block *sget_userns(struct file_system_type *type,
> s = alloc_super(type, (flags & ~SB_SUBMOUNT), user_ns);
> if (!s)
> return ERR_PTR(-ENOMEM);
> + if (register_shrinker(&s->s_shrink)) {
> + spin_unlock(&sb_lock);
> + destroy_unused_super(s);
> + return ERR_PTR(-ENOMEM);
> + }
> goto retry;
> }
>
> err = set(s, data);
> if (err) {
> spin_unlock(&sb_lock);
> + unregister_shrinker(&s->s_shrink);
> destroy_unused_super(s);
> return ERR_PTR(err);
> }
> @@ -518,7 +525,6 @@ struct super_block *sget_userns(struct file_system_type *type,
> hlist_add_head(&s->s_instances, &type->fs_supers);
> spin_unlock(&sb_lock);
> get_filesystem(type);
> - register_shrinker(&s->s_shrink);
> return s;
> }
>
> --
> 2.15.0
>
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists