[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161019171748.GO24393@dhcp22.suse.cz>
Date: Wed, 19 Oct 2016 19:17:49 +0200
From: Michal Hocko <mhocko@...nel.org>
To: Leon Yu <chianglungyu@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>,
Kees Cook <keescook@...omium.org>,
Oleg Nesterov <oleg@...hat.com>,
John Stultz <john.stultz@...aro.org>,
Mateusz Guzik <mguzik@...hat.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Janis Danisevskis <jdanis@...gle.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] proc: fix NULL dereference when reading /proc/<pid>/auxv
So here is my RFC as an alternative. Thoughts? Please note that we
currently have only very few users of use_mm() API in the kernel
so a risk of a regression is not really high. usb/gadget are using it
only temporarily. The remaining is vhost which operates on a remote mm
and I have no idea whether somebody might abuse /proc/vhost/mem or
anything - let's add Michael to the CC list. I am pretty sure nobody
abuse oom_reaper proc directory as this one is pretty new and such a
usage would be pretty much undefined as the reaper unmaps the address
space.
---
>From 6a1a9fca2871ada365b465382a3f89a1506c312d Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@...e.com>
Date: Wed, 19 Oct 2016 19:02:25 +0200
Subject: [PATCH] proc: do not allow to open file requiring mm for kernel
threads
Leon Yu has reported the following NULL ptr oops
$ cat /proc/2/auxv
[ 8.964445] Unable to handle kernel NULL pointer dereference at virtual address 000000a8
[ 8.972555] pgd = e99e0000
[ 8.975282] [000000a8] *pgd=399e6835, *pte=00000000, *ppte=00000000
[ 8.981572] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[ 8.986967] Modules linked in:
[ 8.990029] CPU: 3 PID: 113 Comm: cat Not tainted 4.9.0-rc1-ARCH+ #1
[ 8.996379] Hardware name: BCM2709
[ 8.999778] task: ea3b0b00 task.stack: e99b2000
[ 9.004314] PC is at auxv_read+0x24/0x4c
[ 9.008241] LR is at do_readv_writev+0x2fc/0x37c
[...]
[ 9.261895] [<b0135f80>] (auxv_read) from [<b00e5900>] (do_readv_writev+0x2fc/0x37c)
[ 9.269651] [<b00e5900>] (do_readv_writev) from [<b00e59c0>] (vfs_readv+0x40/0x58)
[ 9.277234] [<b00e59c0>] (vfs_readv) from [<b010eed4>] (default_file_splice_read+0x140/0x240)
[ 9.285769] [<b010eed4>] (default_file_splice_read) from [<b010eb4c>] (splice_direct_to_actor+0xa0/0x230)
[ 9.295345] [<b010eb4c>] (splice_direct_to_actor) from [<b010ed6c>] (do_splice_direct+0x90/0xb8)
[ 9.304140] [<b010ed6c>] (do_splice_direct) from [<b00e5e38>] (do_sendfile+0x1a0/0x308)
[ 9.319823] [<b00e67d4>] (SyS_sendfile64) from [<b000f300>] (ret_fast_syscall+0x0/0x34)
[ 9.327829] Code: e1a01002 e1a02003 e1a03004 e2833008 (e593e0a0)
[ 9.333973] ---[ end trace d3f50081f24b99ce ]---
This has been introduced by c5317167854e ("proc: switch auxv to use
of __mem_open()") but it shows a deeper problem we have had for a
longer time. __mem_open resp. proc_mem_open allows to open a file which
requires the address space even when there is none. This means that all
kernel code paths which use __mem_open have to check for mm!=NULL. This
is error prone as the above shows and also doesn't make much sense in
general. A task doesn't have an mm even when it is already exiting or
when it is a kernel thread. The later will not have an mm unless it
hijacks it from another task when the output might be really misleading
without a deeper knowledge of the particular kernel thread.
Chane proc_mem_open to disallow opening a file if the mm is NULL and
return ESRCH. This is an user visible change theoretically but every
user other than /proc/self/$file has to cope with ESRCH already and
relying on kthread giving any output is just an abuse of the interface.
This will make users f proc_mem_open less error prone as well.
Fixes: c5317167854e ("proc: switch auxv to use of __mem_open()")
Reported-by: Leon Yu <chianglungyu@...il.com>
Signed-off-by: Michal Hocko <mhocko@...e.com>
---
fs/proc/base.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 5ef2ae81f623..d34d33dbf1b2 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -804,10 +804,10 @@ static const struct file_operations proc_single_file_operations = {
struct mm_struct *proc_mem_open(struct inode *inode, unsigned int mode)
{
struct task_struct *task = get_proc_task(inode);
- struct mm_struct *mm = ERR_PTR(-ESRCH);
+ struct mm_struct *ret = ERR_PTR(-ESRCH);
if (task) {
- mm = mm_access(task, mode | PTRACE_MODE_FSCREDS);
+ struct mm_struct *mm = mm_access(task, mode | PTRACE_MODE_FSCREDS);
put_task_struct(task);
if (!IS_ERR_OR_NULL(mm)) {
@@ -815,10 +815,11 @@ struct mm_struct *proc_mem_open(struct inode *inode, unsigned int mode)
atomic_inc(&mm->mm_count);
/* but do not pin its memory */
mmput(mm);
+ ret = mm;
}
}
- return mm;
+ return ret;
}
static int __mem_open(struct inode *inode, struct file *file, unsigned int mode)
--
2.9.3
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists