Date:	Thu, 1 Oct 2015 11:37:57 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Kees Cook <keescook@...omium.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Andy Lutomirski <luto@...capital.net>,
	Dmitry Vyukov <dvyukov@...gle.com>,
	Andrey Ryabinin <ryabinin.a.a@...il.com>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andy Lutomirski <luto@...nel.org>,
	Borislav Petkov <bp@...en8.de>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	"x86@...nel.org" <x86@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Kostya Serebryany <kcc@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Andrey Konovalov <andreyknvl@...gle.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	Andi Kleen <ak@...ux.intel.com>,
	kasan-dev <kasan-dev@...glegroups.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Al Viro <viro@...iv.linux.org.uk>
Subject: [PATCH v4] fs/proc, core/debug: Don't expose absolute kernel
 addresses via wchan


* Ingo Molnar <mingo@...nel.org> wrote:

> 
> * Kees Cook <keescook@...omium.org> wrote:
> 
> > > @@ -507,7 +505,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
> > >         seq_put_decimal_ull(m, ' ', task->blocked.sig[0] & 0x7fffffffUL);
> > >         seq_put_decimal_ull(m, ' ', sigign.sig[0] & 0x7fffffffUL);
> > >         seq_put_decimal_ull(m, ' ', sigcatch.sig[0] & 0x7fffffffUL);
> > > -       seq_put_decimal_ull(m, ' ', wchan);
> > > +       seq_puts(m, " 0"); /* Used to be numeric wchan - replaced by /proc/PID/wchan */
> > 
> > Probably should also update Documentation/filesystems/proc.txt with
> > something like:
> > 
> > --- a/Documentation/filesystems/proc.txt
> > +++ b/Documentation/filesystems/proc.txt
> > @@ -310,7 +310,7 @@ Table 1-4: Contents of the stat files (as of 2.6.30-rc7)
> >    blocked       bitmap of blocked signals
> >    sigign        bitmap of ignored signals
> >    sigcatch      bitmap of caught signals
> > -  wchan         address where process went to sleep
> > +  0             (place holder, was wchan, see /proc/PID/wchan instead)
> >    0             (place holder)
> >    0             (place holder)
> >    exit_signal   signal to send to parent thread on exit
> 
> Indeed - I ended up clarifying both wchan explanations, see the changes below.
> 
> I also made the 'no symbols' output "0" (instead of an empty string), to better 
> match the /proc/PID/stat behavior and previous output.
> 
> I'll push it out after a bit more testing and if nothing goes wrong I'll send this 
> patch to Linus in the v4.4 merge window.

Yeah, so testing uncovered the following additional ABI detail: procps relies on 
the wchan field in /proc/PID/stat, but (in most cases) only as a flag that tells 
it whether to look at /proc/PID/wchan.

To keep the ABI, the v4 patch below outputs not the absolute address, but a 0/1 
flag to indicate whether the task is blocked and whether there's anything worth 
looking at in /proc/PID/wchan.
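
As an illustration of that flag usage, here is a minimal user-space sketch (not
taken from procps; the helper name stat_wchan_flag() is hypothetical) that pulls
field 35 out of one line of /proc/PID/stat and only tests it for non-zero. That
way the same code gives the same answer on kernels that still print the absolute
address and on kernels that print the 0/1 flag:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: given one line of /proc/PID/stat, report whether
 * the wchan field (field 35) is non-zero.
 *
 * Returns 1 if non-zero (task blocked, /proc/PID/wchan is worth reading),
 * 0 if zero, and -1 if the line is too short or malformed.
 */
static int stat_wchan_flag(const char *stat_line)
{
	const char *p;
	char *end;
	unsigned long long val;
	int field;

	if (!stat_line)
		return -1;

	/* comm (field 2) may contain spaces and ')': resume after the last ')' */
	p = strrchr(stat_line, ')');
	if (!p)
		return -1;
	p++;
	if (*p != ' ')
		return -1;

	/* p is at the space before field 3; advance to the space before field 35 */
	for (field = 3; field < 35; field++) {
		p = strchr(p + 1, ' ');
		if (!p)
			return -1;
	}

	val = strtoull(p + 1, &end, 10);
	if (end == p + 1)
		return -1;	/* no digits where field 35 should be */

	/* Only zero vs. non-zero matters: old kernels print an address here */
	return val != 0;
}
```

A caller would read /proc/PID/stat into a buffer, pass it to the helper, and
open /proc/PID/wchan only when the result is 1.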

I tested this approach with procps and it seems to fully work. In fact, due to the 
ptrace check, we properly restrict the information to our own tasks only. Root 
still sees the wchan field of all tasks.

Btw., the very latest procps-ng grew this nice change:

  6b8dc5511fb9 ("library: refactor and rely on modern kernels for wchan")

which greatly simplified procps's handling of /proc/PID/wchan.

... but my testing was done with an older procps version.
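
For the symbolic side, a rough sketch of how a consumer might interpret the
contents of /proc/PID/wchan with this patch applied, where the file holds either
a symbol name or "0". The helper name wchan_symbol() is hypothetical, and
treating an empty file the same as "0" is an assumption made here for
robustness, not something the patch specifies:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: interpret the contents of /proc/PID/wchan.
 *
 * Returns 1 and copies the symbol name into buf if the task is blocked
 * in a named kernel function, 0 (with buf set to "") if the contents are
 * "0" or empty, and -1 on bad arguments or if buf is too small.
 */
static int wchan_symbol(const char *content, char *buf, size_t buflen)
{
	size_t n;

	if (!content || !buf || buflen == 0)
		return -1;

	/* The kernel prints no trailing newline, but tolerate one */
	n = strcspn(content, "\n");

	if (n == 0 || (n == 1 && content[0] == '0')) {
		buf[0] = '\0';
		return 0;
	}
	if (n >= buflen)
		return -1;

	memcpy(buf, content, n);
	buf[n] = '\0';
	return 1;
}
```

A buffer of KSYM_NAME_LEN bytes on the kernel side bounds the symbol length, so
a caller would size buf generously and treat -1 as "unknown".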

Thanks,

	Ingo

==========================>
From b26a16469b0b6f3f0604aafb95c50d1532b3fff2 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@...nel.org>
Date: Wed, 30 Sep 2015 15:59:17 +0200
Subject: [PATCH] fs/proc, core/debug: Don't expose absolute kernel addresses via wchan

So the /proc/PID/stat 'wchan' field (the 35th field, which contains
the absolute kernel address of the kernel function a task is blocked in)
leaks absolute kernel addresses to unprivileged user-space:

        seq_put_decimal_ull(m, ' ', wchan);

The absolute address might also leak via /proc/PID/wchan, if
KALLSYMS is turned off or if the symbol lookup fails for some reason:

static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
                          struct pid *pid, struct task_struct *task)
{
        unsigned long wchan;
        char symname[KSYM_NAME_LEN];

        wchan = get_wchan(task);

        if (lookup_symbol_name(wchan, symname) < 0) {
                if (!ptrace_may_access(task, PTRACE_MODE_READ))
                        return 0;
                seq_printf(m, "%lu", wchan);
        } else {
                seq_printf(m, "%s", symname);
        }

        return 0;
}

This isn't ideal, because for example it trivially leaks the KASLR offset
to any local attacker:

  fomalhaut:~> printf "%016lx\n" $(cat /proc/$$/stat | cut -d' ' -f35)
  ffffffff8123b380

Most real-life uses of wchan are symbolic:

  ps -eo pid:10,tid:10,wchan:30,comm

and procps uses /proc/PID/wchan, not the absolute address in /proc/PID/stat:

  triton:~/tip> strace -f ps -eo pid:10,tid:10,wchan:30,comm 2>&1 | grep wchan | tail -1
  open("/proc/30833/wchan", O_RDONLY)     = 6

There's one compatibility quirk here: procps relies on whether the
absolute value is non-zero - and we can provide that functionality
by outputting "0" or "1" depending on whether the task is blocked
(whether there's a wchan address).

These days there appears to be very little legitimate reason
user-space would be interested in the absolute address. The
absolute address is mostly historic: from the days when we
didn't have kallsyms and user-space procps had to do the
decoding itself via the System.map.

So this patch sets all numeric output to "0" or "1" and keeps only
symbolic output, in /proc/PID/wchan.

( The absolute sleep address can generally still be profiled via
  perf, by tasks with sufficient privileges. )

Reviewed-by: Thomas Gleixner <tglx@...utronix.de>
Acked-by: Kees Cook <keescook@...omium.org>
Acked-by: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: <stable@...r.kernel.org>
Cc: Al Viro <viro@...iv.linux.org.uk>
Cc: Alexander Potapenko <glider@...gle.com>
Cc: Andrey Konovalov <andreyknvl@...gle.com>
Cc: Andrey Ryabinin <ryabinin.a.a@...il.com>
Cc: Andy Lutomirski <luto@...capital.net>
Cc: Andy Lutomirski <luto@...nel.org>
Cc: Borislav Petkov <bp@...en8.de>
Cc: Denys Vlasenko <dvlasenk@...hat.com>
Cc: Dmitry Vyukov <dvyukov@...gle.com>
Cc: Kostya Serebryany <kcc@...gle.com>
Cc: Mike Galbraith <efault@....de>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Sasha Levin <sasha.levin@...cle.com>
Cc: kasan-dev <kasan-dev@...glegroups.com>
Cc: linux-kernel@...r.kernel.org
Link: http://lkml.kernel.org/r/20150930135917.GA3285@gmail.com
Signed-off-by: Ingo Molnar <mingo@...nel.org>
---
 Documentation/filesystems/proc.txt |  5 +++--
 fs/proc/array.c                    | 16 ++++++++++++++--
 fs/proc/base.c                     |  9 +++------
 3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index d411ca63c8b6..3a9d65c912e7 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -140,7 +140,8 @@ Table 1-1: Process specific entries in /proc
  stat		Process status
  statm		Process memory status information
  status		Process status in human readable form
- wchan		If CONFIG_KALLSYMS is set, a pre-decoded wchan
+ wchan		Present with CONFIG_KALLSYMS=y: it shows the kernel function
+		symbol the task is blocked in - or "0" if not blocked.
  pagemap	Page table
  stack		Report full stack trace, enable via CONFIG_STACKTRACE
  smaps		a extension based on maps, showing the memory consumption of
@@ -310,7 +311,7 @@ Table 1-4: Contents of the stat files (as of 2.6.30-rc7)
   blocked       bitmap of blocked signals
   sigign        bitmap of ignored signals
   sigcatch      bitmap of caught signals
-  wchan         address where process went to sleep
+  0		(place holder, used to be the wchan address, use /proc/PID/wchan instead)
   0             (place holder)
   0             (place holder)
   exit_signal   signal to send to parent thread on exit
diff --git a/fs/proc/array.c b/fs/proc/array.c
index f60f0121e331..eed2050db9be 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -375,7 +375,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 			struct pid *pid, struct task_struct *task, int whole)
 {
-	unsigned long vsize, eip, esp, wchan = ~0UL;
+	unsigned long vsize, eip, esp, wchan = 0;
 	int priority, nice;
 	int tty_pgrp = -1, tty_nr = 0;
 	sigset_t sigign, sigcatch;
@@ -507,7 +507,19 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	seq_put_decimal_ull(m, ' ', task->blocked.sig[0] & 0x7fffffffUL);
 	seq_put_decimal_ull(m, ' ', sigign.sig[0] & 0x7fffffffUL);
 	seq_put_decimal_ull(m, ' ', sigcatch.sig[0] & 0x7fffffffUL);
-	seq_put_decimal_ull(m, ' ', wchan);
+
+	/*
+	 * We used to output the absolute kernel address, but that's an
+	 * information leak - so instead we show a 0/1 flag here, to signal
+	 * to user-space whether there's a wchan field in /proc/PID/wchan.
+	 *
+	 * This works with older implementations of procps as well.
+	 */
+	if (wchan)
+		seq_puts(m, " 1");
+	else
+		seq_puts(m, " 0");
+
 	seq_put_decimal_ull(m, ' ', 0);
 	seq_put_decimal_ull(m, ' ', 0);
 	seq_put_decimal_ll(m, ' ', task->exit_signal);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index b25eee4cead5..6f05aabce3aa 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -430,13 +430,10 @@ static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
 
 	wchan = get_wchan(task);
 
-	if (lookup_symbol_name(wchan, symname) < 0) {
-		if (!ptrace_may_access(task, PTRACE_MODE_READ))
-			return 0;
-		seq_printf(m, "%lu", wchan);
-	} else {
+	if (!lookup_symbol_name(wchan, symname))
 		seq_printf(m, "%s", symname);
-	}
+	else
+		seq_putc(m, '0');
 
 	return 0;
 }