Message-ID: <20170307131649.GA3358@twins.programming.kicks-ass.net>
Date:   Tue, 7 Mar 2017 14:16:49 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Dmitry Vyukov <dvyukov@...gle.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        syzkaller <syzkaller@...glegroups.com>,
        Oleg Nesterov <oleg@...hat.com>
Subject: Re: perf: use-after-free in perf_release

On Mon, Mar 06, 2017 at 02:14:59PM +0100, Peter Zijlstra wrote:
> On Mon, Mar 06, 2017 at 10:57:07AM +0100, Dmitry Vyukov wrote:
> 
> > ==================================================================
> > BUG: KASAN: use-after-free in atomic_dec_and_test
> > arch/x86/include/asm/atomic.h:123 [inline] at addr ffff880079c30158
> > BUG: KASAN: use-after-free in put_task_struct
> > include/linux/sched/task.h:93 [inline] at addr ffff880079c30158
> > BUG: KASAN: use-after-free in put_ctx+0xcf/0x110
> 
> FWIW, this output is very confusing, is this a result of your
> post-processing replicating the line for every 'inlined' part?
> 
> > kernel/events/core.c:1131 at addr ffff880079c30158
> > Write of size 4 by task syz-executor6/25698
> 
> >  atomic_dec_and_test arch/x86/include/asm/atomic.h:123 [inline]
> >  put_task_struct include/linux/sched/task.h:93 [inline]
> >  put_ctx+0xcf/0x110 kernel/events/core.c:1131
> >  perf_event_release_kernel+0x3ad/0xc90 kernel/events/core.c:4322
> >  perf_release+0x37/0x50 kernel/events/core.c:4338
> >  __fput+0x332/0x800 fs/file_table.c:209
> >  ____fput+0x15/0x20 fs/file_table.c:245
> >  task_work_run+0x197/0x260 kernel/task_work.c:116
> >  exit_task_work include/linux/task_work.h:21 [inline]
> >  do_exit+0xb38/0x29c0 kernel/exit.c:880
> >  do_group_exit+0x149/0x420 kernel/exit.c:984
> >  get_signal+0x7e0/0x1820 kernel/signal.c:2318
> >  do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:808
> >  exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:157
> >  syscall_return_slowpath arch/x86/entry/common.c:191 [inline]
> >  do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:286
> >  entry_SYSCALL64_slow_path+0x25/0x25
> 
> So this is fput()..
> 
> 
> > Freed:
> > PID = 25681
> >  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
> >  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> >  set_track mm/kasan/kasan.c:525 [inline]
> >  kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:589
> >  __cache_free mm/slab.c:3514 [inline]
> >  kmem_cache_free+0x71/0x240 mm/slab.c:3774
> >  free_task_struct kernel/fork.c:158 [inline]
> >  free_task+0x151/0x1d0 kernel/fork.c:370
> >  copy_process.part.38+0x18e5/0x4aa0 kernel/fork.c:1931
> >  copy_process kernel/fork.c:1531 [inline]
> >  _do_fork+0x200/0x1010 kernel/fork.c:1994
> >  SYSC_clone kernel/fork.c:2104 [inline]
> >  SyS_clone+0x37/0x50 kernel/fork.c:2098
> >  do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
> >  return_from_SYSCALL_64+0x0/0x7a
> 
> and this is a failed fork().
> 
> 
> However, inherited events don't have a filedesc to fput(), and
> similarly, a task that fails fork() has never been visible to attach a perf
> event to because it never hits the pid-hash.
> 
> Or so it is assumed.
> 
> I'm forever getting lost in the PID code. Oleg, is there any way
> find_task_by_vpid() can return a task that can still fail fork() ?

So I _think_ find_task_by_vpid() can return an already dead task; and
we'll happily increase task->usage.

Dmitry; I have no idea how easy it is for you to reproduce the thing;
but so far I've not had much success. Could you perhaps stick the below
in?

Once we convert task_struct to refcount_t that should generate a WARN of
its own I suppose.
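
To illustrate the concern, here is a minimal user-space sketch (not kernel code) of why a plain atomic increment on an already-released object goes unnoticed, while a checked increment loosely modelled on refcount_t can refuse it and warn. The names struct obj, get_plain(), get_checked() and put() are hypothetical and exist only for this example; the warning text is illustrative.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for an object with a usage counter (assumption:
 * only the counter is modelled, nothing else from task_struct). */
struct obj {
	atomic_int usage;
};

/* Plain atomic increment: a 0 -> 1 transition looks like any other get,
 * so taking a reference on an already-released object goes unnoticed. */
static void get_plain(struct obj *o)
{
	atomic_fetch_add(&o->usage, 1);
}

/* Checked increment, loosely modelled on the refcount_t idea: refuse to
 * resurrect a counter that has already dropped to zero, and say so. */
static bool get_checked(struct obj *o)
{
	int old = atomic_load(&o->usage);

	do {
		if (old == 0) {
			fprintf(stderr, "WARN: increment on 0; use-after-free?\n");
			return false;
		}
		/* On failure, 'old' is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&o->usage, &old, old + 1));

	return true;
}

/* Drop a reference; returns true when the last reference went away. */
static bool put(struct obj *o)
{
	return atomic_fetch_sub(&o->usage, 1) == 1;
}
```

With usage at 0 after the final put(), get_plain() silently brings it back to 1, whereas get_checked() leaves it at 0 and reports the problem.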

---

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 000fdb2..612d652 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -763,6 +763,7 @@ struct perf_event_context {
 #ifdef CONFIG_CGROUP_PERF
 	int				nr_cgroups;	 /* cgroup evts */
 #endif
+	int				switches;
 	void				*task_ctx_data; /* pmu specific data */
 	struct rcu_head			rcu_head;
 };
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6f41548f..6455b7a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2902,6 +2902,8 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 	if (!parent && !next_parent)
 		goto unlock;
 
+	ctx->switches++;
+
 	if (next_parent == ctx || next_ctx == parent || next_parent == parent) {
 		/*
 		 * Looks like the two contexts are clones, so we might be
@@ -3780,6 +3782,12 @@ find_lively_task_by_vpid(pid_t vpid)
 		task = current;
 	else
 		task = find_task_by_vpid(vpid);
+
+	if (task) {
+		if (WARN_ON_ONCE(task->flags & PF_EXITING))
+			task = NULL;
+	}
+
 	if (task)
 		get_task_struct(task);
 	rcu_read_unlock();
@@ -10432,6 +10440,10 @@ void perf_event_free_task(struct task_struct *task)
 
 		mutex_unlock(&ctx->mutex);
 
+		WARN_ON_ONCE(ctx->switches);
+		WARN_ON_ONCE(atomic_read(&ctx->refcount) != 1);
+		WARN_ON_ONCE(ctx->task != task);
+
 		put_ctx(ctx);
 	}
 }
