lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Thu, 16 May 2024 13:17:16 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH 3/4] perf: Fix event leak upon exit

On Thu, May 16, 2024 at 11:05:29AM +0200, Peter Zijlstra wrote:
> On Wed, May 15, 2024 at 04:43:10PM +0200, Frederic Weisbecker wrote:
> > When a task is scheduled out, pending sigtrap deliveries are deferred
> > to the target task upon resume to userspace via task_work.
> > 
> > However failures while adding en event's callback to the task_work
> > engine are ignored. And since the last call for events exit happen
> > after task work is eventually closed, there is a small window during
> > which pending sigtrap can be queued though ignored, leaking the event
> > refcount addition such as in the following scenario:
> > 
> >     TASK A
> >     -----
> > 
> >     do_exit()
> >        exit_task_work(tsk);
> > 
> >        <IRQ>
> >        perf_event_overflow()
> >           event->pending_sigtrap = pending_id;
> >           irq_work_queue(&event->pending_irq);
> >        </IRQ>
> >     =========> PREEMPTION: TASK A -> TASK B
> >        event_sched_out()
> >           event->pending_sigtrap = 0;
> >           atomic_long_inc_not_zero(&event->refcount)
> >           // FAILS: task work has exited
> >           task_work_add(&event->pending_task)
> >        [...]
> >        <IRQ WORK>
> >        perf_pending_irq()
> >           // early return: event->oncpu = -1
> >        </IRQ WORK>
> >        [...]
> >     =========> TASK B -> TASK A
> >        perf_event_exit_task(tsk)
> >           perf_event_exit_event()
> >              free_event()
> >                 WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
> >                 // leak event due to unexpected refcount == 2
> > 
> > As a result the event is never released while the task exits.
> 
> Urgh...
> 
> > 
> > Fix this with appropriate task_work_add()'s error handling.
> > 
> > Fixes: 517e6a301f34 ("perf: Fix perf_pending_task() UaF")
> > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > ---
> >  kernel/events/core.c | 9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 724e6d7e128f..c1632e69c69d 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -2289,10 +2289,11 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
> >  		event->pending_sigtrap = 0;
> >  		if (state != PERF_EVENT_STATE_OFF &&
> >  		    !event->pending_work) {
> > -			event->pending_work = 1;
> > -			dec = false;
> > -			WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
> > -			task_work_add(current, &event->pending_task, TWA_RESUME);
> > +			if (task_work_add(current, &event->pending_task, TWA_RESUME) >= 0) {
> 
> AFAICT the thing is a return 0 on success -Efoo on fail, no? That is,
> should this not simply be '== 0' ?

Right.

> 
> > +				WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
> > +				dec = false;
> > +				event->pending_work = 1;
> > +			}
> 
> Also, do we want to write it like so and save an indent?
> 
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2288,11 +2288,11 @@ event_sched_out(struct perf_event *event
>  
>  		event->pending_sigtrap = 0;
>  		if (state != PERF_EVENT_STATE_OFF &&
> -		    !event->pending_work) {
> +		    !event->pending_work &&
> +		    !task_work_add(current, &event->pending_task, TWA_RESUME)) {
>  			event->pending_work = 1;
>  			dec = false;
>  			WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
> -			task_work_add(current, &event->pending_task, TWA_RESUME);
>  		}
>  		if (dec)
>  			local_dec(&event->ctx->nr_pending);

Looks good, I'm resending this one patch.

Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ