linux-kernel - Re: [PATCH 3.2 114/126] perf/core: Fix concurrent sys_perf_event_open() vs. 'move

Message-ID: <1487637984.2885.8.camel@decadent.org.uk>
Date:   Tue, 21 Feb 2017 00:46:24 +0000
From:   Ben Hutchings <ben@...adent.org.uk>
To:     linux-kernel@...r.kernel.org, stable@...r.kernel.org
Cc:     akpm@...ux-foundation.org, John Dias <joaodias@...gle.com>,
        Di@...adent.org.uk, Thomas Gleixner <tglx@...utronix.de>,
        Vince Weaver <vincent.weaver@...ne.edu>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Jiri Olsa <jolsa@...hat.com>,
        Stephane Eranian <eranian@...gle.com>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Kees Cook <keescook@...omium.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Min Chong <mchong@...gle.com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>
Subject: Re: [PATCH 3.2 114/126] perf/core: Fix concurrent
 sys_perf_event_open() vs. 'move_group' race

On Wed, 2017-02-15 at 22:41 +0000, Ben Hutchings wrote:
> 3.2.85-rc1 review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Peter Zijlstra <peterz@...radead.org>
> 
> commit 321027c1fe77f892f4ea07846aeae08cefbbb290 upstream.
> 
> Di Shen reported a race between two concurrent sys_perf_event_open()
> calls where both try and move the same pre-existing software group
> into a hardware context.
> 
> The problem is exactly that described in commit:
> 
>   f63a8daa5812 ("perf: Fix event->ctx locking")
> 
> ... where, while we wait for a ctx->mutex acquisition, the event->ctx
> relation can have changed under us.
> 
> That very same commit failed to recognise sys_perf_event_context() as an
> external access vector to the events and thereby didn't apply the
> established locking rules correctly.
> 
> So while one sys_perf_event_open() call is stuck waiting on
> mutex_lock_double(), the other (which owns said locks) moves the group
> about. So by the time the former sys_perf_event_open() acquires the
> locks, the context we've acquired is stale (and possibly dead).
> 
> Apply the established locking rules as per perf_event_ctx_lock_nested()
> to the mutex_lock_double() for the 'move_group' case. This obviously means
> we need to validate state after we acquire the locks.
[...]
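
For anyone following along, the "validate state after we acquire the locks" rule quoted above can be illustrated with a toy, single-threaded user-space model. This is only a sketch and not kernel code: toy_ctx, toy_event, toy_move_event() and the race hook are hypothetical names, the mutex is reduced to a flag, and the hook stands in for whatever another task does while we would be sleeping on ctx->mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for perf_event_context and perf_event (hypothetical). */
struct toy_ctx {
	int refcount;
	bool locked;	/* models ctx->mutex being held */
};

struct toy_event {
	struct toy_ctx *ctx;	/* models event->ctx */
};

/* Hook run once where the real code could sleep waiting for the mutex. */
typedef void (*toy_race_fn)(struct toy_event *ev, struct toy_ctx *dst);

/* What the other task does meanwhile: move the event to another context. */
static void toy_move_event(struct toy_event *ev, struct toy_ctx *dst)
{
	ev->ctx = dst;
}

/* Pin, lock, then re-check event->ctx; retry if it changed under us. */
static struct toy_ctx *toy_event_ctx_lock(struct toy_event *ev,
					  toy_race_fn race,
					  struct toy_ctx *race_dst)
{
	for (;;) {
		struct toy_ctx *ctx = ev->ctx;	/* snapshot */

		ctx->refcount++;		/* pin so it cannot go away */
		if (race) {			/* simulate the wait, once */
			race(ev, race_dst);
			race = NULL;
		}
		ctx->locked = true;		/* "mutex acquired" */
		if (ev->ctx == ctx)
			return ctx;		/* snapshot still valid */

		/* Stale: undo and start over with the new event->ctx. */
		ctx->locked = false;
		assert(ctx->refcount > 0);
		ctx->refcount--;
	}
}

/* Returns 1 if the lock ends up on the context the event moved to. */
static int toy_demo_revalidate(void)
{
	struct toy_ctx sw = { .refcount = 1, .locked = false };
	struct toy_ctx hw = { .refcount = 1, .locked = false };
	struct toy_event ev = { .ctx = &sw };
	struct toy_ctx *got = toy_event_ctx_lock(&ev, toy_move_event, &hw);

	return got == &hw && hw.locked && hw.refcount == 2 &&
	       !sw.locked && sw.refcount == 1;
}
```

In this model the first pass pins and "locks" the old context, notices that event->ctx no longer matches, backs out, and only returns once the locked context is the one the event actually lives in.
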
>  		/*
>  		 * See perf_event_ctx_lock() for comments on the details
>  		 * of swizzling perf_event::ctx.
>  		 */
> -		mutex_lock_double(&gctx->mutex, &ctx->mutex);
> -
>  		perf_remove_from_context(group_leader, false);
>  
>  		/*
> @@ -6709,10 +6757,8 @@ SYSCALL_DEFINE5(perf_event_open,
>  	++ctx->generation;
>  	perf_unpin_context(ctx);
>  
> -	if (move_group) {
> -		mutex_unlock(&gctx->mutex);
> -		put_ctx(gctx);
> -	}
> +	if (move_group)
> +		perf_event_ctx_unlock(group_leader, gctx);
>  	mutex_unlock(&ctx->mutex);
>  
>  	event->owner = current;
[...]

Peter has clarified that the last call to put_ctx(gctx) corresponds to
the reference cleared by perf_remove_from_context(group_leader, false)
above.  So although perf_event_ctx_unlock() also calls put_ctx(gctx),
we really do want to drop two references here now and should keep the
direct call.
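
To make the counting concrete, here is a toy user-space model of the move_group path. It is a sketch under stated assumptions, not the kernel implementation: ref_ctx and the ref_* helpers are hypothetical names, locking itself is omitted, and the model assumes only what is said above, namely that the lock/unlock pair takes and drops its own reference while the reference that belonged to group_leader still has to be dropped separately.

```c
#include <assert.h>

/* Toy stand-in for perf_event_context; only the refcount is modelled. */
struct ref_ctx {
	int refcount;
};

static void ref_get_ctx(struct ref_ctx *c)
{
	c->refcount++;
}

static void ref_put_ctx(struct ref_ctx *c)
{
	assert(c->refcount > 0);
	c->refcount--;
}

/* Models the lock side: pins the context with a reference of its own. */
static void ref_ctx_lock(struct ref_ctx *c)
{
	ref_get_ctx(c);
}

/* Models the unlock side: drops only the reference taken by the lock. */
static void ref_ctx_unlock(struct ref_ctx *c)
{
	ref_put_ctx(c);
}

/*
 * Returns the references left on gctx after the move_group path.
 * keep_direct_put selects whether the direct put_ctx(gctx) is kept.
 */
static int ref_move_group_refs(int keep_direct_put)
{
	/* One reference is held on behalf of group_leader. */
	struct ref_ctx gctx = { .refcount = 1 };

	ref_ctx_lock(&gctx);		/* refcount: 2 */
	/* group_leader leaves gctx here; its reference is ours to drop. */
	ref_ctx_unlock(&gctx);		/* refcount: 1 */
	if (keep_direct_put)
		ref_put_ctx(&gctx);	/* refcount: 0 */

	return gctx.refcount;
}
```

With the direct call kept, the model ends with no references left on gctx; without it, one reference is leaked, which is the error in the hunk quoted above.
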

I made the same error when backporting to 3.16, and will fix that as
well.

Ben.

-- 
Ben Hutchings
73.46% of all statistics are made up.