linux-kernel - Re: [PATCH 4/7] perf: Free aux pages in unmap path


Message-ID: <87si3cx7ts.fsf@ashishki-desk.ger.corp.intel.com>
Date:	Wed, 09 Dec 2015 11:57:51 +0200
From:	Alexander Shishkin <alexander.shishkin@...ux.intel.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
	vince@...ter.net, eranian@...gle.com, johannes@...solutions.net,
	Arnaldo Carvalho de Melo <acme@...radead.org>
Subject: Re: [PATCH 4/7] perf: Free aux pages in unmap path

Peter Zijlstra <peterz@...radead.org> writes:

> Yuck, nasty problem. Also, I think its broken. By not having
> mmap_mutex around the whole thing, notably rb_free_aux(), you can race
> against mmap().
>
> What seems possible now is that:
>
> 	mmap(aux); // rb->aux_mmap_count == 1
> 	munmap(aux)
> 	  atomic_dec_and_mutex_lock(&rb->aux_mmap_count, &event->mmap_mutex); // == 0
>
> 	  mutex_unlock(&event->mmap_mutex);
>
> 					mmap(aux)
> 					  if (rb_has_aux())
> 					    atomic_inc(&rb->aux_mmap_count); // == 1
>
> 	  rb_free_aux(); // oops!!

Wait, this isn't actually a problem: we can hold mmap_mutex over
rb_free_aux(), as the current code already does. My patch got this
wrong, but there's really no reason to drop the mutex before
rb_free_aux().
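
To illustrate why keeping the mutex held across the free closes the
window, here is a minimal userspace sketch using pthreads and C11
atomics. Everything named demo_* is hypothetical and only stands in for
the kernel objects; demo_dec_and_mutex_lock() mimics the semantics of
atomic_dec_and_mutex_lock(), and the mmap side is assumed to check for
the buffer under the same mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the ring buffer's AUX state. */
struct demo_rb {
	pthread_mutex_t mmap_mutex;	/* stands in for event->mmap_mutex */
	atomic_int aux_mmap_count;	/* stands in for rb->aux_mmap_count */
	void *aux_pages;		/* stands in for the AUX pages */
};

/*
 * Drop one reference. Returns true with the mutex held only when the
 * count reached zero; otherwise returns false with the mutex not held.
 */
static bool demo_dec_and_mutex_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
	int old = atomic_load(cnt);

	/* Fast path: decrement locklessly unless this could be the last ref. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return false;
	}

	/* Slow path: the final decrement is done under the mutex. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(cnt, 1) != 1) {
		pthread_mutex_unlock(lock);
		return false;
	}
	return true;
}

/* mmap side: only take a new reference while holding the mutex. */
static bool demo_mmap_aux(struct demo_rb *rb)
{
	bool mapped = false;

	pthread_mutex_lock(&rb->mmap_mutex);
	if (rb->aux_pages) {
		atomic_fetch_add(&rb->aux_mmap_count, 1);
		mapped = true;
	}
	pthread_mutex_unlock(&rb->mmap_mutex);
	return mapped;
}

/* munmap side: the free happens before the mutex is dropped. */
static void demo_munmap_aux(struct demo_rb *rb)
{
	if (demo_dec_and_mutex_lock(&rb->aux_mmap_count, &rb->mmap_mutex)) {
		assert(rb->aux_pages != NULL);
		free(rb->aux_pages);
		rb->aux_pages = NULL;
		pthread_mutex_unlock(&rb->mmap_mutex);
	}
}
```

Because the last unmapper frees the buffer before unlocking, a
concurrent demo_mmap_aux() either sees the buffer with a non-zero count
or sees no buffer at all; it cannot bump the count on pages that are
about to be freed.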

> So I thought that pulling all the aux bits out from the ring_buffer
> struct, such that we have rb->aux, would solve the issue in that we can
> then fix mmap() to have the same retry loop as for event->rb.
>
> And while that fixes that race (I almost had that patch complete -- I
> might still send it out, just so you can see what it looks like), it
> doesn't solve the complete problem I don't think.

I was toying with that some time ago, but I couldn't really see the
benefits that would justify the hassle.

> Because in that case, you want the event to start again on the new
> buffer, and I think its possible we end up calling ->start() before
> we've issued the ->stop() and that would be BAD (tm).

So if we just hold the mmap_mutex over rb_free_aux(), this won't
happen, right?

> The only solution I've come up with is:
>
> 	struct rb_aux *aux = rb->aux;
>
> 	if (aux && vma->vm_pgoff == aux->pgoff) {
> 		ctx = perf_event_ctx_lock(event);
> 		if (!atomic_dec_and_mutex_lock(&aux->mmap_count, &event->mmap_mutex) {
> 			/* we now hold both ctx::mutex and event::mmap_mutex */
> 			rb->aux = NULL;
> 			ring_buffer_put(rb); /* aux had a reference */
> 			_perf_event_stop(event);

Here we really need to ensure that none of the events on
rb->event_list is running, not just the parent. That still presents
complications with respect to the irqsave rb->event_lock, even with
your new idea for perf_event_stop().

How about something like this to stop the writers:

static int __ring_buffer_output_stop(void *info)
{
	struct ring_buffer *rb = info;
	struct perf_event *event;
 
	spin_lock(&rb->event_lock);
	list_for_each_entry_rcu(event, &rb->event_list, rb_entry) {
		if (event->state != PERF_EVENT_STATE_ACTIVE)
			continue;

		event->pmu->stop(event, PERF_EF_UPDATE);
	}
	spin_unlock(&rb->event_lock);

	return 0;
}

static void perf_event_output_stop(struct perf_event *event)
{
	struct ring_buffer *rb = event->rb;

	lockdep_assert_held(&event->mmap_mutex);

	if (event->cpu == -1)
		perf_event_stop(event);

	cpu_function_call(event->cpu, __ring_buffer_output_stop, rb);
}
 
And then in the mmap_close:

	if (rb_has_aux(rb) && vma->vm_pgoff == rb->aux_pgoff &&
	    atomic_dec_and_mutex_lock(&rb->aux_mmap_count, &event->mmap_mutex)) {
		perf_event_output_stop(event);

		/* undo the mlock accounting here */

		rb_free_aux(rb);
		mutex_unlock(&event->mmap_mutex);
	}
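
For illustration only, the "stop every active writer on the list before
freeing" idea can be modelled in userspace roughly as follows. This is
a sketch under assumptions, not kernel code: the demo_* types are
hypothetical, a pthread mutex stands in for rb->event_lock, a plain
singly linked list stands in for the RCU-protected rb->event_list, and
the stop callback stands in for event->pmu->stop().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum demo_state { DEMO_INACTIVE = 0, DEMO_ACTIVE = 1 };

/* Hypothetical stand-in for a perf event attached to a ring buffer. */
struct demo_event {
	enum demo_state state;
	int stop_calls;				/* how often stop() ran */
	void (*stop)(struct demo_event *event);	/* models pmu->stop() */
	struct demo_event *next;
};

/* Hypothetical stand-in for the ring buffer and its event list. */
struct demo_ring_buffer {
	pthread_mutex_t event_lock;
	struct demo_event *event_list;
};

/* Demo callback: only records that the writer was asked to stop. */
static void demo_stop(struct demo_event *event)
{
	event->stop_calls++;
}

/*
 * Walk the list under the lock and stop every active event, skipping
 * inactive ones. Returns the number of events that were stopped.
 */
static int demo_output_stop(struct demo_ring_buffer *rb)
{
	int stopped = 0;

	pthread_mutex_lock(&rb->event_lock);
	for (struct demo_event *e = rb->event_list; e; e = e->next) {
		if (e->state != DEMO_ACTIVE)
			continue;

		assert(e->stop != NULL);
		e->stop(e);
		stopped++;
	}
	pthread_mutex_unlock(&rb->event_lock);

	return stopped;
}
```

The point of the model is the ordering: only once demo_output_stop()
has returned is it safe for the caller to release the buffer the
writers were using.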

Regards,
--
Alex