linux-kernel - Re: [PATCH] perf/core: Fix endless multiplex timer

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200304093344.GJ2596@hirez.programming.kicks-ass.net>
Date:   Wed, 4 Mar 2020 10:33:44 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     "Liang, Kan" <kan.liang@...ux.intel.com>
Cc:     mingo@...hat.com, linux-kernel@...r.kernel.org, irogers@...gle.com,
        eranian@...gle.com, ak@...ux.intel.com
Subject: Re: [PATCH] perf/core: Fix endless multiplex timer

On Tue, Mar 03, 2020 at 08:40:10PM -0500, Liang, Kan wrote:
> > I'm thinking this is wrong.
> > 
> > That is, yes, this fixes the observed problem, but it also misses at
> > least one other site. Which seems to suggest we ought to take a
> > different approach.
> > 
> > But even with that; I wonder if the actual condition isn't wrong.
> > Suppose the event was exclusive, and other events weren't scheduled
> > because of that. Then you disable the one exclusive event _and_ kill
> > rotation, so then nothing else will ever get on.
> > 
> > So what I think was supposed to happen is rotation killing itself;
> > rotation will schedule out the context -- which will clear the flag, and
> > then schedule the thing back in -- which will set the flag again when
> > needed.
> > 
> > Now, that isn't happening, and I think I see why, because when we drop
> > to !nr_active, we terminate ctx_sched_out() before we get to clearing
> > the flag, oops!
> > 
> > So how about something like this?
> > 
> > ---
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index e453589da97c..7947bd3271a9 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -2182,6 +2182,7 @@ __perf_remove_from_context(struct perf_event *event,
> >   	if (!ctx->nr_events && ctx->is_active) {
> >   		ctx->is_active = 0;
> > +		ctx->rotate_necessary = 0;
> >   		if (ctx->task) {
> >   			WARN_ON_ONCE(cpuctx->task_ctx != ctx);
> >   			cpuctx->task_ctx = NULL;
> 
> 
> The patch can fix the observed problem with uncore PMU.
> But it cannot fix all the cases with core PMU, especially when NMI watchdog
> is enabled.
> Because the ctx->nr_events never be 0 with NMI watchdog enabled.

But, I'm confused.. why do we care about nr_events==0 ? The below: vvvv

> > @@ -3074,15 +3075,15 @@ static void ctx_sched_out(struct perf_event_context *ctx,
> >   	is_active ^= ctx->is_active; /* changed bits */
> > -	if (!ctx->nr_active || !(is_active & EVENT_ALL))
> > -		return;
> > -
> >   	/*
> >   	 * If we had been multiplexing, no rotations are necessary, now no events
> >   	 * are active.
> >   	 */
> >   	ctx->rotate_necessary = 0;
> > +	if (!ctx->nr_active || !(is_active & EVENT_ALL))
> > +		return;
> > +
> >   	perf_pmu_disable(ctx->pmu);
> >   	if (is_active & EVENT_PINNED) {
> >   		list_for_each_entry_safe(event, tmp, &ctx->pinned_active, active_list)

Makes sure we clear the flag when we ctx_sched_out(), and as long as
ctx->rotate_necessary is set, perf_rotate_context() will do exactly
that.

Then ctx_sched_in() will re-set the flag if it failed to schedule a
counter.

So where is that going wrong?