lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <AANLkTimp=+1GzBOYaUZWtDF6teGt6FZe+RTpb9fAyOyd@mail.gmail.com>
Date:	Wed, 29 Sep 2010 21:42:26 +0200
From:	Stephane Eranian <eranian@...gle.com>
To:	Don Zickus <dzickus@...hat.com>
Cc:	Robert Richter <robert.richter@....com>,
	Cyrill Gorcunov <gorcunov@...il.com>,
	"mingo@...hat.com" <mingo@...hat.com>,
	"hpa@...or.com" <hpa@...or.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"yinghai@...nel.org" <yinghai@...nel.org>,
	"andi@...stfloor.org" <andi@...stfloor.org>,
	"peterz@...radead.org" <peterz@...radead.org>,
	"ying.huang@...el.com" <ying.huang@...el.com>,
	"fweisbec@...il.com" <fweisbec@...il.com>,
	"ming.m.lin@...el.com" <ming.m.lin@...el.com>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"mingo@...e.hu" <mingo@...e.hu>
Subject: Re: [tip:perf/urgent] perf, x86: Catch spurious interrupts after
 disabling counters

On Wed, Sep 29, 2010 at 8:12 PM, Don Zickus <dzickus@...hat.com> wrote:
> On Wed, Sep 29, 2010 at 07:09:24PM +0200, Robert Richter wrote:
>> On 29.09.10 12:00:35, Stephane Eranian wrote:
>> > Here is a scenario:
>> >
>> > event A -> counter 0, cpuc->running = 0x1 active_mask = 0x1
>> > move A
>> > event A -> counter 1, cpuc->running = 0x3, active_mask = 0x2
>> >
>> > No interrupt, we are just counting for a short period.
>> > Then, you get an NMI interrupt, suppose it is not generated
>> > by the PMU, it is destined for another handler.
>> >
>> > For i=0, you have (active_mask & 0x1) == 0, but (running & 0x1) == 1,
>> > you mark the interrupt as handled, i.e., you swallow it, the actual
>> > handler never gets it.
>>
>> Yes, then changing the counters you will get *one* nmi with 2 handled
>> counters. This is valid as the disabled counter could generate a
>> spurious interrupt. But you get (handled == 2) instead of (handled ==
>> 1) which is not much impact. All following nmis have (handled == 1)
>> then again.
>
> Robert,
>
> I think you missed Stephane's point.  Say for example, kgdb is being used
> while we are doing stuff with the perf counter (and say kgdb's handler is
> a lower priority than perf; which isn't true I know, but let's say):
>
Yes, exactly my point. The reality is you cannot afford to have false positive
because you may starve another subsystem from an important notification.

I think it boils down to whether or not we need an error message (Dazed) in
case no subsystem claimed the NMI. If you were to just silently consume the
NMI when no subsystem claims it, then you would not have these issues.

What Don has done is use a heuristic which gets activated when a PMU
interrupt handler signals that more than one counter have overflowed. His
claim is that this situation is likely to trigger back-to-back.

The reason this heuristic works is because it waits until ALL the subsystems
have seen the notification before it declares that the NMI was PMU spurious.
To do that is uses the DIE_NMI_UNKNOWN callchain. Handler on this chain
get call last, after all subsystems have seen the notification once. I believe
that is the only way to safely "consume" a "spurious" NMI and avoid
the 'Dazed' message. Anything else runs the risks of starving the other
subsystems.

> perf NMI comes in, issues pmu_stop 'cleanly' (meaning no spurious
> interrupt).  The cpuc->running bit is never cleared.
>
> kgdb NMI comes in, but the die_chain dictates perf looks at it first.
> perf will see that cpuc->active == 0 and cpuc->running == 1 and bump
> handled.  Thus returning NOTIFY_STOP.  kgdb never sees the NMI. :-(
>
> Now I sent a patch last week that can prevent that extra NMI from being
> generated at the cost of another rdmsrl in the non-pmu_stop cases (which I
> will attach below again, obviously P4 would need something similar too).
>
> I think we currently don't see the problems Stephane describes because the
> only thing we test that uses NMIs are perf, which also happens to be a low
> priority on the die_chain.
>
> But it is an interesting scenario that we should look at more.
>
> Cheers,
> Don
>
>
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 48c6d8d..1642f48 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -1175,11 +1175,22 @@ static int x86_pmu_handle_irq(struct pt_regs *regs)
>                handled++;
>                data.period     = event->hw.last_period;
>
> -               if (!x86_perf_event_set_period(event))
> -                       continue;
> +               /*
> +                * if period is over, process the overflow
> +                * before reseting the counter, otherwise
> +                * a new overflow could occur before the
> +                * event is stopped
> +                */
> +               if (local64_read(&hwc->period_left) <= 0) {
> +                       if (perf_event_overflow(event, 1, &data, regs)) {
> +                               x86_pmu_stop(event, 0);
> +                               continue;
> +                       }
> +                       /* if the overflow doesn't stop the event, resync */
> +                       x86_perf_event_update(event);
> +               }
>
> -               if (perf_event_overflow(event, 1, &data, regs))
> -                       x86_pmu_stop(event, 0);
> +               x86_perf_event_set_period(event);
>        }
>
>        if (handled)
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ