linux-kernel - Re: [tip:perf/core] perf: Ignore non-sampling overflows

Message-ID: <20110629103714.GM4590@erda.amd.com>
Date:	Wed, 29 Jun 2011 12:37:14 +0200
From:	Robert Richter <robert.richter@....com>
To:	Francis Moreau <francis.moro@...il.com>
CC:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"linux-tip-commits@...r.kernel.org" 
	<linux-tip-commits@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"hpa@...or.com" <hpa@...or.com>,
	"mingo@...hat.com" <mingo@...hat.com>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"mingo@...e.hu" <mingo@...e.hu>
Subject: Re: [tip:perf/core] perf: Ignore non-sampling overflows

On 28.06.11 07:56:03, Francis Moreau wrote:
> On Tue, Jun 28, 2011 at 1:05 PM, Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> > On Tue, 2011-06-28 at 12:53 +0200, Robert Richter wrote:
> >> > --- a/kernel/perf_event.c
> >> > +++ b/kernel/perf_event.c
> >> > @@ -4240,6 +4240,13 @@ static int __perf_event_overflow(struct perf_event *event, int nmi,
> >> >     struct hw_perf_event *hwc = &event->hw;
> >> >     int ret = 0;
> >> >
> >> > +   /*
> >> > +    * Non-sampling counters might still use the PMI to fold short
> >> > +    * hardware counters, ignore those.
> >> > +    */
> >> > +   if (unlikely(!is_sampling_event(event)))
> >> > +           return 0;
> >> > +
> >
> >> do you remember the background of this change. This check silently
> >> drops data of non-sampling events. I want to use perf_event_overflow()
> >> to write to the buffer and want to modify the check, but don't see
> >> which 'accidentally' interrupts may occur that must be ignored.
> >
> > IIRC this is because we always program the interrupt bit, such that when
> > the counter overflows we can account and reprogram the thing. This is
> > needed because no hardware counter is in fact 64 bits wide. Therefore we
> > have to program the counter to its max width and properly account the
> > state and reprogram on overflow.
> >
> > Imagine a 32bit cycle counter (@1GHz), if we were not to program that as
> > taking interrupts and nobody would read that counter for about 4.2
> > seconds, we'd have overflowed and lost the actual count value for the
> > thing.
> >
> > So what we do is program it at 31bits (so that the msb can toggle and
> > trigger the interrupt), and on interrupt add to event->count, and reset
> > the hardware to start counting again.
> >
> > Now some arch/*/perf_event.c implementations unconditionally called
> > perf_event_overflow() from their IRQ handler, even for such non-sampling
> > counters.
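
For reference, the folding scheme described above can be sketched in
plain user-space C. This is only a model under stated assumptions, not
the kernel code: struct sim_counter and the sim_*() helpers are
hypothetical names, and the "interrupt" is simulated inside
sim_advance(). It also reproduces the wrap-time arithmetic (2^32 events
at 1GHz is roughly 4.29 seconds).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a narrow hardware counter; not kernel code. */
struct sim_counter {
	unsigned width;   /* usable counter width in bits, e.g. 31 */
	uint64_t hw;      /* raw hardware value, always < 2^width */
	uint64_t count;   /* 64-bit software total (cf. event->count) */
	unsigned irqs;    /* number of simulated overflow interrupts */
};

/* Seconds until a counter of the given width wraps at freq_hz. */
static double sim_wrap_seconds(unsigned width, double freq_hz)
{
	assert(width > 0 && width < 64);
	return (double)(UINT64_C(1) << width) / freq_hz;
}

/* Advance by 'ticks' events, folding into ->count on every overflow. */
static void sim_advance(struct sim_counter *c, uint64_t ticks)
{
	assert(c->width > 0 && c->width < 64);
	const uint64_t period = UINT64_C(1) << c->width;

	while (ticks > 0) {
		uint64_t room = period - c->hw;

		if (ticks < room) {
			c->hw += ticks;
			break;
		}
		/*
		 * Overflow: the simulated interrupt accounts one full
		 * period and reprograms the hardware to start again.
		 */
		ticks -= room;
		c->count += period;
		c->hw = 0;
		c->irqs++;
	}
}

/* Total events seen: folded periods plus the live hardware value. */
static uint64_t sim_read(const struct sim_counter *c)
{
	return c->count + c->hw;
}
```

With width = 31, advancing by 5,000,000,000 events takes two simulated
interrupts and sim_read() still returns the full total, which is the
point of always programming the interrupt bit.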

I looked at the interrupt handlers. The events are always determined
from a per-cpu array:

	cpuc = &__get_cpu_var(cpu_hw_events);
	...
	event = cpuc->events[idx];

In case of interrupts the event should then always be a hw event (or
uninitialized).  Even if the interrupt was triggered by a different
source, it would always be mapped to the same event and the check
is_sampling_event() would be meaningless.

There are other occurrences of perf_event_overflow() in
kernel/events/core.c for events of type PERF_TYPE_SOFTWARE. These
events are initialized with sample_period set and a check would always
be true too.

For both cases I still don't see a reason for the check.

Anyway, would the following extension of the check above be ok?

	if (unlikely(!is_sampling_event(event) && !event->attr.sample_type))
		...

With no bits set in attr.sample_type the sample would be empty and
there would be nothing to report. With this change, samples that do
have data to report would no longer be dropped.
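
To make the difference between the two checks concrete, here is a small
user-space sketch. It is not kernel code: struct sim_attr and the
sim_*() helpers are hypothetical, and it assumes is_sampling_event()
amounts to testing attr.sample_period != 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two perf_event_attr fields involved. */
struct sim_attr {
	uint64_t sample_period;  /* 0 means a pure counting event */
	uint64_t sample_type;    /* bitmask of requested sample fields */
};

/* Assumption: models is_sampling_event() as sample_period != 0. */
static bool sim_is_sampling(const struct sim_attr *a)
{
	return a->sample_period != 0;
}

/* Current check: drop every overflow of a non-sampling event. */
static bool sim_drop_current(const struct sim_attr *a)
{
	return !sim_is_sampling(a);
}

/* Proposed check: drop only if there is also nothing to report. */
static bool sim_drop_proposed(const struct sim_attr *a)
{
	return !sim_is_sampling(a) && !a->sample_type;
}

/* Walk the three interesting cases. */
static void sim_self_check(void)
{
	const struct sim_attr counting = { 0, 0 };      /* no period, no data */
	const struct sim_attr with_data = { 0, 0x1 };   /* no period, has data */
	const struct sim_attr sampling = { 1000, 0x1 }; /* regular sampling */

	assert(sim_drop_current(&counting) && sim_drop_proposed(&counting));
	assert(sim_drop_current(&with_data) && !sim_drop_proposed(&with_data));
	assert(!sim_drop_current(&sampling) && !sim_drop_proposed(&sampling));
}
```

Only the middle case changes: an event without a sample period but
with sample_type bits set is dropped today and kept with the extended
check.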

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
