linux-kernel - Re: [PATCH 07/12] perf, x86: Avoid checkpointed counters causing excessive TSX aborts v3

Date:	Tue, 29 Jan 2013 01:30:19 +0100
From:	Stephane Eranian <eranian@...gle.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Ingo Molnar <mingo@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Jiri Olsa <jolsa@...hat.com>,
	Namhyung Kim <namhyung@...nel.org>,
	Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH 07/12] perf, x86: Avoid checkpointed counters causing
 excessive TSX aborts v3

On Tue, Jan 29, 2013 at 12:16 AM, Andi Kleen <andi@...stfloor.org> wrote:
>> I don't buy really this workaround. You are assuming you're always
>> measuring INTC_CHECKPOINTED
>> event by itself.
>
> There's no such assumption.
>
>> So what if you get into the handler because of an PMI
>> due to an overflow
>> of another counter which is active at the same time as counter2?
>> You're going to artificially
>> add an overflow to counter2. Unless you're enforcing only counter2 in use.
>
> All the code does it to always check the counter. There's no
> "overflow added". For counting it may be set back and accumulated
> a bit earlier than normal, but that's no problem. This will only
> happen for a checkpointed counter 2, not for anything else.
>
Ok, you're right. I misunderstood the point of the check. Yes, it systematically
adds INTX_CP to the list of events to check. That does not mean it will detect
an overflow.
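
To make the distinction concrete, here is a minimal toy model of the
idea, not the actual handler code. The structure, the function name and
the 4-counter limit are hypothetical; only "checkpointed counter 2"
comes from the discussion above. The counter is always added to the set
to inspect, but an overflow is only handled if the counter really
wrapped.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_COUNTERS    4   /* hypothetical, small toy PMU */
#define INTX_CP_COUNTER 2   /* checkpointed events use counter 2 in this thread */

/* Hypothetical per-counter state for the toy model. */
struct toy_counter {
	bool active_intx_cp;    /* counter currently holds a checkpointed event */
	bool overflowed;        /* counter value really wrapped */
};

/*
 * Returns a bitmask of the counters whose overflow was actually handled.
 * 'status' models the overflow status bits reported with the PMI.
 */
static uint64_t toy_handle_pmi(uint64_t status, const struct toy_counter *c)
{
	uint64_t handled = 0;

	/* Always inspect the checkpointed counter, whatever the status says. */
	if (c[INTX_CP_COUNTER].active_intx_cp)
		status |= 1ULL << INTX_CP_COUNTER;

	for (int bit = 0; bit < NUM_COUNTERS; bit++) {
		if (!(status & (1ULL << bit)))
			continue;
		/* Being in the list is not enough: skip if no real overflow. */
		if (!c[bit].overflowed)
			continue;
		handled |= 1ULL << bit;
	}
	return handled;
}
```

With a PMI raised by counter 0 only, counter 2 is inspected but no
overflow is recorded for it unless it actually wrapped.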

>> The counter is reinstated to its state before the critical section but
>> the PMI cannot be
>> cancelled and there is no state left behind to tell what to do with it.
>
> The PMI is effectively spurious, but we use it to set back. Don't know
> what you mean with "cancel". It already happened of course.
>
But when you do this, it seems you are making INTX_CP events unusable
for sampling, because you're resetting their value under the covers.
So what happens when you sample, especially with a fixed period?

>
>> static inline bool is_event_intx_cp(struct perf_event *event)
>> {
>>    return event && (event->hw.config & HSW_INTX_CHECKPOINTED);
>> }
>
> They both look the same to me.
>>
I think you understand what I meant by this. You substitute the inline
for all the long checks. It does not change the code's behavior; it
makes it easier to read and avoids long lines.
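
As a standalone sketch of how the helper would be used outside the
kernel tree: the two structs below are stand-ins that carry only the
field the helper touches, and the bit position chosen for
HSW_INTX_CHECKPOINTED is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define HSW_INTX_CHECKPOINTED (1ULL << 33)

/* Stand-ins for the kernel structures; only the field used here. */
struct hw_perf_event {
	uint64_t config;
};

struct perf_event {
	struct hw_perf_event hw;
};

/* True if 'event' exists and has the checkpointed flag set. */
static inline bool is_event_intx_cp(const struct perf_event *event)
{
	return event && (event->hw.config & HSW_INTX_CHECKPOINTED);
}
```

Each open-coded test of the form
"event && (event->hw.config & HSW_INTX_CHECKPOINTED)" then becomes
is_event_intx_cp(event), which also covers the NULL case.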

>>
>> >         for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
>> >                 struct perf_event *event = cpuc->events[bit];
>> >
>> > @@ -1615,6 +1635,20 @@ static int hsw_hw_config(struct perf_event *event)
>> >              ((event->hw.config & ARCH_PERFMON_EVENTSEL_ANY) ||
>> >               event->attr.precise_ip > 0))
>> >                 return -EIO;
>> > +       if (event->hw.config & HSW_INTX_CHECKPOINTED) {
>> > +               /*
>> > +                * Sampling of checkpointed events can cause situations where
>> > +                * the CPU constantly aborts because of a overflow, which is
>> > +                * then checkpointed back and ignored. Forbid checkpointing
>> > +                * for sampling.
>> > +                *
>> > +                * But still allow a long sampling period, so that perf stat
>> > +                * from KVM works.
>> > +                */
>>
>> What has perf stat have to do with sample_period?
>
> It always uses a period to accumulate in a larger counter as you probably know.
> Also with the other code we only allow checkpoint with stat.
>
Yes, I know.

>
>>
>> > +               if (event->attr.sample_period > 0 &&
>> > +                   event->attr.sample_period < 0x7fffffff)
>> > +                       return -EIO;
>> > +       }
Can you explain the 0x7fffffff to me? Is that the maximum period set by
default when you just count?


>> same comment about -EIO vs. EOPNOTSUPP. sample_period is u64
>> so, it's always >= 0. Where does this 31-bit limit come from?
>
> That's what perf stat uses when running in the KVM guest.
>
>> Experimentation?
>
> The code does > 0, not >= 0
>
>> Could be written:
>>       if (event->attr.sample_period && event->attr.sample_period < 0x7fffffff)
>
> That's 100% equivalent to what I wrote.
>
I know.
Usually when I see x > 0, I interpret it to mean the field could be negative.
That's what I was trying to say. However, here we know it cannot be. No
big deal.
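
For reference, a minimal sketch of the period check as a standalone
predicate. The function and macro names are hypothetical; the threshold
is the 0x7fffffff from the patch. Because the field is unsigned, testing
for non-zero is the same as testing "> 0".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the value is the threshold used in the patch. */
#define INTX_CP_MIN_PERIOD 0x7fffffffULL

/*
 * True if a checkpointed event with this sample_period would be refused:
 * a non-zero period (sampling) that is shorter than the threshold.
 * A period of 0 (pure counting) or a very long period is allowed.
 */
static bool intx_cp_period_rejected(uint64_t sample_period)
{
	return sample_period && sample_period < INTX_CP_MIN_PERIOD;
}
```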

> I can change the error value.

Ok.