linux-kernel - Re: [REGRESSION] x86, perf: counter freezing breaks rr

Message-ID: <20181120221642.GE2131@hirez.programming.kicks-ass.net>
Date:   Tue, 20 Nov 2018 23:16:42 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Andi Kleen <ak@...ux.intel.com>
Cc:     Kyle Huey <me@...ehuey.com>, Kan Liang <kan.liang@...ux.intel.com>,
        Ingo Molnar <mingo@...nel.org>,
        Robert O'Callahan <robert@...llahan.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Stephane Eranian <eranian@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vince Weaver <vincent.weaver@...ne.edu>, acme@...nel.org,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [REGRESSION] x86, perf: counter freezing breaks rr

On Tue, Nov 20, 2018 at 12:11:44PM -0800, Andi Kleen wrote:
> > > > Given that we're already at rc3, and that this renders rr unusable,
> > > > we'd ask that counter freezing be disabled for the 4.20 release.
> > >
> > > The boot option should be good enough for the release?
> > 
> > I'm not entirely sure what you mean here. We want you to flip the
> > default boot option so this feature is off for this release. i.e. rr
> > should work by default on 4.20 and people should have to opt into the
> > inaccurate behavior if they want faster PMI servicing.
> 
> I don't think it's inaccurate, it's just different 
> than what you are used to.
> 
> For profiling including the kernel it's actually far more accurate
> because the count is stopped much earlier near the sampling
> point. Otherwise there is a considerable over count into
> the PMI handler.
> 
> In your case you limit the count to ring 3 so it's always cut off
> at the transition point into the kernel, while with freezing
> it's at the overflow point.

Ooh, so the thing does FREEZE_ON_OVERFLOW, _not_ FREEZE_ON_PMI. Yes, that
can be a big difference.

See, FREEZE_ON_PMI, as advertised by the name, should have no observable
effect on counters limited to USR. But something like FREEZE_ON_OVERFLOW
will lose everything between the overflow and the eventual PMI, and by
freezing early we can't even compensate for it anymore either,
introducing drift in the period.

And I don't buy the over-count argument, the counter register shows how
far over you are; it triggers the overflow when we cross 0, it then
continues counting. So if you really care, you can throw away the
'over-count' at PMI time. That doesn't make it more reliable. We don't
magically get pt_regs from earlier on or any other state.
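
To make the drift and over-count points concrete, here is a rough toy model
(not kernel code). The function name pmi_position and the fixed 'skid'
parameter are made up for illustration; it assumes a constant number of
events retire between the overflow and the PMI, and that skid < period. It
returns the true event count at which the n-th PMI is taken, either with a
free-running counter whose overshoot is subtracted from the next period, or
with a counter frozen at overflow (which reads 0 at PMI time).

```c
#include <stdint.h>

/*
 * Toy model, assumptions only: 'skid' events always retire between the
 * counter crossing 0 and the PMI being taken, and skid < period.
 * Returns the true event count at the n-th PMI.
 */
static uint64_t pmi_position(uint64_t period, uint64_t skid,
                             unsigned int n, int freeze_on_overflow)
{
    uint64_t now = 0;        /* true number of events so far */
    uint64_t left = period;  /* events until the next overflow */
    unsigned int i;

    for (i = 0; i < n; i++) {
        now += left;         /* counter crosses 0: overflow */
        now += skid;         /* events retired before the PMI arrives */

        /*
         * Free-running: the counter kept counting and reads 'skid' at
         * PMI time. Frozen at overflow: it reads 0, so the skid events
         * are invisible to the handler.
         */
        uint64_t overshoot = freeze_on_overflow ? 0 : skid;

        /* Reprogram, compensating for whatever overshoot is visible. */
        left = period - overshoot;
    }
    return now;
}
```

With period 1000 and a skid of 7, the third PMI lands at 3007 in the
free-running case (a constant offset) and at 3021 with freezing (the offset
grows by the skid every period).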

The only thing where it might make a difference is if you're running
multiple counters (groups in perf speak) and want to correlate the count
values. Then, and only then, does it matter.

Bah.