Date:   Tue, 20 Nov 2018 10:20:44 -0800
From:   Stephane Eranian <eranian@...gle.com>
To:     Kyle Huey <me@...ehuey.com>
Cc:     Andi Kleen <ak@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Liang, Kan" <kan.liang@...ux.intel.com>,
        Ingo Molnar <mingo@...nel.org>, robert@...llahan.org,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vince Weaver <vincent.weaver@...ne.edu>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [REGRESSION] x86, perf: counter freezing breaks rr

On Tue, Nov 20, 2018 at 9:08 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Nov 20, 2018 at 08:19:39AM -0800, Kyle Huey wrote:
> > tl;dr: rr is currently broken on 4.20rc2, which I bisected to
> > af3bdb991a5cb57c189d34aadbd3aa88995e0d9f. I further confirmed that
> > booting the 4.20rc2 kernel with `disable_counter_freezing=true` allows
> > rr to work.
> >
> > rr, a userspace record and replay debugger[0], uses the PMU interrupt
> > (PMI) to stop a program during replay to inject asynchronous events
> > such as signals. With perf counter freezing enabled we are reliably
> > seeing perf event overcounts during replay. This behavior is easily
> > demonstrated by attempting to record and replay the `alarm` test from
> > rr's test suite. Through bisection I determined that [1] is the first
> > bad commit, and further testing showed that booting the kernel with
> > `disable_counter_freezing=true` fixes rr.
> >
I would like to better understand the PMU behavior you are relying on and
why the V4 freeze approach breaks it. Could you elaborate?

> > This behavior has been observed on two different CPUs (a Core i7-6700K
> > and a Xeon E3-1505M v5). We have no reason to believe it is limited to
> > specific CPU models; this information is included only for
> > completeness.
> >
> > Given that we're already at rc3, and that this renders rr unusable,
> > we'd ask that counter freezing be disabled for the 4.20 release.
>
> Andi, can you have a look at this?
>
> Meanwhile, I suppose we should do something along these lines.
>
>
> ---
> Subject: perf/x86/intel: Default disable perfmon v4 interrupt handling
>
> Rework the 'disable_counter_freezing' __setup() parameter such that we
> can explicitly enable/disable it and switch to default disabled.
>
> To this purpose, rename the parameter to "perf_v4_pmi=" which is a much
> better description and allows requiring a bool argument.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
>  Documentation/admin-guide/kernel-parameters.txt |  3 ++-
>  arch/x86/events/intel/core.c                    | 12 ++++++++----
>  2 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 76c82c01bf5e..ff6d1d4229e0 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -856,7 +856,8 @@
>                         causing system reset or hang due to sending
>                         INIT from AP to BSP.
>
> -       disable_counter_freezing [HW]
> +       perf_v4_pmi=    [X86,INTEL]
> +                       Format: <bool>
>                         Disable Intel PMU counter freezing feature.
>                         The feature only exists starting from
>                         Arch Perfmon v4 (Skylake and newer).
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 273c62e81546..af8bea9d4006 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2306,14 +2306,18 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
>         return handled;
>  }
>
> -static bool disable_counter_freezing;
> +static bool disable_counter_freezing = true;
>  static int __init intel_perf_counter_freezing_setup(char *s)
>  {
> -       disable_counter_freezing = true;
> -       pr_info("Intel PMU Counter freezing feature disabled\n");
> +       bool res;
> +
> +       if (kstrtobool(s, &res))
> +               return -EINVAL;
> +
> +       disable_counter_freezing = !res;
>         return 1;
>  }
> -__setup("disable_counter_freezing", intel_perf_counter_freezing_setup);
> +__setup("perf_v4_pmi=", intel_perf_counter_freezing_setup);
>
>  /*
>   * Simplified handler for Arch Perfmon v4:

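The parameter handling in the patch above can be sketched in userspace
roughly as follows. This is only an illustration under stated assumptions:
parse_bool() is a hypothetical stand-in that approximates a subset of what
the kernel's kstrtobool() accepts (y/n, 1/0, on/off), and perf_v4_pmi_setup()
is a hypothetical name mirroring intel_perf_counter_freezing_setup() from the
patch. Neither is the kernel implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel's kstrtobool().
 * Only the first character is examined, except for "on"/"off",
 * where the second character decides. Returns 0 on success and
 * -EINVAL if the string is not recognised.
 */
static int parse_bool(const char *s, bool *res)
{
	if (!s)
		return -EINVAL;

	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		switch (s[1]) {
		case 'n': case 'N':
			*res = true;
			return 0;
		case 'f': case 'F':
			*res = false;
			return 0;
		default:
			break;
		}
		break;
	default:
		break;
	}
	return -EINVAL;
}

/* Counter freezing is off unless explicitly requested, as in the patch. */
static bool disable_counter_freezing = true;

/*
 * Mirrors the reworked setup handler: "perf_v4_pmi=<bool>" enables the
 * v4 PMI handling when true, so the internal "disable" flag is the
 * inverse of the parsed value. Unparseable input leaves the default.
 */
static int perf_v4_pmi_setup(const char *s)
{
	bool res;

	if (parse_bool(s, &res))
		return -EINVAL;

	disable_counter_freezing = !res;
	return 1;
}
```

With this sketch, perf_v4_pmi_setup("1") clears disable_counter_freezing,
perf_v4_pmi_setup("off") sets it again, and an unrecognised string returns
-EINVAL without changing the default.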