Date:   Tue, 24 Nov 2020 00:19:34 -0800
From:   Stephane Eranian <eranian@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Namhyung Kim <namhyung@...nel.org>, Ingo Molnar <mingo@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>, x86 <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Kan Liang <kan.liang@...ux.intel.com>,
        John Sperbeck <jsperbeck@...gle.com>,
        "Lendacky, Thomas" <Thomas.Lendacky@....com>,
        Andi Kleen <ak@...ux.intel.com>
Subject: Re: [RFC] perf/x86: Fix a warning on x86_pmu_stop()

Hi,

Another remark on the PEBS drainage code. It seems to me that one
test is not quite correct:
intel_pmu_drain_pebs_nhm()
{
...
                if (p->status != (1ULL << bit)) {
                        for_each_set_bit(i, (unsigned long *)&pebs_status, size)
                                error[i]++;
                        continue;
                }

The kernel cannot disambiguate when two or more PEBS counters overflow
at the same time, which is what the comment on this code suggests.
However, the comparison is done against the unfiltered p->status,
which is a copy of IA32_PERF_GLOBAL_STATUS at the time of the sample.
That register contains more than the PEBS counter overflow bits:
other, unrelated bits may be set in it as well.

Shouldn't this test use pebs_status instead (which covers only the
PEBS counters)?

          if (pebs_status != (1ULL << bit)) {
          }
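
To make the difference between the two comparisons concrete, here is a
minimal user-space sketch. It is not the kernel code: the helper names,
the pebs_enabled argument, and the assumption that pebs_status is simply
the status word masked down to the PEBS-enabled counters are all
illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Single-bit mask for a counter index; valid indices are 0..63. */
uint64_t counter_mask(int bit)
{
	assert(bit >= 0 && bit < 64);
	return 1ULL << bit;
}

/*
 * Test as currently written: the raw status copy must equal exactly
 * one counter bit, otherwise the record is treated as a collision.
 */
int collides_unfiltered(uint64_t status, int bit)
{
	return status != counter_mask(bit);
}

/*
 * Proposed test: first keep only the bits of PEBS-enabled counters
 * (hypothetical pebs_enabled mask), then compare.
 */
int collides_filtered(uint64_t status, uint64_t pebs_enabled, int bit)
{
	uint64_t pebs_status = status & pebs_enabled;

	return pebs_status != counter_mask(bit);
}
```

With a made-up status word that has bit 0 (a PEBS counter) and bit 32
(some non-PEBS bit) set, and a made-up pebs_enabled mask of 0xf,
collides_unfiltered() reports a collision for bit 0 while
collides_filtered() does not. With bits 0 and 1 both set, the filtered
test still reports a collision, which is the case the comment is about.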

Or am I missing something?
Thanks.


On Tue, Nov 24, 2020 at 12:09 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Nov 24, 2020 at 02:01:39PM +0900, Namhyung Kim wrote:
>
> > Yes, it's not about __intel_pmu_pebs_event().  I'm looking at
> > intel_pmu_drain_pebs_nhm() specifically.  There's code like
> >
> >         /* log dropped samples number */
> >         if (error[bit]) {
> >             perf_log_lost_samples(event, error[bit]);
> >
> >             if (perf_event_account_interrupt(event))
> >                 x86_pmu_stop(event, 0);
> >         }
> >
> >         if (counts[bit]) {
> >             __intel_pmu_pebs_event(event, iregs, base,
> >                            top, bit, counts[bit],
> >                            setup_pebs_fixed_sample_data);
> >         }
> >
> > There's a path to x86_pmu_stop() when an error bit is on.
>
> That would seem to suggest you try something like this:
>
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index 31b9e58b03fe..8c6ee8be8b6e 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -1945,7 +1945,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
>                 if (error[bit]) {
>                         perf_log_lost_samples(event, error[bit]);
>
> -                       if (perf_event_account_interrupt(event))
> +                       if (iregs && perf_event_account_interrupt(event))
>                                 x86_pmu_stop(event, 0);
>                 }
>
