Date:   Mon, 28 Nov 2016 20:23:19 +0000
From:   "Liang, Kan" <kan.liang@...el.com>
To:     Stephane Eranian <eranian@...gle.com>
CC:     Peter Zijlstra <peterz@...radead.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>,
        "Odzioba, Lukasz" <lukasz.odzioba@...el.com>
Subject: RE: [PATCH] perf/x86: fix event counter update issue



> >> > Case B: In PMI. new > prev. delta is positive.
> >> >    That's the failure case of Test 3.
> >> >    The PMI is triggered by overflow, but there is latency between
> >> >    the overflow and the PMI handler, so new is only a small value.
> >> >    The current calculation loses the max count value.
> >> >
> >> > Case C: In PMI. new < prev. delta is negative.
> >> >    The PMU counter may start from a big value, e.g. when the fixed
> >> >    period is small.
> >> >    It can be fixed by adding the max count value.
> >> >
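(To make the wrap-around concrete, here is a small user-space sketch -- not
kernel code; the 48-bit counter width and the raw values are made up for
illustration -- showing how the shift-based delta in x86_perf_event_update()
comes out negative once the counter wraps between two reads, and how adding
x86_pmu.cntval_mask + 1 recovers the real count:

        /*
         * Stand-alone illustration, not kernel code.  The 48-bit width and
         * the raw counter values below are made-up examples.
         */
        #include <stdio.h>
        #include <stdint.h>

        int main(void)
        {
                const int cntval_bits = 48;            /* assumed width */
                const uint64_t cntval_mask = (1ULL << cntval_bits) - 1;
                const int shift = 64 - cntval_bits;

                uint64_t prev_raw_count = 0x20; /* counter started low */
                uint64_t new_raw_count  = 0x08; /* read after it wrapped */

                /*
                 * Same arithmetic as x86_perf_event_update(): the left/right
                 * shift pair sign-extends the 48-bit difference (this relies
                 * on an arithmetic right shift, as the kernel does).
                 */
                int64_t delta = (int64_t)((new_raw_count << shift) -
                                          (prev_raw_count << shift)) >> shift;
                printf("raw delta:       %lld\n", (long long)delta);

                if (delta < 0)          /* the "delta is negative" correction */
                        delta += cntval_mask + 1;
                printf("corrected delta: %lld\n", (long long)delta);

                return 0;
        }

On a typical compiler this should print -24 for the raw delta and
2^48 - 24 = 281474976710632 after the correction, which is how far the
counter actually advanced between the two reads.)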
> >> I am not sure I understand why PMI should matter here. What matters
> >> is prev vs. current and the wrap-around.
> >> Please explain.
> >> Thanks.
> >
> > Right, PMI shouldn't matter here.
> > We should add the max count value if delta is negative, no matter
> > whether it's in PMI.
> >
> That sounds right, but then why do you have that boolean (bool pmi) in
> your patch?

For case B, we still need to add the max count value when new > prev in PMI.
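
Roughly, the new condition models it like this (user-space sketch only, not
the patch itself; the 48-bit width, the corrected_delta() helper and the
example values are mine, just to illustrate why the pmi flag is needed):

        /*
         * Stand-alone sketch of the corrected update, not kernel code.  The
         * 48-bit width, the helper name and the values are assumptions.
         */
        #include <stdio.h>
        #include <stdint.h>
        #include <stdbool.h>

        #define CNTVAL_BITS 48
        #define CNTVAL_MASK ((1ULL << CNTVAL_BITS) - 1)

        /*
         * Mirrors the delta logic in the patch: sign-extend the 48-bit
         * difference, then add one full counter period when the delta is
         * negative, or when we are in the PMI (an overflow happened) and
         * the delta is positive.
         */
        static int64_t corrected_delta(uint64_t prev, uint64_t new, bool pmi)
        {
                const int shift = 64 - CNTVAL_BITS;
                int64_t delta = (int64_t)((new << shift) -
                                          (prev << shift)) >> shift;

                if ((delta > 0 && pmi) || delta < 0)
                        delta += CNTVAL_MASK + 1;
                return delta;
        }

        int main(void)
        {
                /*
                 * Case B style numbers: a large period means the counter
                 * starts from a small raw value; the PMI is late, so by the
                 * time we read it the counter has wrapped and passed the
                 * old value again.
                 */
                uint64_t prev = 0x08;   /* (-period) & CNTVAL_MASK */
                uint64_t new  = 0x14;   /* 20 events counted after the wrap */

                printf("without pmi flag: %lld\n",
                       (long long)corrected_delta(prev, new, false));
                printf("with pmi flag:    %lld\n",
                       (long long)corrected_delta(prev, new, true));
                return 0;
        }

With a period close to the max, the counter starts from a small raw value, so
after the overflow plus the PMI latency new ends up just above prev.  The
plain difference is only 12, and only the pmi flag tells the update that an
overflow happened and that one full counter period (cntval_mask + 1) must be
added.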

Thanks,
Kan

> 
> > To fix this issue, I once had a list which included all the possible cases.
> > "non-PMI, new < prev, delta is negative" was one of them, but it rarely
> > happens, so I removed it and forgot to update the description of Case C.
> >
> > Thanks,
> > Kan
> >
> >>
> >> >
> >> > There is still a case in which delta could be wrong.
> >> > It happens when the event update falls in the gap between the
> >> > overflow and the PMI: the update is not in PMI, new > prev, and
> >> > delta is positive, so the current calculation may lose the max
> >> > count value.
> >> > But this case rarely happens, so it is not handled here.
> >> >
> >> > Reported-by: Lukasz Odzioba <lukasz.odzioba@...el.com>
> >> > Signed-off-by: Kan Liang <kan.liang@...el.com>
> >> > Tested-by: Lukasz Odzioba <lukasz.odzioba@...el.com>
> >> > ---
> >> >  arch/x86/events/core.c       | 23 +++++++++++++++++++----
> >> >  arch/x86/events/intel/core.c |  4 ++--
> >> >  arch/x86/events/intel/p4.c   |  2 +-
> >> >  arch/x86/events/perf_event.h |  2 +-
> >> >  4 files changed, 23 insertions(+), 8 deletions(-)
> >> >
> >> > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> >> > index 6c3b0ef..c351ac4 100644
> >> > --- a/arch/x86/events/core.c
> >> > +++ b/arch/x86/events/core.c
> >> > @@ -63,7 +63,7 @@ u64 __read_mostly hw_cache_extra_regs
> >> >   * Can only be executed on the CPU where the event is active.
> >> >   * Returns the delta events processed.
> >> >   */
> >> > -u64 x86_perf_event_update(struct perf_event *event)
> >> > +u64 x86_perf_event_update(struct perf_event *event, bool pmi)
> >> >  {
> >> >         struct hw_perf_event *hwc = &event->hw;
> >> >         int shift = 64 - x86_pmu.cntval_bits;
> >> > @@ -100,6 +100,21 @@ u64 x86_perf_event_update(struct perf_event *event)
> >> >         delta = (new_raw_count << shift) - (prev_raw_count << shift);
> >> >         delta >>= shift;
> >> >
> >> > +       /*
> >> > +        * Correct delta for special cases caused by late PMI handling.
> >> > +        * Case1: The PMU counter may start from a big value.
> >> > +        *        In PMI, new < prev, so delta is negative.
> >> > +        * Case2: new is big enough to affect the sign bit.
> >> > +        *        The delta becomes negative after the arithmetic right shift.
> >> > +        * Case3: In PMI, new > prev.
> >> > +        *        new - prev loses the max count value.
> >> > +        *
> >> > +        * An event update may also land in the gap between overflow
> >> > +        * and the PMI, but that rarely happens and is not handled here.
> >> > +        */
> >> > +       if (((delta > 0) && pmi) || (delta < 0))
> >> > +               delta += x86_pmu.cntval_mask + 1;
> >> > +
> >> >         local64_add(delta, &event->count);
> >> >         local64_sub(delta, &hwc->period_left);
> >> >
> >> > @@ -1342,7 +1357,7 @@ void x86_pmu_stop(struct perf_event *event, int flags)
> >> >                  * Drain the remaining delta count out of a event
> >> >                  * that we are disabling:
> >> >                  */
> >> > -               x86_perf_event_update(event);
> >> > +               x86_perf_event_update(event, (flags == 0));
> >> >                 hwc->state |= PERF_HES_UPTODATE;
> >> >         }
> >> >  }
> >> > @@ -1446,7 +1461,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
> >> >
> >> >                 event = cpuc->events[idx];
> >> >
> >> > -               val = x86_perf_event_update(event);
> >> > +               val = x86_perf_event_update(event, true);
> >> >                 if (val & (1ULL << (x86_pmu.cntval_bits - 1)))
> >> >                         continue;
> >> >
> >> > @@ -1867,7 +1882,7 @@ early_initcall(init_hw_perf_events);
> >> >
> >> >  static inline void x86_pmu_read(struct perf_event *event)
> >> >  {
> >> > -       x86_perf_event_update(event);
> >> > +       x86_perf_event_update(event, false);
> >> >  }
> >> >
> >> >  /*
> >> > diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> >> > index a74a2db..69d65e6 100644
> >> > --- a/arch/x86/events/intel/core.c
> >> > +++ b/arch/x86/events/intel/core.c
> >> > @@ -1830,7 +1830,7 @@ static void intel_pmu_nhm_workaround(void)
> >> >         for (i = 0; i < 4; i++) {
> >> >                 event = cpuc->events[i];
> >> >                 if (event)
> >> > -                       x86_perf_event_update(event);
> >> > +                       x86_perf_event_update(event, false);
> >> >         }
> >> >
> >> >         for (i = 0; i < 4; i++) {
> >> > @@ -2002,7 +2002,7 @@ static void intel_pmu_add_event(struct perf_event *event)
> >> >   */
> >> >  int intel_pmu_save_and_restart(struct perf_event *event)
> >> >  {
> >> > -       x86_perf_event_update(event);
> >> > +       x86_perf_event_update(event, true);
> >> >         /*
> >> >          * For a checkpointed counter always reset back to 0.  This
> >> >          * avoids a situation where the counter overflows, aborts the
> >> > diff --git a/arch/x86/events/intel/p4.c b/arch/x86/events/intel/p4.c
> >> > index eb05335..8117de8 100644
> >> > --- a/arch/x86/events/intel/p4.c
> >> > +++ b/arch/x86/events/intel/p4.c
> >> > @@ -1024,7 +1024,7 @@ static int p4_pmu_handle_irq(struct pt_regs *regs)
> >> >                 /* it might be unflagged overflow */
> >> >                 overflow = p4_pmu_clear_cccr_ovf(hwc);
> >> >
> >> > -               val = x86_perf_event_update(event);
> >> > +               val = x86_perf_event_update(event, true);
> >> >                 if (!overflow && (val & (1ULL << (x86_pmu.cntval_bits - 1))))
> >> >                         continue;
> >> >
> >> > diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
> >> > index c6b25ac..09c9db2 100644
> >> > --- a/arch/x86/events/perf_event.h
> >> > +++ b/arch/x86/events/perf_event.h
> >> > @@ -712,7 +712,7 @@ extern u64 __read_mostly hw_cache_extra_regs
> >> >                                 [PERF_COUNT_HW_CACHE_OP_MAX]
> >> >                                 [PERF_COUNT_HW_CACHE_RESULT_MAX];
> >> >
> >> > -u64 x86_perf_event_update(struct perf_event *event);
> >> > +u64 x86_perf_event_update(struct perf_event *event, bool pmi);
> >> >
> >> >  static inline unsigned int x86_pmu_config_addr(int index)
> >> >  {
> >> > --
> >> > 2.4.3
> >> >
