Message-ID: <20200110150958.GP2844@hirez.programming.kicks-ass.net>
Date: Fri, 10 Jan 2020 16:09:58 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Kim Phillips <kim.phillips@....com>
Cc: Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
Janakarajan Natarajan <Janakarajan.Natarajan@....com>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
Tom Lendacky <thomas.lendacky@....com>,
Stephane Eranian <eranian@...gle.com>,
Martin Liška <mliska@...e.cz>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org
Subject: Re: [PATCH 2/2] perf/x86/amd: Add support for Large Increment per
Cycle Events
On Wed, Jan 08, 2020 at 04:26:47PM -0600, Kim Phillips wrote:
> On 12/20/19 6:09 AM, Peter Zijlstra wrote:
> > On Thu, Nov 14, 2019 at 12:37:20PM -0600, Kim Phillips wrote:
> >> @@ -926,10 +944,14 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
> >>  			break;
> >>  
> >>  		/* not already used */
> >> -		if (test_bit(hwc->idx, used_mask))
> >> +		if (test_bit(hwc->idx, used_mask) || (is_large_inc(hwc) &&
> >> +		    test_bit(hwc->idx + 1, used_mask)))
> >>  			break;
> >>  
> >>  		__set_bit(hwc->idx, used_mask);
> >> +		if (is_large_inc(hwc))
> >> +			__set_bit(hwc->idx + 1, used_mask);
> >> +
> >>  		if (assign)
> >>  			assign[i] = hwc->idx;
> >>  	}
> >
> > This is just really sad.. fixed that too.
>
> [*]
> If I undo re-adding my perf_assign_events code, and re-add my "not
> already used" code that you removed - see [*] above - the problem DOES
> go away, and all the counts are accurate.
>
> One problem I see with your change in the "not already used" fastpath
> area, is that the new mask variable gets updated with position 'i'
> regardless of any previous Large Increment event assignments.
Urgh, I completely messed that up. Find the below delta (I'll push out a
new version to queue.git as well).
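
To spell out how the old mask went wrong (made-up counter numbers, just
for illustration): suppose event_list[0] is a counter-pair event that
was previously scheduled on counters 2+3, and event_list[5] is a regular
event previously on counter 3. Keying the mask off the loop position
gives:

	i = 0: mask = BIT_ULL(0) | BIT_ULL(1);	/* but the event occupies 2,3 */
	i = 5: mask = BIT_ULL(5);		/* no overlap with bits 0,1 */

Both events sail past the "not already used" test, counter 3 gets
programmed twice, and the counts come out wrong -- exactly a fast path
false positive.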
> I.e., a
> successfully scheduled large increment event assignment may have
> already consumed that 'i' slot for its Merge event in a previous
> iteration of the loop. So if the fastpath scheduler fails to assign
> that following event, the slow path is wrongly entered due to a wrong
> 'i' comparison with 'n', for example.
That should only be part of the story though; the fast path is purely
optional. False negatives on the fast path should not affect
functionality, only performance. False positives on the fast path are a
no-no, of course.
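
In sketch form (hypothetical helper names, not the actual core.c
structure), the contract between the two paths is:

	/*
	 * fastpath_schedule() stands in for the register-reuse loop,
	 * slowpath_schedule() for the perf_assign_events() based
	 * solver; both names are made up for this sketch.
	 */
	if (fastpath_schedule(cpuc, n, assign))
		return 0;	/* claimed success must be a valid assignment */

	/* bailing out here is always allowed; it only costs time */
	return slowpath_schedule(cpuc, n, assign);

The fast path may punt whenever the bookkeeping gets hairy, but it must
never claim success on an invalid placement.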
---
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 222f172cbaf5..3bb738f5a472 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -937,7 +937,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 	 * fastpath, try to reuse previous register
 	 */
 	for (i = 0; i < n; i++) {
-		u64 mask = BIT_ULL(i);
+		u64 mask;
 
 		hwc = &cpuc->event_list[i]->hw;
 		c = cpuc->event_constraint[i];
@@ -950,6 +950,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 		if (!test_bit(hwc->idx, c->idxmsk))
 			break;
 
+		mask = BIT_ULL(hwc->idx);
 		if (is_counter_pair(hwc))
 			mask |= mask << 1;
 
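
With that, the fast path test reads roughly like this (paraphrased and
abridged from the patched core.c):

	mask = BIT_ULL(hwc->idx);
	if (is_counter_pair(hwc))
		mask |= mask << 1;	/* a pair also occupies hwc->idx + 1 */

	/* not already used */
	if (used_mask & mask)
		break;			/* counter taken: punt to the slowpath */

	used_mask |= mask;

	if (assign)
		assign[i] = hwc->idx;

i.e. occupancy is now tracked by the counter an event actually sits on,
and a Large Increment event claims both halves of its pair in one go.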