[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87sev7nom4.ffs@tglx>
Date: Wed, 14 Aug 2024 16:52:19 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Li Huafei <lihuafei1@...wei.com>, peterz@...radead.org, mingo@...hat.com
Cc: acme@...nel.org, namhyung@...nel.org, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org, irogers@...gle.com,
adrian.hunter@...el.com, kan.liang@...ux.intel.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf/x86/intel: Restrict period on Haswell
Li!
On Tue, Aug 13 2024 at 21:13, Li Huafei wrote:
> On 2024/8/1 3:20, Thomas Gleixner wrote:
>>> My machine has 32 physical cores, each with two logical cores. During
>>> testing, it executes the CVE-2015-3290 test case 100 times concurrently.
>>>
>>> This warning was already present in [1] and a patch was given there to
>>> limit period to 128 on Haswell, but that patch was not merged into the
>>> mainline. In [2] the period on Nehalem was limited to 32. I tested 16
>>> and 32 period on my machine and found that the problem could be
>>> reproduced with a limit of 16, but the problem did not reproduce when
>>> set to 32. It looks like we can limit the cycles to 32 on Haswell as
>>> well.
>>
>> It looks like? Either it works or not.
>
> It worked for my test scenario. I say "looks like" because I'm not sure
> how it circumvents the problem, and if the limit of 32 no longer works
> if I increase the number of test cases executed in parallel. Any
> suggestions?
If you read back through the email history of these limits, then you can
see that too short periods cause that problem on Broadwell due to a
erratum, which is explained on top of the BDW limit.
Now looking at the HSW specification update specifically erratum HSW11:
Performance Monitor Precise Instruction Retired Event May Present
Wrong Indications
Problem:
When the Precise Distribution for Instructions Retired (PDIR)
mechanism is activated (INST_RETIRED.ALL (event C0H, umask
value 00H) on Counter 1 programmed in PEBS mode), the processor
may return wrong PEBS or Performance Monitoring Interrupt (PMI)
interrupts and/or incorrect counter values if the counter is
reset with a Sample- After-Value (SAV) below 100 (the SAV is
the counter reset value software programs in the MSR
IA32_PMC1[47:0] in order to control interrupt frequency).
Implication:
Due to this erratum, when using low SAV values, the program may
get incorrect PEBS or PMI interrupts and/or an invalid counter
state.
Workaround:
The sampling driver should avoid using SAV<100.
IOW, that's exactly the same issue as the BDM11 erratum.
Kan: Can you please go through the various specification updates and
identify which generations are affected by this and fix it once and
forever in a sane way instead of relying on 'tried until it works by
some definition of works' hacks. These errata are there for a reason.
But that does not explain the fallout with that cve test because that
does not use PEBS. It's using fixed counter 0.
Li, you added that huge useless backtrace but cut off the output of
perf_event_print_debug() after it. Can you please provide that
information so we can see what the counter states are?
>>> +static void hsw_limit_period(struct perf_event *event, s64 *left)
>>> +{
>>> + *left = max(*left, 32LL);
>>> +}
>>
>> And why do we need a copy of nhm_limit_period() ?
>>
>
> Do you mean why the period is limited to 32 like nhm_limit_period()?
No. If 32 is the correct limit, then we don't need another function
which does exactly the same. So you can assign exactly that function for
HSW, no?
Thanks,
tglx
Powered by blists - more mailing lists