Date:   Tue, 3 Aug 2021 13:00:38 -0400
From:   "Liang, Kan" <kan.liang@...ux.intel.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     mingo@...hat.com, linux-kernel@...r.kernel.org, ak@...ux.intel.com,
        stable@...r.kernel.org
Subject: Re: [PATCH V2] perf/x86/intel: Apply mid ACK for small core



On 8/3/2021 12:17 PM, Peter Zijlstra wrote:
> On Tue, Aug 03, 2021 at 11:20:20AM -0400, Liang, Kan wrote:
>>
>>
>> On 8/3/2021 10:55 AM, Peter Zijlstra wrote:
>>> On Tue, Aug 03, 2021 at 06:25:28AM -0700, kan.liang@...ux.intel.com wrote:
>>>> From: Kan Liang <kan.liang@...ux.intel.com>
>>>>
>>>> The warning below may occasionally be triggered on an ADL machine when
>>>> all of these conditions hold:
>>>> - Two perf record commands run one after the other. Both record a PEBS event.
>>>> - Both run on small cores.
>>>> - They have different adaptive PEBS configurations (PEBS_DATA_CFG).
>>>>
>>>> [  673.663291] WARNING: CPU: 4 PID: 9874 at arch/x86/events/intel/ds.c:1743 setup_pebs_adaptive_sample_data+0x55e/0x5b0
>>>> [  673.663348] RIP: 0010:setup_pebs_adaptive_sample_data+0x55e/0x5b0
>>>> [  673.663357] Call Trace:
>>>> [  673.663357]  <NMI>
>>>> [  673.663357]  intel_pmu_drain_pebs_icl+0x48b/0x810
>>>> [  673.663360]  perf_event_nmi_handler+0x41/0x80
>>>> [  673.663368]  </NMI>
>>>> [  673.663370]  __perf_event_task_sched_in+0x2c2/0x3a0
>>>>
>>>> Unlike the big core, the small core requires the ACK right before
>>>> re-enabling counters in the NMI handler. Otherwise a stale PEBS
>>>> record may be dumped into the later NMI handler, which triggers the
>>>> warning.
>>>>
>>>> Add a new mid_ack flag to track the case. Add all PMI handler bits in
>>>> the struct x86_hybrid_pmu to track the bits for different types of PMUs.
>>>> Apply mid ACK for the small cores on an Alder Lake machine.
>>>
>>> Why do we need a new option? Why isn't early (as in not late) good
>>> enough?
>>>
>>
>> The early ACK can fix this issue; however, it triggers a spurious NMI during
>> the stress test. I was told to do the ACK right before re-enabling counters
>> on small cores. That indeed fixes all the issues.
> 
> Any chance that would also work for the chips that now use late_ack?
>

Let me check and do some tests.

Thanks,
Kan
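
For readers following the thread, the three ACK placements being compared can be sketched as a small ordering model. This is only an illustration, not the kernel code: the names ack_mode, step and pmi_handler_order are hypothetical. The placement of the early ACK at handler entry and of the late ACK after re-enabling is an assumption; the thread itself only states that the mid ACK happens right before the counters are re-enabled.

```c
#include <stddef.h>

/* Hypothetical names for the ACK placements discussed in the thread. */
enum ack_mode { ACK_EARLY, ACK_MID, ACK_LATE };

/* Simplified steps of a PMI (NMI) handler; not the kernel's real code. */
enum step { STEP_ACK, STEP_DISABLE, STEP_DRAIN, STEP_ENABLE };

#define MAX_STEPS 4

/*
 * Record the order of handler steps for a given ACK placement.
 * Returns the number of steps written to out[] (always MAX_STEPS).
 */
size_t pmi_handler_order(enum ack_mode mode, enum step out[MAX_STEPS])
{
	size_t n = 0;

	if (mode == ACK_EARLY)		/* assumption: ACK at handler entry */
		out[n++] = STEP_ACK;

	out[n++] = STEP_DISABLE;	/* stop the counters */
	out[n++] = STEP_DRAIN;		/* process overflows and PEBS records */

	if (mode == ACK_MID)		/* per the thread: right before re-enabling */
		out[n++] = STEP_ACK;

	out[n++] = STEP_ENABLE;		/* re-enable the counters */

	if (mode == ACK_LATE)		/* assumption: ACK after re-enabling */
		out[n++] = STEP_ACK;

	return n;
}
```

With ACK_MID the recorded order is disable, drain, ACK, enable, which is the ordering the patch description says the small cores need.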
