lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e763b0bd-cb51-4a76-816d-e12e59b02214@linux.intel.com>
Date: Mon, 9 Jun 2025 09:48:12 -0400
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Leo Yan <leo.yan@....com>
Cc: peterz@...radead.org, mingo@...hat.com, namhyung@...nel.org,
 irogers@...gle.com, mark.rutland@....com, linux-kernel@...r.kernel.org,
 linux-perf-users@...r.kernel.org, eranian@...gle.com, ctshao@...gle.com,
 tmricht@...ux.ibm.com, Aishwarya TCV <aishwarya.tcv@....com>,
 Alexei Starovoitov <alexei.starovoitov@...il.com>,
 Venkat Rao Bagalkote <venkat88@...ux.ibm.com>,
 Vince Weaver <vincent.weaver@...ne.edu>
Subject: Re: [PATCH V4] perf: Fix the throttle error of some clock events



On 2025-06-09 8:34 a.m., Leo Yan wrote:
> Hi Kan,
> 
> On Fri, Jun 06, 2025 at 12:25:46PM -0700, kan.liang@...ux.intel.com wrote:
> 
> [...]
> 
>> There may be two ways to fix it.
>> - Introduce a PMU flag to track the case. Avoid the event_stop in
>>   perf_event_throttle() if the flag is detected.
>>   It has been implemented in the
>>   https://lore.kernel.org/lkml/20250528175832.2999139-1-kan.liang@linux.intel.com/
>>   The new flag was thought to be an overkill for the issue.
>> - Add a check in the event_stop. Return immediately if the throttle is
>>   invoked in the hrtimer handler. Rely on the existing HRTIMER_NORESTART
>>   method to stop the timer.
>>
>> The latter is implemented here.
>>
>> Move event->hw.interrupts = MAX_INTERRUPTS before the stop(). It makes
>> the order the same as perf_event_unthrottle(). Except the patch, no one
>> checks the hw.interrupts in the stop(). There is no impact from the
>> order change.
>>
>> When stops in the throttle, the event should not be updated,
>> stop(event, 0).
> 
> I am confused for this conclusion. When a CPU or task clock event is
> stopped by throttling, should it also be updated? Otherwise, we will
> lose accouting for the period prior to the throttling.
> 
> I saw you exchanged with Alexei for a soft lockup issue, the reply [1]
> shows that skipping event update on throttling does not help to
> resolve the lockup issue.
> 
> Could you elaberate why we don't need to update clock events when
> throttling?
> 

This is to follow the existing behavior before the throttling fix*.
When throttling is triggered, the stop(event, 0); will be invoked.
As my understanding, it's because the period is not changed with
throttling. So we don't need to update the period.

But if the period is changed, the update is required. You may find an
example in the perf_adjust_freq_unthr_events(). In the freq mode,
stop(event, PERF_EF_UPDATE) is actually invoked for the triggered event.

For the clock event, the existing behavior before the throttling fix* is
not to invoke the stop() in throttling. It relies on the
HRTIMER_NORESTART instead. My previous throttling fix changes the
behavior. It invokes both stop() and HRTIMER_NORESTART. Now, this patch
change the behavior back.

Regarding the soft lockup issue, it's a different issue in
virtualization. It should be an old issue which is not introduced by my
throttling fix.

Thanks,
Kan

* The throttling fix is 9734e25fbf5a ("perf: Fix the throttle logic for
a group")


> Thanks,
> Leo
> 
> [1] https://lore.kernel.org/linux-perf-users/CAADnVQKRJKsG08KkEriuBQop0LgDr+c9rkNE6MUh_n3rzZoXVQ@mail.gmail.com/
> 
>> But the cpu_clock_event_stop() doesn't handle the flag.
>> In logic, it's wrong. But it didn't bring any problems with the old
>> code, because the stop() was not invoked when handling the throttle.
>> Checking the flag before updating the event.
>>
>> Reported-by: Leo Yan <leo.yan@....com>
>> Reported-by: Aishwarya TCV <aishwarya.tcv@....com>
>> Closes: https://lore.kernel.org/lkml/20250527161656.GJ2566836@e132581.arm.com/
>> Reported-by: Alexei Starovoitov <alexei.starovoitov@...il.com>
>> Closes: https://lore.kernel.org/lkml/djxlh5fx326gcenwrr52ry3pk4wxmugu4jccdjysza7tlc5fef@ktp4rffawgcw/
>> Reported-by: Venkat Rao Bagalkote <venkat88@...ux.ibm.com>
>> Closes: https://lore.kernel.org/lkml/8e8f51d8-af64-4d9e-934b-c0ee9f131293@linux.ibm.com/
>> Reported-by: Vince Weaver <vincent.weaver@...ne.edu>
>> Closes: https://lore.kernel.org/lkml/4ce106d0-950c-aadc-0b6a-f0215cd39913@maine.edu/
>> Reviewed-by: Ian Rogers <irogers@...gle.com>
>> Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
>> ---
>>
>> The patch is to fix the issue introduced by
>>
>>   9734e25fbf5a perf: Fix the throttle logic for a group
>>
>> It is still in the tip.git, I'm not sure if the commit ID is valid. So
>> the Fixes tag is not added.
>>
>> There are some discussions regarding to a soft lockup issue.
>> That is an existing issue, which are not introduced by the above change.
>> It should be fixed separately.
>> https://lore.kernel.org/lkml/CAADnVQ+Lx0HWEM8xdLC80wco3BTUPAD_2dQ-3oZFiECZMcw2aQ@mail.gmail.com/
>>
>> Changes since V3:
>> - Check before update in event_stop()
>> - Add Reviewed-by from Ian
>>
>>  kernel/events/core.c | 15 +++++++++++----
>>  1 file changed, 11 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index f34c99f8ce8f..cc77f127e11a 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -2656,8 +2656,8 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
>>  
>>  static void perf_event_throttle(struct perf_event *event)
>>  {
>> -	event->pmu->stop(event, 0);
>>  	event->hw.interrupts = MAX_INTERRUPTS;
>> +	event->pmu->stop(event, 0);
>>  	if (event == event->group_leader)
>>  		perf_log_throttle(event, 0);
>>  }
>> @@ -11749,7 +11749,12 @@ static void perf_swevent_cancel_hrtimer(struct perf_event *event)
>>  {
>>  	struct hw_perf_event *hwc = &event->hw;
>>  
>> -	if (is_sampling_event(event)) {
>> +	/*
>> +	 * The throttle can be triggered in the hrtimer handler.
>> +	 * The HRTIMER_NORESTART should be used to stop the timer,
>> +	 * rather than hrtimer_cancel(). See perf_swevent_hrtimer()
>> +	 */
>> +	if (is_sampling_event(event) && (hwc->interrupts != MAX_INTERRUPTS)) {
>>  		ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer);
>>  		local64_set(&hwc->period_left, ktime_to_ns(remaining));
>>  
>> @@ -11804,7 +11809,8 @@ static void cpu_clock_event_start(struct perf_event *event, int flags)
>>  static void cpu_clock_event_stop(struct perf_event *event, int flags)
>>  {
>>  	perf_swevent_cancel_hrtimer(event);
>> -	cpu_clock_event_update(event);
>> +	if (flags & PERF_EF_UPDATE)
>> +		cpu_clock_event_update(event);
>>  }
>>  
>>  static int cpu_clock_event_add(struct perf_event *event, int flags)
>> @@ -11882,7 +11888,8 @@ static void task_clock_event_start(struct perf_event *event, int flags)
>>  static void task_clock_event_stop(struct perf_event *event, int flags)
>>  {
>>  	perf_swevent_cancel_hrtimer(event);
>> -	task_clock_event_update(event, event->ctx->time);
>> +	if (flags & PERF_EF_UPDATE)
>> +		task_clock_event_update(event, event->ctx->time);
>>  }
>>  
>>  static int task_clock_event_add(struct perf_event *event, int flags)
>> -- 
>> 2.38.1
>>
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ