linux-kernel - Re: [PATCH V4] perf: Fix the throttle error of some clock events

Message-ID: <20250609183604.GP8020@e132581.arm.com>
Date: Mon, 9 Jun 2025 19:36:04 +0100
From: Leo Yan <leo.yan@....com>
To: "Liang, Kan" <kan.liang@...ux.intel.com>
Cc: peterz@...radead.org, mingo@...hat.com, namhyung@...nel.org,
	irogers@...gle.com, mark.rutland@....com,
	linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
	eranian@...gle.com, ctshao@...gle.com, tmricht@...ux.ibm.com,
	Aishwarya TCV <aishwarya.tcv@....com>,
	Alexei Starovoitov <alexei.starovoitov@...il.com>,
	Venkat Rao Bagalkote <venkat88@...ux.ibm.com>,
	Vince Weaver <vincent.weaver@...ne.edu>
Subject: Re: [PATCH V4] perf: Fix the throttle error of some clock events

On Mon, Jun 09, 2025 at 09:48:12AM -0400, Liang, Kan wrote:

[...]

> >> Move event->hw.interrupts = MAX_INTERRUPTS before the stop(). This makes
> >> the order the same as in perf_event_unthrottle(). Apart from this patch,
> >> nothing checks hw.interrupts in stop(), so the order change has no
> >> impact.
> >>
> >> When stopping in the throttle path, the event should not be updated,
> >> i.e. stop(event, 0).
> > 
> > I am confused by this conclusion. When a CPU or task clock event is
> > stopped by throttling, should it not also be updated? Otherwise, we will
> > lose accounting for the period prior to the throttling.
> > 
> > I saw your exchange with Alexei about a soft lockup issue; the reply [1]
> > shows that skipping the event update on throttling does not help to
> > resolve the lockup issue.
> > 
> > Could you elaborate on why we don't need to update clock events when
> > throttling?
> > 
> 
> This is to follow the existing behavior before the throttling fix*.
>
> When throttling is triggered, stop(event, 0) will be invoked.
> As I understand it, that is because the period is not changed by
> throttling, so we don't need to update the period.
>
> But if the period is changed, the update is required. You may find an
> example in perf_adjust_freq_unthr_events(). In freq mode,
> stop(event, PERF_EF_UPDATE) is actually invoked for the triggered event.
>
> For the clock event, the existing behavior before the throttling fix* is
> not to invoke stop() when throttling. It relies on
> HRTIMER_NORESTART instead. My previous throttling fix changed that
> behavior: it invokes both stop() and HRTIMER_NORESTART. Now, this patch
> changes the behavior back.

Actually, "event->count" has already been updated in perf_swevent_hrtimer(),
which is why this patch does not cause a big deviation when the count
update is skipped in the ->stop() callback:

  perf_swevent_hrtimer()
   ` event->pmu->read(event);               => Update count
   ` __perf_event_overflow()
      ` perf_event_throttle()
         ` event->pmu->stop(event, 0) / cpu_clock_event_stop()
            ` perf_swevent_cancel_hrtimer() => Skip canceling timer
            ` cpu_clock_event_update()      => Skip updating count
   ` return HRTIMER_NORESTART;              => Stop timer

It is a bit ugly that we check for throttling separately in two
places: once in perf_swevent_cancel_hrtimer() to skip canceling the
timer, and again in cpu_clock_event_stop() to skip updating the event
count.

One solution is simply to update the count in the ->stop() callback
when throttling. This should not cause any issues (though it is a bit
redundant, as the count is then updated twice).
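
To illustrate why a second update should be harmless, here is a minimal
user-space model. It assumes the clock update follows the usual
delta-accumulation shape (swap in the new timestamp, add only the time
elapsed since the previous one). The struct and the model_* names are
hypothetical stand-ins for illustration, not kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct perf_event: only the fields used here. */
struct model_event {
    uint64_t prev_count;  /* timestamp of the last update */
    uint64_t count;       /* accumulated time */
};

/*
 * Delta accumulation: record the new timestamp and add only the time
 * elapsed since the previous update.
 */
static void model_clock_update(struct model_event *ev, uint64_t now)
{
    uint64_t prev = ev->prev_count;

    assert(now >= prev);  /* the model assumes a monotonic clock */
    ev->prev_count = now;
    ev->count += now - prev;
}

/*
 * Model of the throttle path with the redundant update kept:
 * ->read() in the hrtimer handler, then another update from ->stop().
 */
static uint64_t model_throttle_path(struct model_event *ev,
                                    uint64_t t_read, uint64_t t_stop)
{
    model_clock_update(ev, t_read);  /* perf_swevent_hrtimer() -> pmu->read() */
    model_clock_update(ev, t_stop);  /* second update from ->stop() */
    return ev->count;
}
```

In this model the second update adds only the time between the two
calls, so the total is the same as a single update taken at t_stop;
nothing is counted twice.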

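Or, to make the intent clearer, we could define a flag PERF_EF_THROTTLING:

    #define PERF_EF_THROTTLING  0x20

    /* In perf_event_throttle(): */
    event->pmu->stop(event, PERF_EF_THROTTLING);

    static void cpu_clock_event_stop(struct perf_event *event, int flags)
    {
        /* Throttled from the hrtimer handler: nothing to cancel or update. */
        if (flags & PERF_EF_THROTTLING)
            return;

        ....
    }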

This would need a wider check to ensure that the new flag does not
cause any issues.
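
As a rough sketch of how the flag would behave, here is a self-contained
user-space model. PERF_EF_THROTTLING and its value 0x20 are only the
proposal above, not an existing kernel flag; the struct and the model_*
function are hypothetical, and the timer_active field merely stands in
for perf_swevent_cancel_hrtimer():

```c
#include <stdbool.h>
#include <stdint.h>

#define PERF_EF_UPDATE      0x04  /* existing flag: update the count on stop */
#define PERF_EF_THROTTLING  0x20  /* proposed above; hypothetical */

/* Hypothetical stand-in for a clock event and its hrtimer state. */
struct model_clock_event {
    bool timer_active;
    uint64_t prev_count;
    uint64_t count;
};

static void model_clock_event_stop(struct model_clock_event *ev,
                                   int flags, uint64_t now)
{
    /*
     * Throttling path: the hrtimer handler has already read the count
     * and will return HRTIMER_NORESTART, so there is nothing to do.
     */
    if (flags & PERF_EF_THROTTLING)
        return;

    ev->timer_active = false;  /* stands in for perf_swevent_cancel_hrtimer() */

    if (flags & PERF_EF_UPDATE) {
        ev->count += now - ev->prev_count;
        ev->prev_count = now;
    }
}
```

With a single early return, the throttling check lives in one place
instead of being split between the cancel helper and the stop callback.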

Thanks,
Leo