linux-kernel - Re: [PATCH V4] perf: Fix the throttle error of some clock events

Message-ID: <20250610121300.GR8020@e132581.arm.com>
Date: Tue, 10 Jun 2025 13:13:00 +0100
From: Leo Yan <leo.yan@....com>
To: "Liang, Kan" <kan.liang@...ux.intel.com>
Cc: peterz@...radead.org, mingo@...hat.com, namhyung@...nel.org,
	irogers@...gle.com, mark.rutland@....com,
	linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
	eranian@...gle.com, ctshao@...gle.com, tmricht@...ux.ibm.com,
	Aishwarya TCV <aishwarya.tcv@....com>,
	Alexei Starovoitov <alexei.starovoitov@...il.com>,
	Venkat Rao Bagalkote <venkat88@...ux.ibm.com>,
	Vince Weaver <vincent.weaver@...ne.edu>
Subject: Re: [PATCH V4] perf: Fix the throttle error of some clock events

On Mon, Jun 09, 2025 at 03:59:41PM -0400, Liang, Kan wrote:

[...]

> >> When throttling is triggered, stop(event, 0) is invoked.
> >> As I understand it, that is because throttling does not change the
> >> period, so there is no need to update the period.
> > 
> >> But if the period is changed, the update is required. You may find an
> >> example in the perf_adjust_freq_unthr_events(). In the freq mode,
> >> stop(event, PERF_EF_UPDATE) is actually invoked for the triggered event.
> > 
> >> For the clock event, the existing behavior before the throttling fix* is
> >> not to invoke the stop() in throttling. It relies on the
> >> HRTIMER_NORESTART instead. My previous throttling fix changes the
> >> behavior. It invokes both stop() and HRTIMER_NORESTART. Now, this patch
> >> changes the behavior back.
> > 
> > Actually, "event->count" has already been updated in
> > perf_swevent_hrtimer(), which is why this patch does not cause a big
> > deviation when it skips updating the count in the ->stop() callback:
> >
> >   perf_swevent_hrtimer()
> >    ` event->pmu->read(event);               => Update count
> >    ` __perf_event_overflow()
> >       ` perf_event_throttle()
> >          ` event->pmu->stop(event, 0) / cpu_clock_event_stop()
> >             ` perf_swevent_cancel_hrtimer() => Skip to cancel timer
> >             ` task_clock_event_update()     => Skip to update count
> >    ` return HRTIMER_NORESTART;              => Stop timer
> > 
> > It is a bit ugly that we check for throttling separately in two
> > places: once in perf_swevent_cancel_hrtimer() to skip cancelling the
> > timer, and then again to skip updating the event count in
> > cpu_clock_event_stop().
> 
> The second check, in cpu_clock_event_stop(), is not a throttling
> check. It implements the missing flag check.
> Usually, stop() should check PERF_EF_UPDATE before updating an
> event. I think most of the ARCHs do so.
> Some cases may ignore the flag. For the clock event, I think that is
> because stop(event, 0) was never invoked, so it didn't matter whether
> the flag was checked. But now there is a case in which the flag matters.
> So I think we should add the flag check.
> 
> > 
> > One solution is to simply update the count in the ->stop() callback
> > when throttling. This should not cause any issue (though it is a bit
> > redundant that the count is updated twice).
> 
> The clock event relies on local_clock(), which never stops.

Ah, good point!

> So it still counts between read() and stop().
> It's not just redundant. The behavior is changed if the event is updated
> in the stop() again.

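To make the point about local_clock() concrete, here is a minimal
user-space sketch. It is not kernel code: fake_now, struct model_event,
MODEL_EF_UPDATE and the model_*() helpers are all hypothetical names made
up for illustration, and the 1000 ns / 50 ns figures are arbitrary. It
only shows that, with a clock that never stops, a second update in stop()
adds the time that elapsed since read(), so the result differs rather
than being merely redundant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MODEL_EF_UPDATE 0x04    /* illustrative "update on stop" flag */

static uint64_t fake_now;       /* stands in for a free-running clock */

struct model_event {
	uint64_t prev;          /* timestamp of the last update */
	uint64_t count;         /* accumulated time */
};

/* The clock only ever moves forward; stopping an event never pauses it. */
static void model_clock_advance(uint64_t ns)
{
	fake_now += ns;
}

/* Fold the time elapsed since the last update into the count. */
static void model_event_update(struct model_event *e)
{
	uint64_t now = fake_now;

	assert(now >= e->prev);
	e->count += now - e->prev;
	e->prev = now;
}

/* stop() that updates the count only when the caller asks for it. */
static void model_event_stop(struct model_event *e, int flags)
{
	if (flags & MODEL_EF_UPDATE)
		model_event_update(e);
}

/*
 * Throttle path in the model: the handler reads the event, the clock
 * keeps running for a little while, then stop() is called.
 */
static uint64_t model_throttle_path(int stop_flags)
{
	struct model_event e = { .prev = 0, .count = 0 };

	fake_now = 0;
	model_clock_advance(1000);
	model_event_update(&e);         /* like read() in the handler */
	model_clock_advance(50);        /* time passes before stop() */
	model_event_stop(&e, stop_flags);
	return e.count;
}
```

In this model, model_throttle_path(0) leaves the count at the value seen
by the read, while model_throttle_path(MODEL_EF_UPDATE) also picks up the
extra 50 ns that passed between the read and the stop.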
> > Or, to be even clearer, we can define a flag PERF_EF_THROTTLING:
> > 
> >     #define PERF_EF_THROTTLING  0x20
> > 
> >     event->pmu->stop(event, PERF_EF_THROTTLING);
> > 
> 
> The if (hwc->interrupts != MAX_INTERRUPTS) should be good enough to
> check the throttling case. I don't think we need a new flag here.

Makes sense to me.

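For reference, the two existing checks discussed above could be modelled
roughly as below. This is a user-space sketch under stated assumptions,
not the actual patch: struct model_hw_event, MODEL_MAX_INTERRUPTS,
MODEL_EF_UPDATE and the model_*() functions are hypothetical stand-ins,
and the two bool fields merely record what the real callbacks would have
done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; values and names are for this model only. */
#define MODEL_MAX_INTERRUPTS UINT64_MAX /* marks "throttled" in the model */
#define MODEL_EF_UPDATE      0x04       /* illustrative update flag */

struct model_hw_event {
	uint64_t interrupts;
	bool timer_cancelled;   /* records that the timer was cancelled */
	bool count_updated;     /* records that the count was updated */
};

/*
 * Skip the cancel when the event is throttled: in that case the timer
 * handler itself stops the timer by not restarting it.
 */
static void model_cancel_hrtimer(struct model_hw_event *hwc)
{
	if (hwc->interrupts != MODEL_MAX_INTERRUPTS)
		hwc->timer_cancelled = true;
}

/* stop(): honour the update flag instead of updating unconditionally. */
static void model_clock_event_stop(struct model_hw_event *hwc, int flags)
{
	model_cancel_hrtimer(hwc);
	if (flags & MODEL_EF_UPDATE)
		hwc->count_updated = true;
}

/* Helper for trying the model: returns the state after one stop() call. */
static struct model_hw_event model_run_stop(uint64_t interrupts, int flags)
{
	struct model_hw_event hwc = { .interrupts = interrupts };

	model_clock_event_stop(&hwc, flags);
	return hwc;
}
```

With these definitions, a throttled event stopped with flags 0 is left
alone entirely, while an unthrottled event stopped with MODEL_EF_UPDATE
gets both the timer cancel and the count update, so no separate
throttling flag is needed to tell the two cases apart.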
Thanks,
Leo

> >     cpu_clock_event_stop(struct perf_event *event, int flags)
> >     {
> >         if (flags == PERF_EF_THROTTLING)
> >             return;
> > 
> >         ....
> >     }
> > 
> > This might need a wider check to ensure the new flag does not
> > cause any issues.
> 
> Right, it may bring more trouble.
> 
> I think we should properly utilize the existing flag rather than
> introducing a new one.
> 
> Thanks,
> Kan
>