linux-kernel - Re: [PATCH v4 2/2] perf/core: Fix incorrect time diff in tick adjust period

Message-ID: <c1b2a1b6-8b6d-40b4-84d4-d007c024fc84@intel.com>
Date: Tue, 27 Aug 2024 20:16:15 +0300
From: Adrian Hunter <adrian.hunter@...el.com>
To: "Liang, Kan" <kan.liang@...ux.intel.com>,
 Luo Gengkun <luogengkun@...weicloud.com>, peterz@...radead.org
Cc: mingo@...hat.com, acme@...nel.org, namhyung@...nel.org,
 mark.rutland@....com, alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
 irogers@...gle.com, linux-perf-users@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/2] perf/core: Fix incorrect time diff in tick adjust
 period

On 27/08/24 19:42, Liang, Kan wrote:
> 
> 
> On 2024-08-21 9:42 a.m., Luo Gengkun wrote:
>> Perf events has the notion of sampling frequency which is implemented in
>> software by dynamically adjusting the counter period so that samples occur
>> at approximately the target frequency.  Period adjustment is done in 2
>> places:
>>  - when the counter overflows (and a sample is recorded)
>>  - each timer tick, when the event is active
>> The latter case is slightly flawed because it assumes that the time since
>> the last timer-tick period adjustment is 1 tick, whereas the event may not
>> have been active (e.g. for a task that is sleeping).
>>
> 
> Do you have a real-world example to demonstrate how bad it is if the
> algorithm doesn't take sleep into account?
> 
> I'm not sure if introducing such complexity in the critical path is
> worth it.
> 
>> Fix by using jiffies to determine the elapsed time in that case.
>>
>> Signed-off-by: Luo Gengkun <luogengkun@...weicloud.com>
>> ---
>>  include/linux/perf_event.h |  1 +
>>  kernel/events/core.c       | 11 ++++++++---
>>  2 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index 1a8942277dda..d29b7cf971a1 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -265,6 +265,7 @@ struct hw_perf_event {
>>  	 * State for freq target events, see __perf_event_overflow() and
>>  	 * perf_adjust_freq_unthr_context().
>>  	 */
>> +	u64				freq_tick_stamp;
>>  	u64				freq_time_stamp;
>>  	u64				freq_count_stamp;
>>  #endif
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index a9395bbfd4aa..86e80e3ef6ac 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -55,6 +55,7 @@
>>  #include <linux/pgtable.h>
>>  #include <linux/buildid.h>
>>  #include <linux/task_work.h>
>> +#include <linux/jiffies.h>
>>  
>>  #include "internal.h"
>>  
>> @@ -4120,7 +4121,7 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
>>  {
>>  	struct perf_event *event;
>>  	struct hw_perf_event *hwc;
>> -	u64 now, period = TICK_NSEC;
>> +	u64 now, period, tick_stamp;
>>  	s64 delta;
>>  
>>  	list_for_each_entry(event, event_list, active_list) {
>> @@ -4148,6 +4149,10 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
>>  		 */
>>  		event->pmu->stop(event, PERF_EF_UPDATE);
>>  
>> +		tick_stamp = jiffies64_to_nsecs(get_jiffies_64());
> 
> Seems it only needs to retrieve the time once at the beginning, not for
> each event.
> 
> There is a perf_clock(). It's better to use it for the consistency.

perf_clock() is much slower, and for statistical sampling it doesn't
have to be perfect.

> 
> Thanks,
> Kan
>> +		period = tick_stamp - hwc->freq_tick_stamp;
>> +		hwc->freq_tick_stamp = tick_stamp;
>> +
>>  		now = local64_read(&event->count);
>>  		delta = now - hwc->freq_count_stamp;
>>  		hwc->freq_count_stamp = now;
>> @@ -4157,9 +4162,9 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
>>  		 * reload only if value has changed
>>  		 * we have stopped the event so tell that
>>  		 * to perf_adjust_period() to avoid stopping it
>> -		 * twice.
>> +		 * twice. And skip if it is the first tick adjust period.
>>  		 */
>> -		if (delta > 0)
>> +		if (delta > 0 && likely(period != tick_stamp))
>>  			perf_adjust_period(event, period, delta, false);
>>  		event->pmu->start(event, delta > 0 ? PERF_EF_RELOAD : 0);
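
For readers following the thread, here is a rough userspace sketch of the arithmetic under discussion. It is not the kernel code: the helper names ideal_period() and adjust_period() are hypothetical, the 1/8 smoothing step is only loosely modelled on perf_adjust_period(), and there is no overflow handling, so it is only meant for small inputs. It illustrates why the elapsed time passed in matters: the same event count gives a very different target period depending on whether one tick or several ticks are assumed to have passed.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Period that would have yielded sample_freq samples per second, given
 * that `count` events were observed over `nsec` nanoseconds.
 * Simplified: the multiplication can overflow for very large counts.
 */
static uint64_t ideal_period(uint64_t count, uint64_t nsec, uint64_t sample_freq)
{
	uint64_t divisor = nsec * sample_freq;
	uint64_t period;

	if (divisor == 0)
		return 1;

	period = count * NSEC_PER_SEC / divisor;
	return period ? period : 1;
}

/*
 * Move the current period one smoothing step (about 1/8 of the gap)
 * towards the ideal period. The 1/8 factor is an assumption made for
 * this sketch.
 */
static uint64_t adjust_period(uint64_t cur, uint64_t count, uint64_t nsec,
			      uint64_t sample_freq)
{
	int64_t target = (int64_t)ideal_period(count, nsec, sample_freq);
	int64_t step = (target - (int64_t)cur + 7) / 8;
	int64_t next = (int64_t)cur + step;

	return next > 0 ? (uint64_t)next : 1;
}
```

As an illustration with made-up numbers, assume a 4 ms tick (HZ=250), a 1000 Hz target, a current period of 100000 and 1000000 events since the last adjustment. If the elapsed time is taken to be one tick, adjust_period(100000, 1000000, 4000000, 1000) raises the period to 118750. If ten ticks really elapsed, adjust_period(100000, 1000000, 40000000, 1000) lowers it to 90626 instead, so the two assumptions push the period in opposite directions.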