Message-ID: <38ceaabe-0a2c-43f2-8f04-b93215f1f94c@huaweicloud.com>
Date: Thu, 5 Sep 2024 14:38:10 +0800
From: Luo Gengkun <luogengkun@...weicloud.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...hat.com, acme@...nel.org, namhyung@...nel.org,
 mark.rutland@....com, alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
 irogers@...gle.com, adrian.hunter@...el.com, kan.liang@...ux.intel.com,
 linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
 stable@...r.kernel.org
Subject: Re: [PATCH v5 2/2] perf/core: Fix incorrect time diff in tick adjust
 period


On 2024/9/2 17:50, Peter Zijlstra wrote:
> On Sat, Aug 31, 2024 at 07:43:16AM +0000, Luo Gengkun wrote:
>> Perf events has the notion of sampling frequency which is implemented in
>> software by dynamically adjusting the counter period so that samples occur
>> at approximately the target frequency.  Period adjustment is done in 2
>> places:
>>   - when the counter overflows (and a sample is recorded)
>>   - each timer tick, when the event is active
>> The latter case is slightly flawed because it assumes that the time since
>> the last timer-tick period adjustment is 1 tick, whereas the event may not
>> have been active (e.g. for a task that is sleeping).
>>
>> Fix by using jiffies to determine the elapsed time in that case.
>>
>> Signed-off-by: Luo Gengkun <luogengkun@...weicloud.com>
>> Reviewed-by: Adrian Hunter <adrian.hunter@...el.com>
>> ---
>>   include/linux/perf_event.h |  1 +
>>   kernel/events/core.c       | 12 +++++++++---
>>   2 files changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index 1a8942277dda..d29b7cf971a1 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -265,6 +265,7 @@ struct hw_perf_event {
>>   	 * State for freq target events, see __perf_event_overflow() and
>>   	 * perf_adjust_freq_unthr_context().
>>   	 */
>> +	u64				freq_tick_stamp;
>>   	u64				freq_time_stamp;
>>   	u64				freq_count_stamp;
>>   #endif
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index a9395bbfd4aa..183291e0d070 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -55,6 +55,7 @@
>>   #include <linux/pgtable.h>
>>   #include <linux/buildid.h>
>>   #include <linux/task_work.h>
>> +#include <linux/jiffies.h>
>>   
>>   #include "internal.h"
>>   
>> @@ -4120,9 +4121,11 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
>>   {
>>   	struct perf_event *event;
>>   	struct hw_perf_event *hwc;
>> -	u64 now, period = TICK_NSEC;
>> +	u64 now, period, tick_stamp;
>>   	s64 delta;
>>   
>> +	tick_stamp = jiffies64_to_nsecs(get_jiffies_64());
>> +
>>   	list_for_each_entry(event, event_list, active_list) {
>>   		if (event->state != PERF_EVENT_STATE_ACTIVE)
>>   			continue;
>> @@ -4148,6 +4151,9 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
>>   		 */
>>   		event->pmu->stop(event, PERF_EF_UPDATE);
>>   
>> +		period = tick_stamp - hwc->freq_tick_stamp;
>> +		hwc->freq_tick_stamp = tick_stamp;
>> +
>>   		now = local64_read(&event->count);
>>   		delta = now - hwc->freq_count_stamp;
>>   		hwc->freq_count_stamp = now;
>> @@ -4157,9 +4163,9 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
>>   		 * reload only if value has changed
>>   		 * we have stopped the event so tell that
>>   		 * to perf_adjust_period() to avoid stopping it
>> -		 * twice.
>> +		 * twice. And skip if it is the first tick adjust period.
>>   		 */
>> -		if (delta > 0)
>> +		if (delta > 0 && likely(period != tick_stamp))
>>   			perf_adjust_period(event, period, delta, false);
>>   
>>   		event->pmu->start(event, delta > 0 ? PERF_EF_RELOAD : 0);
> This one I'm less happy with.. that condition 'period != tick_stamp'
> doesn't make sense to me. That's only false if hwc->freq_tick_stamp ==
> 0, which it will only be once after event creation. Even though the
> Changelog babbles about event scheduling.
>
> Also, that all should then be written something like:
>
> 	if (delta > 0 && ...) {
> 		perf_adjust_period(...);
> 		adjusted = true;
> 	}
>
> 	event->pmu->start(event, adjusted ? PERF_EF_RELOAD : 0);

Thanks for your review! That is a good point.

If freq_tick_stamp is initialized when an event is created or enabled,
the additional condition can be removed as follows:

+static bool is_freq_event(struct perf_event *event)
+{
+       return event->attr.freq && event->attr.sample_freq;
+}
+
  static void
  perf_event_set_state(struct perf_event *event, enum perf_event_state state)
  {
@@ -665,6 +670,12 @@ perf_event_set_state(struct perf_event *event, enum perf_event_state state)
          */
         if ((event->state < 0) ^ (state < 0))
                 perf_event_update_sibling_time(event);
+       /*
+        * Update freq_tick_stamp for freq event just enabled
+        */
+       if (is_freq_event(event) && state == PERF_EVENT_STATE_INACTIVE &&
+                                   event->state < PERF_EVENT_STATE_INACTIVE)
+               event->hw.freq_tick_stamp = jiffies64_to_nsecs(get_jiffies_64());

         WRITE_ONCE(event->state, state);
  }
@@ -4165,7 +4176,7 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
                  * to perf_adjust_period() to avoid stopping it
                  * twice. And skip if it is the first tick adjust period.
                  */
-               if (delta > 0 && likely(period != tick_stamp))
+               if (delta > 0)
                         perf_adjust_period(event, period, delta, false);

                 event->pmu->start(event, delta > 0 ? PERF_EF_RELOAD : 0);
@@ -12061,8 +12072,11 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,

         hwc = &event->hw;
         hwc->sample_period = attr->sample_period;
-       if (attr->freq && attr->sample_freq)
+       if (is_freq_event(event)) {
                 hwc->sample_period = 1;
+               if (event->state == PERF_EVENT_STATE_INACTIVE)
+                       event->hw.freq_tick_stamp = jiffies64_to_nsecs(get_jiffies_64());
+       }
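
To illustrate the intent, here is a minimal user-space sketch (not kernel
code) of the stamp handling. struct freq_state, model_enable() and
model_tick_period() are hypothetical names made up for this illustration;
only the freq_tick_stamp field name comes from the patch. It assumes the
clock is a monotonically increasing nanosecond value, as
jiffies64_to_nsecs(get_jiffies_64()) would provide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the freq-related part of struct hw_perf_event. */
struct freq_state {
	uint64_t freq_tick_stamp;	/* ns at the last tick adjust or enable */
};

/* Models the enable/create path: record "now" as the start of the window. */
static void model_enable(struct freq_state *s, uint64_t now_ns)
{
	s->freq_tick_stamp = now_ns;
}

/*
 * Models the tick path: the period is the time elapsed since the last
 * stamp, and the stamp is then advanced to "now".
 */
static uint64_t model_tick_period(struct freq_state *s, uint64_t now_ns)
{
	uint64_t period;

	assert(now_ns >= s->freq_tick_stamp);	/* clock assumed monotonic */
	period = now_ns - s->freq_tick_stamp;
	s->freq_tick_stamp = now_ns;
	return period;
}
```

With a zero stamp, the first call returns a period equal to the current
time, which is the case the 'period != tick_stamp' check was filtering out.
Stamping at enable time makes the first period the time since enable, so
the extra check is no longer needed in this model.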


I am also wondering whether we need to update freq_count_stamp when the
freq event is enabled, so that both stamps refer to the same "period":

+       if (is_freq_event(event) && state == PERF_EVENT_STATE_INACTIVE &&
+                                   event->state < PERF_EVENT_STATE_INACTIVE) {
+               event->hw.freq_tick_stamp = jiffies64_to_nsecs(get_jiffies_64());
+               event->hw.freq_count_stamp = local64_read(&event->count);
+       }
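
As a rough illustration of why the two stamps should cover the same
window, the sketch below uses a simplified form of the target-period
formula, period = count * 10^9 / (nsec * sample_freq). This is an
assumption-laden model: model_target_period() is a hypothetical helper,
it ignores the overflow handling and filtering that the kernel applies,
and it is only valid for small inputs.

```c
#include <stdint.h>

/*
 * Simplified model: "count" events were seen in "nsec" nanoseconds, and
 * we want "sample_freq" samples per second. Returns the number of events
 * per sample, or 0 if the inputs cannot be used.
 * Note: count * 10^9 can overflow for large counts; this is a sketch only.
 */
static uint64_t model_target_period(uint64_t count, uint64_t nsec,
				    uint64_t sample_freq)
{
	if (nsec == 0 || sample_freq == 0)
		return 0;
	return (count * 1000000000ULL) / (nsec * sample_freq);
}
```

For example, 4000 events in a 4 ms window at sample_freq = 1000 gives a
period of 1000. If the count delta also included 4000 events from before
the time window started, the same 4 ms window would give 2000, so a
mismatch between the two stamps skews the estimate.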

Looking forward to your reply!

Thanks.

