linux-kernel - Re: [PATCH v3 4/4] perf,x86: add RAPL hrtimer support

Message-ID: <CABPqkBTKec0T0NDFUoC8qDVO9stACkzc2xnUqwx6yODHXKaYTg@mail.gmail.com>
Date:	Sat, 26 Oct 2013 19:07:06 +0200
From:	Stephane Eranian <eranian@...gle.com>
To:	Jiri Olsa <jolsa@...hat.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	"mingo@...e.hu" <mingo@...e.hu>,
	"ak@...ux.intel.com" <ak@...ux.intel.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	"Yan, Zheng" <zheng.z.yan@...el.com>,
	Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH v3 4/4] perf,x86: add RAPL hrtimer support

On Fri, Oct 25, 2013 at 7:44 PM, Jiri Olsa <jolsa@...hat.com> wrote:
> On Wed, Oct 23, 2013 at 02:58:05PM +0200, Stephane Eranian wrote:
>> The RAPL PMU counters do not interrupt on overflow.
>> Therefore, the kernel needs to poll the counters
>> to avoid missing an overflow. This patch adds
>> the hrtimer code to do this.
>>
>> The timer interval is calculated at boot time
>> based on the power unit used by the HW.
>>
>> Signed-off-by: Stephane Eranian <eranian@...gle.com>
>> ---
>>  arch/x86/kernel/cpu/perf_event_intel_rapl.c |   75 +++++++++++++++++++++++++--
>>  1 file changed, 70 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
>> index 3d71d39..ed0566a 100644
>> --- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
>> +++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
>> @@ -92,11 +92,13 @@ static struct kobj_attribute format_attr_##_var =         \
>>
>>  struct rapl_pmu {
>>       spinlock_t      lock;
>> -     atomic_t        refcnt;
>>       int             hw_unit;  /* 1/2^hw_unit Joule */
>> -     int             phys_id;
>> -     int             n_active; /* number of active events */
>> +     struct hrtimer  hrtimer;
>>       struct list_head active_list;
>> +     ktime_t         timer_interval; /* in ktime_t unit */
>> +     int             n_active; /* number of active events */
>> +     int             phys_id;
>> +     atomic_t        refcnt;
>>  };
>>
>>  static struct pmu rapl_pmu_class;
>> @@ -161,6 +163,47 @@ static u64 rapl_event_update(struct perf_event *event)
>>       return new_raw_count;
>>  }
>>
>> +static void rapl_start_hrtimer(struct rapl_pmu *pmu)
>> +{
>> +     __hrtimer_start_range_ns(&pmu->hrtimer,
>> +                     pmu->timer_interval, 0,
>> +                     HRTIMER_MODE_REL_PINNED, 0);
>> +}
>> +
>> +static void rapl_stop_hrtimer(struct rapl_pmu *pmu)
>> +{
>> +     hrtimer_cancel(&pmu->hrtimer);
>> +}
>> +
>> +static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
>> +{
>> +     struct rapl_pmu *pmu = container_of(hrtimer, struct rapl_pmu, hrtimer);
>> +     struct perf_event *event;
>> +     unsigned long flags;
>> +
>> +     if (!pmu->n_active)
>> +             return HRTIMER_NORESTART;
>> +
>> +     spin_lock_irqsave(&pmu->lock, flags);
>> +
>> +     list_for_each_entry(event, &pmu->active_list, active_entry) {
>> +             rapl_event_update(event);
>> +     }
>
> hi,
> I don't fully understand the reason for the timer,
> I'm probably missing something..
>
The reason is rather simple and is similar to what happens with uncore.
The counters are narrow (32-bit) and have no interrupt capability. We
need to poll the counters and accumulate into the sw counter to avoid
missing an overflow.
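
To make the polling argument concrete, here is a minimal userspace sketch
(not the patch code). The struct and function names are hypothetical and
exist only for illustration. It folds a wrapping 32-bit raw value into a
64-bit software total and shows what happens when the poll period is
longer than the wrap period.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software-side state for one narrow hardware counter. */
struct sw_counter {
	uint32_t prev_raw;	/* last raw value seen */
	uint64_t total;		/* accumulated 64-bit count */
};

/* Fold a new raw 32-bit reading into the 64-bit total. */
void sw_counter_update(struct sw_counter *c, uint32_t new_raw)
{
	/* Unsigned subtraction is modulo 2^32, so a single wrap is absorbed. */
	uint32_t delta = new_raw - c->prev_raw;

	c->prev_raw = new_raw;
	c->total += delta;
}

/*
 * Simulate a counter that advances by 'step' on every tick for 'ticks'
 * ticks and is polled once every 'poll_every' ticks. Only the low 32 bits
 * of the true count are visible to the poller. Returns the software total.
 */
uint64_t simulate(uint64_t step, unsigned int ticks, unsigned int poll_every)
{
	struct sw_counter c = { 0, 0 };
	uint64_t hw = 0;	/* true count, never seen directly */
	unsigned int t;

	assert(poll_every > 0);

	for (t = 1; t <= ticks; t++) {
		hw += step;
		if (t % poll_every == 0)
			sw_counter_update(&c, (uint32_t)hw);
	}
	return c.total;
}
```

With step = 0x60000000 and 8 ticks, the true count is 0x300000000.
simulate(0x60000000, 8, 1) recovers it exactly, because the counter wraps
at most once between polls. simulate(0x60000000, 8, 4) returns only
0x100000000, because more than 2^32 counts elapse between polls and the
extra wraps are invisible. That lost count is what the timer is there to
prevent.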

> - the timer calls rapl_event_update for all defined events

No, only for the defined RAPL events, which is what we want.

> - but rapl_pmu_event_read calls rapl_event_update any time the
>   event is read (sys_read)
>
Yes, but we cannot rely on reads alone to catch every counter overflow.
One can be missed if the counter counts in a unit that increments quickly
and the event is not read often enough.
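
As a rough illustration of how fast "fast" is, the sketch below estimates
how long a 32-bit energy counter takes to wrap. It relies on the
1/2^hw_unit Joule unit from struct rapl_pmu. The function names, the
assumed power draw, and the choice to poll at half the wrap time are my
assumptions for illustration, not necessarily what the patch computes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time in milliseconds for a 32-bit counter to wrap, given that one count
 * is 1/2^hw_unit Joule and the package draws 'watts' Joules per second.
 * Hypothetical helper, for illustration only.
 */
uint64_t rapl_wrap_time_ms(unsigned int hw_unit, unsigned int watts)
{
	uint64_t joules_to_wrap;

	assert(hw_unit < 32 && watts > 0);

	/* 2^32 counts * (1 / 2^hw_unit) Joule per count */
	joules_to_wrap = 1ULL << (32 - hw_unit);

	return joules_to_wrap * 1000 / watts;
}

/*
 * Assumed policy: poll at half the wrap time so that a late timer still
 * fires before the counter wraps.
 */
uint64_t rapl_poll_interval_ms(unsigned int hw_unit, unsigned int watts)
{
	return rapl_wrap_time_ms(hw_unit, watts) / 2;
}
```

For example, with hw_unit = 16 and an assumed 200 W draw, the counter
wraps after 65536 Joules, i.e. 327680 ms, which gives a 163840 ms poll
interval under the halving policy. A finer unit or a higher power draw
shortens this proportionally.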

> The rapl_event_update only read msr and updates
> event->count|hw,prev_count.
No, it does update the count:
        local64_add(sdelta, &event->count);