Message-ID: <20160129081817.GB28282@hr-amur2>
Date: Fri, 29 Jan 2016 16:18:33 +0800
From: Huang Rui <ray.huang@....com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Borislav Petkov <bp@...en8.de>, Borislav Petkov <bp@...e.de>,
Ingo Molnar <mingo@...nel.org>,
Andy Lutomirski <luto@...capital.net>,
Thomas Gleixner <tglx@...utronix.de>,
Robert Richter <rric@...nel.org>,
Jacob Shin <jacob.w.shin@...il.com>,
John Stultz <john.stultz@...aro.org>,
Frédéric Weisbecker <fweisbec@...il.com>,
<linux-kernel@...r.kernel.org>, <spg_linux_kernel@....com>,
<x86@...nel.org>, Guenter Roeck <linux@...ck-us.net>,
Andreas Herrmann <herrmann.der.user@...glemail.com>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
Aravind Gopalakrishnan <Aravind.Gopalakrishnan@....com>,
Fengguang Wu <fengguang.wu@...el.com>,
Aaron Lu <aaron.lu@...el.com>
Subject: Re: [PATCH v4] perf/x86/amd/power: Add AMD accumulated power
reporting mechanism
On Thu, Jan 28, 2016 at 04:28:48PM +0100, Peter Zijlstra wrote:
> On Thu, Jan 28, 2016 at 10:03:15AM +0100, Borislav Petkov wrote:
>
> > +
> > +struct power_pmu {
> > + raw_spinlock_t lock;
>
> Now that the list is gone, what does this thing protect?
>
It protects the event count value while we measure it.
> > + struct pmu *pmu;
>
> This member seems superfluous, there's only the one possible value.
>
Currently there is only one, but future processors will have more power
PMU types. Accumulated power is just one of them.
> > + local64_t cpu_sw_pwr_ptsc;
> > +
> > + /*
> > + * These two cpumasks are used for avoiding the allocations on the
> > + * CPU_STARTING phase because power_cpu_prepare() will be called with
> > + * IRQs disabled.
> > + */
> > + cpumask_var_t mask;
> > + cpumask_var_t tmp_mask;
> > +};
> > +
> > +static struct pmu pmu_class;
> > +
> > +/*
> > + * Accumulated power represents the sum of each compute unit's (CU) power
> > + * consumption. On any core of each CU we read the total accumulated power from
> > + * MSR_F15H_CU_PWR_ACCUMULATOR. cpu_mask represents CPU bit map of all cores
> > + * which are picked to measure the power for the CUs they belong to.
> > + */
> > +static cpumask_t cpu_mask;
> > +
> > +static DEFINE_PER_CPU(struct power_pmu *, amd_power_pmu);
> > +
> > +static u64 event_update(struct perf_event *event, struct power_pmu *pmu)
> > +{
>
> Is there ever a case where @pmu != __this_cpu_read(power_pmu) ?
>
It can only be called from pmu:{read, stop}, and both ensure that @pmu is
__this_cpu_read(amd_power_pmu). Is there any other case I missed?
> > + struct hw_perf_event *hwc = &event->hw;
> > + u64 prev_raw_count, new_raw_count, prev_ptsc, new_ptsc;
> > + u64 delta, tdelta;
> > +
> > +again:
> > + prev_raw_count = local64_read(&hwc->prev_count);
> > + prev_ptsc = local64_read(&pmu->cpu_sw_pwr_ptsc);
> > + rdmsrl(event->hw.event_base, new_raw_count);
>
> Is hw.event_base != MSR_F15H_CU_PWR_ACCUMULATOR possible?
>
Is there a case that I missed?
Could you explain more?
> > + rdmsrl(MSR_F15H_PTSC, new_ptsc);
>
>
> Also, I suspect this doesn't do what you expect it to do.
>
> We measure per-event PWR_ACC deltas, but per CPU PTSC values. These do
> not match when there's more than 1 event on the CPU.
>
OK, I see. My intention was that the per-event count (event->count) should
be the PWR_ACC values divided by PTSC. But here we cannot use
local64_read(&hwc->prev_count) as the previous value of PWR_ACC before it
is divided by PTSC. Thanks for catching it.
> I would suggest adding a new struct to the hw_perf_event union with the
> two u64 deltas like:
>
> struct { /* amd_power */
> u64 pwr_acc;
> u64 ptsc;
> };
>
> And track these values per-event.
>
Thanks for the reminder.
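A minimal userspace sketch of the per-event tracking suggested above, only to
illustrate the idea. It is not the actual patch: the sketch_* names, the
struct layout and the plain uint64_t arguments (standing in for the values
read from the PWR_ACC and PTSC MSRs) are all assumptions, and the scaling by
the power sample ratio is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-event state, mirroring the suggested amd_power struct. */
struct sketch_amd_power {
	uint64_t pwr_acc;	/* last PWR_ACC value seen by this event */
	uint64_t ptsc;		/* last PTSC value seen by this event */
};

/* Hypothetical stand-in for a perf event; not the kernel's perf_event. */
struct sketch_event {
	struct sketch_amd_power hw;
	uint64_t count;
};

/* On start, snapshot both counters for this event only. */
static void sketch_event_start(struct sketch_event *ev,
			       uint64_t acc_now, uint64_t ptsc_now)
{
	assert(ev != NULL);
	ev->hw.pwr_acc = acc_now;
	ev->hw.ptsc = ptsc_now;
}

/*
 * On read/stop, both deltas are taken against this event's own snapshots,
 * so a second event on the same CPU cannot skew the PTSC delta.
 */
static uint64_t sketch_event_update(struct sketch_event *ev,
				    uint64_t acc_now, uint64_t ptsc_now)
{
	uint64_t delta, tdelta;

	assert(ev != NULL);
	delta = acc_now - ev->hw.pwr_acc;
	tdelta = ptsc_now - ev->hw.ptsc;

	ev->hw.pwr_acc = acc_now;
	ev->hw.ptsc = ptsc_now;

	/* No time elapsed since the last snapshot: nothing to add. */
	if (tdelta == 0)
		return ev->count;

	ev->count += delta / tdelta;
	return ev->count;
}
```

With this layout, two events started at different times on the same CPU each
divide their own PWR_ACC delta by their own PTSC delta, which is the mismatch
the per-CPU cpu_sw_pwr_ptsc value could not handle.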
Thanks,
Rui