Message-Id: <201003161159.24424.trenn@suse.de>
Date:	Tue, 16 Mar 2010 11:59:24 +0100
From:	Thomas Renninger <trenn@...e.de>
To:	Robert Schöne <robert.schoene@...dresden.de>
Cc:	Arjan van de Ven <arjan@...ux.intel.com>,
	Dave Jones <davej@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"linux-kernel" <linux-kernel@...r.kernel.org>,
	cpufreq <cpufreq@...r.kernel.org>, x86@...nel.org
Subject: Re: [PATCH] trace power_frequency events on the correct cpu  (for Intel x86 CPUs)

On Tuesday 16 March 2010 08:13:48 Robert Schöne wrote:
> On Monday, 15.03.2010 at 11:51 +0100, Thomas Renninger wrote:
> > On Friday 12 March 2010 16:41:46 Robert Schöne wrote:
> > > On Friday, 12.03.2010 at 06:52 -0800, Arjan van de Ven wrote:
> > > > On 3/12/2010 5:17, Robert Schöne wrote:
> > > > > This patch fixes the following behaviour:
> > > > > Currently, the power_frequency event is reported for the cpu (core) which initiated the frequency change.
> > > > > It should be reported for the cpu that actually changes its frequency.
> > > > >
> > > > > Example: when using
> > > > > >   taskset -c 0 echo <new_frequency> > /sys/devices/system/cpu/cpu1/cpufreq/scaling_setspeed
> > > > > cpu 0 is traced, instead of cpu 1
> > > > >
> > > > > Signed of by Robert Schoene<robert.schoene@...dresden.de>
> > > > >
> > > > >
> > > > > diff --git a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
> > > > > index 1b1920f..0a47f10 100644
> > > > > --- a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
> > > > > +++ b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
> > > > > @@ -174,6 +174,7 @@ static void do_drv_write(void *_cmd)
> > > > >
> > > > >          switch (cmd->type) {
> > > > >          case SYSTEM_INTEL_MSR_CAPABLE:
> > > > > +               trace_power_frequency(POWER_PSTATE, cmd->val);
> > > > >                  rdmsr(cmd->addr.msr.reg, lo, hi);
> > > > >                  lo = (lo&  ~INTEL_MSR_RANGE) | (cmd->val&  INTEL_MSR_RANGE);
> > > > >                  wrmsr(cmd->addr.msr.reg, lo, hi);
> > > > > @@ -363,7 +364,6 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
> > > > >                  }
> > > > >          }
> > > > >
> > > > > -       trace_power_frequency(POWER_PSTATE, data->freq_table[next_state].frequency);
This is still wrong.
Before, the frequency was traced:
   data->freq_table[next_state].frequency
Now the control field is traced instead. This is an arbitrary value that must
be written to the HW (IO or MSR). It is pure luck that in the MSR case it
seems to be identical to the frequency (on this HW), but this need not be
the case:
   cmd.val = (u32) perf->states[next_perf_state].control


But something else:
What exactly is the power tracer good for, and what can it do
that cpufreq_stats cannot?

Besides the fact that it is an ugly macro you cannot grep for,
acpi-cpufreq really seems to be the only place it gets used in
the whole kernel:
grep trace_power_frequency * -rl
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c

Robert: If you want to get proper cpufreq tracing/statistics,
compile with:
CONFIG_CPU_FREQ_STAT=y
and do:
modprobe cpufreq_stats
cat /sys/devices/system/cpu/cpu*/cpufreq/stats/*
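
The raw time_in_state table can be turned into per-frequency percentages
with a few lines of shell. This is only a minimal sketch, assuming each line
has the form "<frequency in kHz> <time in 10 ms units>" as described in the
cpufreq-stats documentation. The function name summarize_time_in_state and
the sample numbers are made up for illustration:

```shell
#!/bin/sh
# Summarize a cpufreq_stats time_in_state table read from stdin.
# Assumed input format, one line per frequency:
#   <frequency in kHz> <time spent there, in 10 ms units>
summarize_time_in_state() {
    awk '
        # Collect every well-formed "<freq> <time>" pair and the total time.
        NF == 2 { n++; freq[n] = $1; ticks[n] = $2; total += $2 }
        END {
            # Print each frequency with its share of the total time.
            for (i = 1; i <= n; i++)
                printf "%d kHz: %.1f%%\n", freq[i], (total ? 100 * ticks[i] / total : 0)
        }
    '
}

# Hypothetical sample data, so the sketch also runs on machines
# where cpufreq_stats is not loaded.
printf '%s\n' '2400000 300' '1600000 100' '800000 600' | summarize_time_in_state
```

On a machine with cpufreq_stats active, feed it the real file instead, e.g.
summarize_time_in_state < /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state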

The patch below fixes the problem.
This time it is submitted to the right mailing list. It looks like
the trace_power_frequency stuff never hit the cpufreq list, and
even the maintainer wasn't CC'ed on any trace_power_frequency
submission.

For the trace people: To do it right, you have to hook
your trace function into cpufreq_stats. You also have
to pass the cpu on which the frequency change happened.

---
cpufreq: Remove broken trace_power_frequency

cpufreq_stats is used for frequency statistics and supports *all*
frequency switching drivers/HW.

The trace_power_frequency interface:
  - only supports one cpufreq driver (acpi-cpufreq)
  - has no additional capabilities compared to cpufreq_stats
  - is broken and traces wrong CPUs on frequency switches
    (compare the mail thread:
    trace power_frequency events on the correct cpu
    on the cpufreq@...r.kernel.org list)

Signed-off-by: Thomas Renninger <trenn@...e.de>

diff --git a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
index 1b1920f..1808284 100644
--- a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
+++ b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
@@ -33,7 +33,6 @@
 #include <linux/cpufreq.h>
 #include <linux/compiler.h>
 #include <linux/dmi.h>
-#include <trace/events/power.h>
 
 #include <linux/acpi.h>
 #include <linux/io.h>
@@ -363,8 +362,6 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
 		}
 	}
 
-	trace_power_frequency(POWER_PSTATE, data->freq_table[next_state].frequency);
-
 	switch (data->cpu_feature) {
 	case SYSTEM_INTEL_MSR_CAPABLE:
 		cmd.type = SYSTEM_INTEL_MSR_CAPABLE;
diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index c4efe9b..82b2b99 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -42,13 +42,6 @@ DEFINE_EVENT(power, power_start,
 	TP_ARGS(type, state)
 );
 
-DEFINE_EVENT(power, power_frequency,
-
-	TP_PROTO(unsigned int type, unsigned int state),
-
-	TP_ARGS(type, state)
-);
-
 TRACE_EVENT(power_end,
 
 	TP_PROTO(int dummy),
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index 9f4f565..705d926 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -13,6 +13,3 @@
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/power.h>
-
-EXPORT_TRACEPOINT_SYMBOL_GPL(power_frequency);
-