[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1351097507.23327.78.camel@hornet>
Date: Wed, 24 Oct 2012 17:51:47 +0100
From: Pawel Moll <pawel.moll@....com>
To: Thomas Renninger <trenn@...e.de>
Cc: Amit Daniel Kachhap <amit.kachhap@...aro.org>,
Zhang Rui <rui.zhang@...el.com>,
Viresh Kumar <viresh.kumar@...aro.org>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Jean Delvare <khali@...ux-fr.org>,
Guenter Roeck <linux@...ck-us.net>,
Steven Rostedt <rostedt@...dmis.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Ingo Molnar <mingo@...e.hu>, Jesper Juhl <jj@...osbits.net>,
Jean Pihet <jean.pihet@...oldbits.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"lm-sensors@...sensors.org" <lm-sensors@...sensors.org>,
"linaro-dev@...ts.linaro.org" <linaro-dev@...ts.linaro.org>
Subject: Re: [RFC] Energy/power monitoring within the kernel
On Wed, 2012-10-24 at 01:40 +0100, Thomas Renninger wrote:
> > More and more of people are getting interested in the subject of power
> > (energy) consumption monitoring. We have some external tools like
> > "battery simulators", energy probes etc., but some targets can measure
> > their power usage on their own.
> >
> > Traditionally such data should be exposed to the user via hwmon sysfs
> > interface, and that's exactly what I did for "my" platform - I have
> > a /sys/class/hwmon/hwmon*/device/energy*_input and this was good
> > enough to draw pretty graphs in userspace. Everyone was happy...
> >
> > Now I am getting new requests to do more with this data. In particular
> > I'm asked how to add such information to ftrace/perf output.
> Why? What is the gain?
>
> Perf events can be triggered at any point in the kernel.
> A cpufreq event is triggered when the frequency gets changed.
> CPU idle events are triggered when the kernel requests to enter an idle state
> or exits one.
>
> When would you trigger a thermal or a power event?
> There is the possibility of (critical) thermal limits.
> But if I understand this correctly you want this for debugging and
> I guess you have everything interesting one can do with temperature
> values:
> - read the temperature
> - draw some nice graphs from the results
>
> Hm, I guess I know what you want to do:
> In your temperature/energy graph, you want to have some dots
> when relevant HW states (frequency, sleep states, DDR power,...)
> changed. Then you are able to see the effects over a timeline.
>
> So you have to bring the existing frequency/idle perf events together
> with temperature readings
>
> Cleanest solution could be to enhance the exisiting userspace apps
> (pytimechart/perf timechart) and let them add another line
> (temperature/energy), but the data would not come from perf, but
> from sysfs/hwmon.
> Not sure whether this works out with the timechart tools.
> Anyway, this sounds like a userspace only problem.
Ok, so it is actually what I'm working on right now. Not with the
standard perf tool (there are other users of that API ;-) but indeed I'm
trying to "enrich" the data stream coming from kernel with user-space
originating values. I am a little bit concerned about effect of extra
syscalls (accessing the value and gettimeofday to generate a timestamp)
at a higher sampling rates, but most likely it won't be a problem. Can
report once I know more, if this is of interest to anyone.
Anyway, there are at least two debug/trace related use cases that can
not be satisfied that way (of course one could argue about their
usefulness):
1. ftrace-over-network (https://lwn.net/Articles/410200/) which is
particularly appealing for "embedded users", where there's virtually no
useful userspace available (think Android). Here a (functional) trace
event is embedded into a normal trace and available "for free" at the
host side.
2. perf groups - the general idea is that one event (let it be cycle
counter interrupt or even a timer) triggers read of other values (eg.
cache counter or - in this case - energy counter). The aim is to have a
regular "snapshots" of the system state. I'm not sure if the standard
perf tool can do this, but I do :-)
And last, but not least, there are the non-debug/trace clients for
energy data as discussed in other mails in this thread. Of course the
trace event won't really satisfy their needs either.
Thanks for your feedback!
Paweł
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists