lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 29 Nov 2006 14:18:53 -0800
From:	Stephane Eranian <eranian@....hp.com>
To:	Andrew Morton <akpm@...l.org>
Cc:	Christoph Hellwig <hch@...radead.org>,
	linux-kernel@...r.kernel.org, ak@...e.de,
	Stephane Eranian <eranian@....hp.com>
Subject: Re: [PATCH] i386 add idle notifier

Hello,

On Wed, Nov 29, 2006 at 01:09:44PM -0800, Andrew Morton wrote:
> On Wed, 29 Nov 2006 17:09:39 +0000
> Christoph Hellwig <hch@...radead.org> wrote:
> 
> > On Wed, Nov 29, 2006 at 08:25:40AM -0800, Stephane Eranian wrote:
> > > Hello,
> > > 
> > > Here is a patch that adds an idle notifier to the i386 tree.
> > > The idle notifier functionalities and implementation are
> > > identical to the x86_64 idle notifier. We use the idle notifier
> > > in the context of perfmon.
> > > 
> > > The patch is against Andi Kleen's x86_64-2.6.19-rc6-061128-1.bz2
> > > kernel. It may apply to other kernels but it needs some updates
> > > to poll_idle() and default_idle() to work correctly.
> > 
> > Walking through a notifier chain on every single interrupt (including
> > timer interrupts) seems rather costly.  What do you need this for
> > exactly?
> 
Let me give you the background on this.

In system-wide mode, perfmon wants to exclude useless kernel execution
from being monitored. That execution is performed by the idle thread
when it enters its lowest loop level, i.e., poll_idle(), defaut_idle(),
mwait_idle(). It used to be that the idle loop was simply looping, waiting
for an interrupt or polling a variable. These days, it is a bit more
complex because on many processors, idle means going to a lower power state.
We want to capture useful idle thread execution such as interrupt servicing
but we want to exclude polling and low-power state.

When entering low-power, some processors systematically turn off
performance counters. But this is not necessarily the case for every processors.
Worse, for some processors it depends on the events being measured. Yet, at the
perfmon interface level, we want to guarantee a uniform behavior across architectures.
As such, we have to use software to enforce our policy.

Andi pointed out that there is an idle notifier in the x86-64 tree, and that it
could be a way to achieve our goal. 

I would tend to agree with both of you that the way the notifier is implemented may be a little
bit overkill for what we need it for. In my latest patch, I made sure that we register/unregister
an idle notification routine only when necessary. Yet, it is true that on each interrupt received
by the idle thread you go through enter_idle()/exit_idle() and atomic_call_chain().

In the current implementation, registration is global, call are obviously made for one CPU and
a RCU read lock is used, so multiple calls from different CPUs can be issued at the same time.

AFAICT, perfmon is the only user of the idle notifier. Perfmon operates on a per-cpu basis in system
wide. As such, I think we could simplify it by making the registration/call mechanism completly a
per-CPU construct. This would simplify the locking aspect. We would still need the test_and_set
to avoid a race with interrupts.

> yes, it's a worry.
> 
> Why doesn't enter_idle() do the test_and_set_bit() thing, like
> exit_idle()?

I think we could use the same test_and_set_bit(). In fact, there is  only
one place where we call enter_idle(), so just like x86-64 we could just
replace this with
	__get_cpu_var(idle_state) = 1;

-- 
-Stephane
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ