Message-ID: <20090520172114.GA32078@dirshya.in.ibm.com>
Date:	Wed, 20 May 2009 22:51:14 +0530
From:	Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>
To:	Len Brown <lenb@...nel.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Shaohua Li <shaohua.li@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	"menage@...gle.com" <menage@...gle.com>
Subject: Re: [PATCH]cpuset: add new API to change cpuset top group's cpus

* Len Brown <lenb@...nel.org> [2009-05-19 15:01:46]:

> > ... the point is, we
> > don't need a new interface to force a cpu idle. Hotplug does that.
> >
> > Furthermore, we should not want anything outside of that, either the cpu
> > is there available for work, or it's not -- halfway measures don't make
> > sense.
> > 
> > Furthermore, we already have power aware scheduling which tries to
> > aggregate idle time on cpu/core/packages so as to maximize the idle time
> > power savings. Use it there.
> 
> Some context...
> 
> In the past, server room power and thermal issues were handled
> either by spending too much money to provision power and
> thermals for theoretical worst case, or by abruptly shutting off
> servers when hard limits were reached.
> 
> Going forward, platforms are getting smarter, measuring how
> much power is drawn from the power supply, measuring the room
> thermals etc. so that real dollars can be saved by deploying
> systems that exceed the theoretical worst case if the power
> and thermal limits are enforced.
> 
> So if a server approaches its budget, the platform
> will notify the OS to limit its P-states, and limit its T-states
> in order to draw less power.
> 
> If that is not sufficient, the platform will ask us to take
> processors off-line.  These are not processors that are otherwise idle
> -- those are already saving as much power as they can --
> these are processors that are fully utilized.
> 
> So power-aware scheduling is moot here, this isn't the
> partially idle case, this is the fully utilized case.

Hi Len,

Over and above power-aware scheduling, we have been exploring the
possibility of forcibly idling cpus for power savings.  This is mostly
useful in the thermal case that you have mentioned, and also to provide
fine-grained power vs performance trade-offs.  Creating idle time and
consolidating it efficiently in order to evacuate cores and packages
provides a framework to exploit C-States in addition to the P-States
and T-States that you have mentioned above.  Adding C-State control
to save power and heat may let the system execute more instructions
under a given power/thermal constraint.

Reference: http://lkml.org/lkml/2009/5/13/173
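
To make the consolidation idea concrete, here is a toy sketch in Python.
It is an illustration only, not kernel code and not what the scheduler
actually does.  The function name `consolidate`, its parameters, and the
first-fit-decreasing packing are all assumptions made for this example:
tasks are packed onto the lowest-numbered cpus so that whole packages end
up with no load and could, in principle, be sent to a deep C-State.

```python
def consolidate(task_loads, n_cpus, cpus_per_package, capacity=1.0):
    """Pack task loads onto as few cpus as possible (first-fit decreasing).

    Hypothetical helper for illustration; returns (placement, idle_packages)
    where placement maps task index -> cpu index.
    """
    cpu_load = [0.0] * n_cpus
    placement = {}
    # Place the heaviest tasks first, always trying the lowest cpu first.
    for task, load in sorted(enumerate(task_loads), key=lambda t: -t[1]):
        for cpu in range(n_cpus):
            if cpu_load[cpu] + load <= capacity:
                cpu_load[cpu] += load
                placement[task] = cpu
                break
        else:
            raise ValueError("not enough cpu capacity for task %d" % task)
    # A package is evacuated when every cpu in it carries no load.
    n_packages = n_cpus // cpus_per_package
    idle_packages = [
        p for p in range(n_packages)
        if all(cpu_load[c] == 0.0
               for c in range(p * cpus_per_package, (p + 1) * cpus_per_package))
    ]
    return placement, idle_packages


placement, idle = consolidate([0.5, 0.25, 0.25, 0.5], n_cpus=4, cpus_per_package=2)
```

With four tasks that together fit on two cpus, the second package stays
empty in this model.  A real implementation would also have to weigh
cache and memory locality, which this sketch ignores.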
 
> If power draw continues to be too high, the platform
> will simply ask us to take more processors off line.
> 
> If this dance doesn't reduce power below that required,
> the platform will be shut off.
> 
> So it is sufficient to simply not schedule cpu burners
> on the 'idled' processor.  Interrupts should generally
> not matter -- and if they do, we'll end up simply idling
> an additional processor.

The requirements and use cases are clear.
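
As I read it, the escalation described above can be modelled as a small
state machine.  This is a toy Python model and nothing more: the `Server`
class, the `escalate` function, and the decision to treat the P-state and
T-state limits as a single step are assumptions made for illustration,
not a description of any real platform firmware.

```python
from dataclasses import dataclass


@dataclass
class Server:
    online_cpus: int
    throttled: bool = False  # P-state and T-state limits already applied
    powered: bool = True


def escalate(server):
    """Take one escalation step while the server is still over budget."""
    if not server.throttled:
        # First response: ask the OS to limit P-states and T-states.
        server.throttled = True
        return "limit P-states and T-states"
    if server.online_cpus > 1:
        # Still too much power: ask the OS to idle one more processor.
        server.online_cpus -= 1
        return "idle one more cpu"
    # Nothing left to idle: the platform shuts the server off.
    server.powered = False
    return "shut off"


s = Server(online_cpus=2)
steps = [escalate(s) for _ in range(3)]
```

Each call corresponds to one round of "power draw is still too high";
the model stops at shutdown because nothing further can be done.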

> > > > Besides, a hot-removed cpu will do a dead-loop halt, which isn't power
> > > > efficient. Making a hot-removed cpu enter a deep C-state has been on the
> > > > wish list for a long time, but is still not available. The
> > > > acpi_processor_idle is a module, and the cpuidle governor potentially
> > > > can't handle an offline cpu.
> > > 
> > > Then fix that hot-unplug idle loop. I agree that the hlt thing is silly,
> > > and I've no idea why it's still there, seems like a much better candidate
> > > for your efforts than this.
> 
> CONFIG_HOTPLUG_CPU has been problematic in the past.
> It does more than what we need here, so we thought
> a lighter-weight and lower-latency method that simply
> didn't schedule to the idled cpu would suffice.
> 
> Personally, I don't think that CONFIG_HOTPLUG_CPU should exist,
> taking processors on and off-line should be part of CONFIG_SMP.
> 
> A while back when I selected CONFIG_HOTPLUG_CPU from ACPI && SMP,
> there was a torrent of outrage that it infringed on users' rights
> to save that additional 18KB of memory that CONFIG_HOTPLUG_CPU
> includes that SMP does not...
> 
> We are fixing the hot-unplug idle loop, but there
> turn out to be some issues with it related to idle
> processors with interrupts disabled that don't actually
> get down into the deep C-states we request:-(

Fixing the hot-unplug idle loop will help us use the cpu-hotplug
infrastructure for many other purposes, such as power/thermal
management.  Do you think there could be some workaround/solution for
this in the short term?
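
For reference, this is roughly how userspace drives cpu-hotplug today
through the sysfs `online` file.  It is a minimal Python sketch, not part
of the patch under discussion: the helper names `set_cpu_online` and
`cpu_is_online` are made up for this example, and the `sysfs_root`
parameter exists only so the code can be pointed at a scratch directory
instead of the real /sys/devices/system/cpu.

```python
from pathlib import Path

SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def _online_file(cpu, sysfs_root):
    # Per-cpu control file, e.g. /sys/devices/system/cpu/cpu1/online
    return Path(sysfs_root) / ("cpu%d" % cpu) / "online"


def set_cpu_online(cpu, online, sysfs_root=SYSFS_CPU_ROOT):
    """Write 1 or 0 to the cpu's online file (needs root on a real system)."""
    _online_file(cpu, sysfs_root).write_text("1" if online else "0")


def cpu_is_online(cpu, sysfs_root=SYSFS_CPU_ROOT):
    """Return True if the cpu's online file currently reads 1."""
    return _online_file(cpu, sysfs_root).read_text().strip() == "1"
```

On a real system, `set_cpu_online(1, False)` would take cpu1 through the
full hot-unplug path, which is exactly the heavier-weight operation being
discussed here.  Note that some cpus (typically cpu0 on x86) may not
expose an `online` file at all.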

> So this is why you see a patch for a "halfway measure",
> it does what is necessary, and does nothing more.

Peter had detailed comments on this aspect.

--Vaidy
