lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 24 Jul 2014 09:26:09 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	"Rafael J. Wysocki" <rjw@...ysocki.net>
Cc:	Morten Rasmussen <morten.rasmussen@....com>,
	linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
	mingo@...nel.org, vincent.guittot@...aro.org,
	daniel.lezcano@...aro.org, preeti@...ux.vnet.ibm.com,
	Dietmar.Eggemann@....com, pjt@...gle.com
Subject: Re: [RFCv2 PATCH 01/23] sched: Documentation for scheduler energy
 cost model

On Thu, Jul 24, 2014 at 02:53:20AM +0200, Rafael J. Wysocki wrote:
> I am used to slightly different terminology here.  Namely, there are voltage
> domains (parts sharing a voltage rail or a voltage regulator, such that you
> can only apply/remove/change voltage to all of them at the same time) and clock
> domains (analogously, but for clocks).  A power domain (which in your description
> above seems to correspond to a voltage domain) may be a voltage domain, a clock
> domain or a combination thereof.
> 
> In addition to that, in a voltage domain it may be possible to apply many
> different levels of voltage, which case doesn't seem to be covered at all by
> the above (or I'm missing something).
> 
> Also a P-state is not just a frequency level, but a combination of frequency
> and voltage that has to be applied for that frequency to be stable.  You may
> regard them as Operation Performance Points of the CPU, but that very well may
> go beyond frequencies and voltages.  Thus it actually is better not to talk
> about P-states as "frequencies".
> 
> Now, P-states may or may not have to be coordinated between all CPUs in a
> package (cluster), by hardware or software, such that all CPUs in a cluster
> need to be kept in the same P-state.  That you can regard as a "P-state
> domain", but it usually means a specific combination of voltage and frequency.

I think Morton is aware of this, but for the sake of sanity dropped the
whole lot into something simpler (while hoping reality would not ruin
his life).

> C-states in turn are states in which CPUs don't execute instructions.
> That need not mean the removal of voltage or even frequency from them.
> Of course, they do mean some sort of power draw reduction, but that may
> be achieved in many different ways.  Some C-states require coordination
> too (for example, a single C-state may apply to a whole package or cluster
> at the same time) and you can think about "domains" here too, but there
> need not be a direct mapping to physical parameters such as the frequency
> or the voltage.

One thing that wasn't clear to me is if you allow for C-domain and
P-domain to overlap or if they're always inclusive (where one is wholly
contained in the other).

> Moreover, P-states and C-states may overlap.  That is, a CPU may be in Px
> and Cy at the same time, which means that after leaving Cy it will execute
> instructions in Px.  Things like leakage may depend on x in that case and
> the total power draw may depend on the combination of x and y.

Right, and I suppose the domain thing makes it impossible to drop to the
lowest P state on going idle. Tricky that.

> The concern is that if a scaling governor is running in parallel with the above
> algorithm and it has its own utilization goal (it usually does), it may change
> the P-state under you to match that utilization goal and you'll end up with
> something different from what you expected.
> 
> That may be addressed either by trying to predict what the scaling governor will
> do (and good luck with that) or by taking care of P-states by yourself.  The
> latter would require changes to the algorithm I think, though.

The idea was that we'll do P states ourselves based on these utilization
figures. If we find we cannot fit the 'new' task into the current set
without either raising P or waking an idle cpu (if at all available), we
compute the cost of either option and pick the cheapest.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists