Message-id: <alpine.LFD.2.02.1103250330470.32565@x980>
Date:	Fri, 25 Mar 2011 04:12:03 -0400 (EDT)
From:	Len Brown <lenb@...nel.org>
To:	Trinabh Gupta <trinabh@...ux.vnet.ibm.com>
Cc:	Arjan van de Ven <arjan@...ux.intel.com>, peterz@...radead.org,
	suresh.b.siddha@...el.com, benh@...nel.crashing.org,
	venki@...gle.com, Andi Kleen <ak@...ux.intel.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH V1 1/2] cpuidle: Data structure changes for global
 cpuidle device

I agree it is silly to allocate a cpuidle_device
for every cpu in the system as we do today.

Yes, splitting the counters out of cpuidle_device
is a necessary part of fixing that.

However, cpuidle_device.cpuidle_state[] is currently not per-driver,
it is per-cpu, and it is writable.

In particular, the cpuidle_device->prepare() mechanism
causes updates to the cpuidle_state[].flags,
setting and clearing CPUIDLE_FLAG_IGNORE to
tell the governor not to choose a state
on a per-cpu basis at run-time.
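
To make the mechanism concrete, here is a minimal userspace sketch of it, not the kernel code. Every `model_*` / `MODEL_*` name is hypothetical, `MODEL_FLAG_IGNORE` merely stands in for CPUIDLE_FLAG_IGNORE, and the rule that states 2 and up count as "deep" is an assumption made for illustration. It shows why the state table has to be writable and per-cpu for as long as prepare() works this way.

```c
/* Simplified userspace model; not the kernel's actual definitions. */
#define MODEL_FLAG_IGNORE 0x1u   /* stands in for CPUIDLE_FLAG_IGNORE */
#define MODEL_STATE_MAX   4

struct model_state {
    unsigned int flags;
    unsigned int exit_latency_us;
};

/* One copy per CPU, which is why the flags can differ between CPUs. */
struct model_device {
    struct model_state states[MODEL_STATE_MAX];
    int state_count;
    int deep_state_unusable;   /* hypothetical per-CPU run-time condition */
};

/* prepare(): rewrites the flags in this CPU's private state table. */
static void model_prepare(struct model_device *dev)
{
    int i;

    /* Assumption for the sketch: states 2 and up are the "deep" ones. */
    for (i = 2; i < dev->state_count; i++) {
        if (dev->deep_state_unusable)
            dev->states[i].flags |= MODEL_FLAG_IGNORE;
        else
            dev->states[i].flags &= ~MODEL_FLAG_IGNORE;
    }
}

/* Governor: deepest state within the latency limit that is not ignored. */
static int model_select(struct model_device *dev, unsigned int latency_limit_us)
{
    int i, best = 0;

    model_prepare(dev);
    for (i = 1; i < dev->state_count; i++) {
        if (dev->states[i].flags & MODEL_FLAG_IGNORE)
            continue;
        if (dev->states[i].exit_latency_us > latency_limit_us)
            continue;
        best = i;
    }
    return best;
}
```

Since model_prepare() writes into dev->states[], two CPUs sharing one table would overwrite each other's flags.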

I don't like that mechanism.
I'd like to see it replaced, and when replaced,
cpuidle_state[] can be per system-wide driver.

I think the real problem that prepare() was trying to solve
is that the driver today does not have the ability to overrule
the choice made by the governor.  The driver may discover
in the course of trying to satisfy the request of the governor
that it needs to demote to a shallower state; or it may
do its best to satisfy the governor's request, and the hardware
may demote its request to a shallower state.

Unfortunately, when this happens, the driver dutifully
returns the time spent in the state to cpuidle_idle_call(),
which then updates the wrong last_residency, time, and usage counters.
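
A rough userspace model of that accounting problem follows. All names are hypothetical, the `demote_to` parameter is only a test hook to simulate a demotion, and the residency value is made up. The point is just that the caller knows only the state it requested, so that is the state it charges.

```c
/* Simplified userspace model; not the kernel's actual definitions. */
#define MODEL_STATE_MAX 4

struct model_counters {
    unsigned long long usage;
    unsigned long long time_us;
};

struct model_device {
    struct model_counters counters[MODEL_STATE_MAX];
    int last_residency_us;
    int last_entered;   /* kept only so the example can show the mismatch */
};

/*
 * Driver enter routine: asked for `requested`, it may end up in a
 * shallower state (`demote_to`, or -1 for no demotion). It reports
 * only the time spent, not the state it actually entered.
 */
static int model_enter(struct model_device *dev, int requested, int demote_to)
{
    int actual = (demote_to >= 0 && demote_to < requested) ? demote_to
                                                            : requested;

    dev->last_entered = actual;
    return 50 * (actual + 1);   /* made-up residency in microseconds */
}

/* Generic caller: charges whatever state the governor asked for. */
static void model_idle_call(struct model_device *dev, int requested,
                            int demote_to)
{
    int residency = model_enter(dev, requested, demote_to);

    dev->last_residency_us = residency;
    dev->counters[requested].usage++;             /* wrong after a demotion */
    dev->counters[requested].time_us += residency;
}
```

With a request for state 3 demoted to state 1, the usage and time land on state 3 while state 1 records nothing.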

It sure is ironic for the driver to allocate the data structures and
then hand the time to the upper layer, just to have the upper layer
update the wrong data structures...

Surely the driver enter routine should update the counters
that the driver was obligated to allocate, and it should return
the state actually entered (for tracing), rather than the time spent
there.
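
One possible shape for such an enter routine, again as a userspace sketch rather than a proposed patch: the names, the `demote_to` test hook, and the made-up residency value are all assumptions for illustration. The driver charges the state it actually entered and returns that state's index; the generic layer does no accounting.

```c
/* Simplified userspace model; not the kernel's actual definitions. */
#define MODEL_STATE_MAX 4

struct model_counters {
    unsigned long long usage;
    unsigned long long time_us;
};

/* Counters live with the driver that allocated them. */
struct model_driver_data {
    struct model_counters counters[MODEL_STATE_MAX];
};

/*
 * Driver enter routine: updates the counters of the state it really
 * entered and returns that state's index instead of the time spent.
 */
static int model_enter(struct model_driver_data *drv, int requested,
                       int demote_to)
{
    int actual = (demote_to >= 0 && demote_to < requested) ? demote_to
                                                            : requested;
    unsigned long long residency = 50ull * (actual + 1);   /* made-up us */

    drv->counters[actual].usage++;
    drv->counters[actual].time_us += residency;
    return actual;
}

/* Generic layer: no accounting, just passes the index on for tracing. */
static int model_idle_call(struct model_driver_data *drv, int requested,
                           int demote_to)
{
    return model_enter(drv, requested, demote_to);
}
```

Here a request for state 3 that is demoted to state 1 is counted against state 1, and the caller sees index 1.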

The generic cpuidle code should simply handle where the counters live
in the sysfs namespace, not update the counters itself.

This needs to be addressed before cpuidle_device.cpuidle_state[]
can be made one per system.

cheers,
Len Brown, Intel Open Source Technology Center
