Date:	Wed, 20 May 2015 15:21:24 +0200
From:	Sascha Hauer <s.hauer@...gutronix.de>
To:	Mikko Perttunen <mikko.perttunen@...si.fi>
Cc:	Stephen Warren <swarren@...dotorg.org>, linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Eduardo Valentin <edubezval@...il.com>,
	linux-mediatek@...ts.infradead.org, kernel@...gutronix.de,
	Zhang Rui <rui.zhang@...el.com>,
	Brian Norris <computersforpeace@...il.com>,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 11/15] thermal: thermal: Add support for hardware-tracked
 trip points

On Tue, May 19, 2015 at 05:05:29PM +0300, Mikko Perttunen wrote:
> On 05/19/15 16:58, Sascha Hauer wrote:
> >On Mon, May 18, 2015 at 02:09:44PM +0200, Sascha Hauer wrote:
> >>Hi Mikko,
> >>
> >>On Mon, May 18, 2015 at 12:06:50PM +0300, Mikko Perttunen wrote:
> >>>>+	for (i = 0; i < tz->trips; i++) {
> >>>>+		int trip_low;
> >>>>+
> >>>>+		tz->ops->get_trip_temp(tz, i, &trip_temp);
> >>>>+		tz->ops->get_trip_hyst(tz, i, &hysteresis);
> >>>>+
> >>>>+		trip_low = trip_temp - hysteresis;
> >>>>+
> >>>>+		if (trip_low < temp && trip_low > low)
> >>>>+			low = trip_low;
> >>>>+
> >>>>+		if (trip_temp > temp && trip_temp < high)
> >>>>+			high = trip_temp;
> >>>>+	}
> >>>>+
> >>>>+	tz->prev_low_trip = low;
> >>>>+	tz->prev_high_trip = high;
> >>>>+
> >>>>+	dev_dbg(&tz->device, "new temperature boundaries: %d < x < %d\n",
> >>>>+			low, high);
> >>>>+
> >>>>+	tz->ops->set_trips(tz, low, high);
> >>>
> >>>This should probably do something if set_trips returns an error
> >>>code; at least an error message, perhaps enable polling? I'm not
> >>>exactly sure what safety features the thermal framework has in
> >>>general if errors happen..
> >>
> >>Currently a thermal zone has the passive_delay and polling_delay
> >>variables. If these are nonzero the thermal core will always poll. A
> >>purely interrupt driven thermal zone would set these values to zero.
> >>In this case the thermal core has no basis for polling, so we would
> >>have to make up polling intervals when set_trips fails. Another
> >>possibility would be to interpret the *_delay variables as 'when
> >>set_trips is available, do not poll. When something goes wrong, use
> >>*_delay as polling intervals'
> >>
> >>>
> >>>One interesting thing I noticed was that at least the bang-bang
> >>>governor only acts if the temperature is properly smaller than (trip
> >>>temp - hysteresis). So perhaps we should specify the non-tripping
> >>>range as [low, high)? Or we could change bang-bang.
> >>
> >>I wonder how we can protect against such off-by-one errors anyway.
> >>Generally, hardware might operate on raw values rather than directly
> >>on temperature values in °C. This means a driver for it must have
> >>celsius_to_raw and raw_to_celsius conversion functions. Now it can
> >>happen that due to rounding errors celsius_to_raw(Tcrit) returns a raw
> >>value that when converted back to celsius is different from the
> >>original value in °C. This would mean the hardware triggers an interrupt
> >>for a trip point and the thermal core does not react because get_temp
> >>actually returns a different temperature than previously programmed as
> >>interrupt trigger. This way we would lose hot (or cold) events.
> >
> >As a simple example we could imagine a 12-bit ADC which has:
> >
> >u32 mcelsius_to_raw(int temp)
> >{
> >	return temp / 30;
> >}
> >
> >int raw_to_mcelsius(u32 raw)
> >{
> >	return raw * 30;
> >}
> >
> >Now if the thermal framework requests an interrupt at 77000mC, the
> >raw value would be 77000 / 30 = 2566.666667, which integer division
> >truncates to 2566. When the interrupt triggers at exactly this raw
> >value, we would convert it back to 2566 * 30 = 76980. The thermal
> >framework would see that this is below the threshold, do nothing and
> >go back to sleep.
> >I am beginning to think that implementing interrupts like this is not a
> >good idea; at least I found no convenient way out of this situation.
> 
> Couldn't you just specify that the driver should do the best it can?
> That is, in this case, the driver would program the hardware for the
> least possible value x for which raw_to_mcelsius(x) >= 77000.

That's what I did now.
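
For illustration only, a minimal sketch of that rounding, assuming the
made-up 30 mC per LSB ADC from the example above. This is not the code
from the patch; MC_PER_LSB and the two *_high/*_low helper names are
hypothetical, and non-negative temperatures are assumed:

```c
#include <stdint.h>

/* Assumed resolution of the example ADC, in millicelsius per raw step. */
#define MC_PER_LSB 30

/* Convert a raw ADC reading to millicelsius, as in the example above. */
static int raw_to_mcelsius(uint32_t raw)
{
	return (int)raw * MC_PER_LSB;
}

/*
 * High threshold: smallest raw value x for which
 * raw_to_mcelsius(x) >= temp, i.e. division rounded up.
 * Assumes temp >= 0.
 */
static uint32_t mcelsius_to_raw_high(int temp)
{
	return ((uint32_t)temp + MC_PER_LSB - 1) / MC_PER_LSB;
}

/*
 * Low threshold: largest raw value x for which
 * raw_to_mcelsius(x) <= temp, i.e. division rounded down.
 * Assumes temp >= 0.
 */
static uint32_t mcelsius_to_raw_low(int temp)
{
	return (uint32_t)temp / MC_PER_LSB;
}
```

With these, a requested high trip of 77000mC is programmed as raw 2567,
which reads back as 77010mC, so the temperature reported when the
interrupt fires is never below the trip point that was requested.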

Sascha


-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
