Message-ID: <20150518184433.GS11598@ld-irv-0074>
Date:	Mon, 18 May 2015 11:44:33 -0700
From:	Brian Norris <computersforpeace@...il.com>
To:	Sascha Hauer <s.hauer@...gutronix.de>
Cc:	Mikko Perttunen <mikko.perttunen@...si.fi>,
	linux-pm@...r.kernel.org, Zhang Rui <rui.zhang@...el.com>,
	Eduardo Valentin <edubezval@...il.com>,
	linux-kernel@...r.kernel.org,
	Stephen Warren <swarren@...dotorg.org>, kernel@...gutronix.de,
	linux-mediatek@...ts.infradead.org,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 11/15] thermal: thermal: Add support for hardware-tracked
 trip points

On Mon, May 18, 2015 at 02:09:44PM +0200, Sascha Hauer wrote:
> On Mon, May 18, 2015 at 12:06:50PM +0300, Mikko Perttunen wrote:
> > One interesting thing I noticed was that at least the bang-bang
> > governor only acts if the temperature is properly smaller than (trip
> > temp - hysteresis). So perhaps we should specify the non-tripping
> > range as [low, high)? Or we could change bang-bang.
> 
> I wonder how we can protect against such off-by-one errors anyway.
> Generally a hardware might operate on raw values rather than directly
> in temperature values in °C. This means a driver for this must have
> celsius_to_raw and raw_to_celsius conversion functions. Now it can
> happen that due to rounding errors celsius_to_raw(Tcrit) returns a raw
> value that when converted back to celsius is different from the
> original value in °C. This would mean the hardware triggers an interrupt
> for a trip point and the thermal core does not react because get_temp
> actually returns a different temperature than previously programmed as
> interrupt trigger. This way we would lose hot (or cold) events.
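
A minimal sketch of the round-trip problem described above. The 500 m°C-per-count resolution is a hypothetical value, and the helpers here take millidegrees Celsius and truncate on integer division; a real driver's conversion will differ.

```c
#include <stdbool.h>

/* Hypothetical sensor resolution: one raw count per 500 millidegrees C. */
#define MC_PER_LSB 500

/* Illustrative conversion helpers; integer division truncates. */
static int celsius_to_raw(int temp_mc)
{
	return temp_mc / MC_PER_LSB;
}

static int raw_to_celsius(int raw)
{
	return raw * MC_PER_LSB;
}

/* True if a trip temperature survives being programmed and read back. */
static bool trip_round_trips(int trip_mc)
{
	return raw_to_celsius(celsius_to_raw(trip_mc)) == trip_mc;
}

/*
 * Models the lost event: the hardware fires exactly at the programmed
 * raw threshold, and the core then compares the converted reading
 * against the original trip temperature.
 */
static bool core_sees_hot_trip(int trip_mc)
{
	int reading_mc = raw_to_celsius(celsius_to_raw(trip_mc));

	return reading_mc >= trip_mc;
}
```

With these assumed numbers, a trip at 95000 m°C round-trips exactly, while a trip at 95300 m°C is programmed as raw count 190 and reads back as 95000 m°C, so the core would not consider the trip crossed.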

This also highlights another fact: there's a race between interrupt
generation and temperature reading (->get_temp()). I would expect any
interrupt-capable hardware thermal sensor to also have a latched
temperature reading that corresponds to the interrupt, and there would
be no guarantee that this latched temperature matches the polled reading
seen once you reach thermal_zone_device_update(). So a hardware driver
might report a thermal update, but the temperature reported to the core
won't necessarily match what the interrupt was meant for.

I have a patch that adds a thermal_zone_device_update_temp() API, so
drivers can report the temperature along with the interrupt
notification. (Such a patch also lets the driver choose to round down
on cold events and up on hot events, resolving your rounding issue
too.)
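
As a rough illustration of the direction-aware rounding idea, here is a sketch of how a driver might turn a latched raw reading into the temperature it reports with the event. The signature of thermal_zone_device_update_temp() is not shown in this message, so the sketch stops at computing the value; the enum, the helper name, and the 500 m°C-per-count resolution are all hypothetical, and raw counts are assumed non-negative.

```c
/* Hypothetical sensor resolution: one raw count per 500 millidegrees C. */
#define MC_PER_LSB 500

/* Which kind of threshold interrupt latched the reading (illustrative). */
enum trip_dir {
	TRIP_COLD,
	TRIP_HOT,
};

/*
 * Convert the raw value latched at interrupt time, not a fresh poll.
 * One raw count covers a MC_PER_LSB-wide range of temperatures, so
 * report the top of that range for hot events and the bottom for cold
 * events. That way the reported value cannot land on the wrong side of
 * a trip temperature that was truncated when it was programmed.
 */
static int latched_raw_to_mc(int raw, enum trip_dir dir)
{
	int low_mc = raw * MC_PER_LSB;

	if (dir == TRIP_HOT)
		return low_mc + MC_PER_LSB - 1;

	return low_mc;
}
```

Under these assumptions, a hot trip at 95300 m°C programmed as raw count 190 would be reported as 95499 m°C, which is at or above the trip, and a cold trip at 90300 m°C programmed as raw count 180 would be reported as 90000 m°C, which is at or below it.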

Brian