Message-Id: <200912021607.53237.trenn@suse.de>
Date:	Wed, 2 Dec 2009 16:07:52 +0100
From:	Thomas Renninger <trenn@...e.de>
To:	Christoph Hellwig <hch@....de>
Cc:	Henrique de Moraes Holschuh <hmh@....eng.br>,
	Zhang Rui <rui.zhang@...el.com>, linux-acpi@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: regression: 2.6.32-rc8 shuts down after reaching critical temperature

On Wednesday 02 December 2009 14:30:32 Christoph Hellwig wrote:
> On Wed, Dec 02, 2009 at 12:56:20PM +0100, Thomas Renninger wrote:
... 
> > 2.6.31 works?
> 
> Yes, perfectly.  Have been running it for a couple of days now again
> after I had all these reproducible .32-rc shutdowns when testing it.
> 
> > Also the latest stable one?
> 
> Haven't tried that yet, will do if it helps you.
No need. It looks unrelated: the one system seems to overheat because there
is no fan activity at all, while yours seems to have a "passive cooling does
not work or kicks in too late" problem (and possibly a fan problem as well).

It would be best to open a bug on bugzilla.kernel.org and assign it to the
acpi component (and add Rui, Henrique and myself to CC; I won't be that
active, at least not in the next few days, I just wanted to make sure that
this isn't a duplicate).
The output of dmesg, acpidump and grep . /proc/acpi/thermal_zone/*/*
plus the shutdown messages is the most important info and
should show up there.
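
Gathering those three outputs for the bug report could be scripted roughly
like this. It is only a sketch: collect_thermal_info is a made-up helper
name, the output directory and file names are my assumptions, and the
second argument exists only so the function can be pointed at a different
directory than /proc/acpi/thermal_zone. dmesg and acpidump may need root,
and acpidump may not be installed at all.

```shell
#!/bin/sh
# Hypothetical helper: collect the data listed above into one directory.
# $1 = output directory (assumed default), $2 = thermal zone root.
collect_thermal_info() {
    out=${1:-./acpi-thermal-report}
    tz=${2:-/proc/acpi/thermal_zone}
    mkdir -p "$out" || return 1

    # Kernel log; may fail for non-root users on restricted systems.
    dmesg > "$out/dmesg.txt" 2>/dev/null ||
        echo "dmesg failed" > "$out/dmesg.txt"

    # ACPI tables; only attempted when acpidump is actually available.
    if command -v acpidump >/dev/null 2>&1; then
        acpidump > "$out/acpidump.txt" 2>/dev/null ||
            echo "acpidump failed (root needed?)" > "$out/acpidump.txt"
    else
        echo "acpidump not installed" > "$out/acpidump.txt"
    fi

    # Same as: grep . /proc/acpi/thermal_zone/*/*
    grep . "$tz"/*/* > "$out/thermal_zone.txt" 2>/dev/null ||
        echo "no readable files under $tz" > "$out/thermal_zone.txt"
}
```

Calling collect_thermal_info ./report leaves three text files in ./report
that can be attached to the bugzilla entry; the shutdown messages still
have to be captured by hand.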

Some more hints you may want to try:

  - Does cpufreq work at all?
    Does this directory exist: /sys/devices/system/cpu/cpu*/cpufreq
    If the temperature shown by:
    watch -n1 cat /proc/acpi/thermal_zone/THM1/temperature
    goes beyond 96 C,
    an ACPI processor event must be thrown and this:
    /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
    will get limited (lower than ../cpufreq/cpuinfo_max_freq).
    echo xy >/sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
    may be a bad workaround.
    These boot params: thermal.psv=90 thermal.tzp=10
    which lower all passive trip points to 90 C and enable polling,
    might be a better one (and with them you might be able to test
    passive cooling better). This really should be a per-thermal_zone
    runtime sysfs parameter, but that is another story...
    
  - Is the ACPI event thrown at all?
    SUSE ships acpi_listen; I am not sure whether it is part of the
    mainline acpid project, but I think it is. Do you see an ACPI event
    when 96 C is passed?
    If not, this might work around your issue:
    echo 10 >/proc/acpi/thermal_zone/THM1/polling_frequency (or similar)

  - The T500 sounds pretty new. Still, make sure your fans are clean.
    E.g. the air coming out must get really hot at some point.

  - Also listen to the fans a bit. With the thinkpad-acpi driver you might
    be able to monitor the fans (T500 is rather new/untested):
    cat /proc/acpi/ibm/fan  # path from memory
    You might also be able to alter the fan behavior there.
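
To see quickly whether the cpufreq limit from the first hint has kicked in,
something like the following could be run while the machine heats up. Again
only a sketch: check_cpufreq_limit is a made-up name and the optional
argument is just there so the function can be tried against a copy of the
sysfs tree. Note that scaling_max_freq can also be lowered by other means
(e.g. by hand), so "limited" does not prove that ACPI did it.

```shell
#!/bin/sh
# Hypothetical helper: compare scaling_max_freq with cpuinfo_max_freq
# for every CPU that has a cpufreq directory. Values are in kHz.
check_cpufreq_limit() {
    root=${1:-/sys/devices/system/cpu}
    found=0
    for dir in "$root"/cpu*/cpufreq; do
        # Skips non-matching globs and entries like cpuidle/.
        [ -d "$dir" ] || continue
        found=1
        cpu=$(basename "$(dirname "$dir")")
        max=$(cat "$dir/scaling_max_freq")
        hw=$(cat "$dir/cpuinfo_max_freq")
        if [ "$max" -lt "$hw" ]; then
            echo "$cpu: limited ($max < $hw kHz)"
        else
            echo "$cpu: not limited ($max kHz)"
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no cpufreq directory under $root, cpufreq seems inactive"
        return 1
    fi
}
```

Running it under watch next to the temperature output (with the function
saved in a script that calls it) should show the limit appear once the
passive trip point is crossed, if passive cooling works.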

Good luck,

    Thomas
