lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070811135705.0d0688d7@hyperion.delvare>
Date:	Sat, 11 Aug 2007 13:57:05 +0200
From:	Jean Delvare <khali@...ux-fr.org>
To:	Stefan Richter <stefanr@...6.in-berlin.de>
Cc:	"Mark M. Hoffman" <mhoffman@...htlink.com>,
	lm-sensors@...sensors.org, linux-kernel@...r.kernel.org
Subject: Re: 2.6.23-rc1 regression: hwmon/w83627ehf: wrong fan speed

Hi Stefan,

On Sat, 11 Aug 2007 00:29:36 +0200, Stefan Richter wrote:
> Jean Delvare wrote:
> > I just tried 2.6.23-rc2 on a system where I use the w83627ehf hardware
> > monitoring driver, and was not able to reproduce the problem you
> > described. Fan speeds are reported properly for me. Which I kind of
> > expected, as I tested all my w83627ehf patches on this system before
> > submitting them.
> 
> Thanks that you are still after it.  I was busy with other stuff the
> whole week, hence no git bisect result from me yet.
> 
> > Please try using sensors instead of ksensors, and confirm that the
> > behavior is the same. I'd like to rule out a problem in ksensors
> > itself. sensors will also report the fan divs, this is a useful
> > information given the problem you have.
> 
> # sensors
> w83627ehf-isa-0290
> Adapter: ISA adapter
> VCore:     +0.95 V  (min =  +0.00 V, max =  +1.74 V)
> in1:      +12.30 V  (min =  +1.64 V, max =  +3.22 V) ALARM
> AVCC:      +3.28 V  (min =  +1.89 V, max =  +1.94 V) ALARM
> 3VCC:      +3.26 V  (min =  +0.18 V, max =  +0.72 V) ALARM
> in4:       +1.58 V  (min =  +0.57 V, max =  +0.90 V) ALARM
> in5:       +1.70 V  (min =  +0.41 V, max =  +1.19 V) ALARM
> in6:       +3.43 V  (min =  +0.31 V, max =  +3.05 V) ALARM
> VSB:       +3.25 V  (min =  +0.37 V, max =  +3.01 V) ALARM
> VBAT:      +3.18 V  (min =  +3.94 V, max =  +0.74 V) ALARM
> in9:       +1.88 V  (min =  +0.79 V, max =  +1.40 V) ALARM
> Case Fan:    0 RPM  (min =  753 RPM, div = 128) ALARM
> CPU Fan:    88 RPM  (min =  659 RPM, div = 64) ALARM
> Aux Fan:     0 RPM  (min = 10546 RPM, div = 128) ALARM
> fan5:        0 RPM  (min =  753 RPM, div = 128) ALARM
> Sys Temp:    +44 C  (high =    -5 C, hyst =   -34 C)   ALARM
> CPU Temp:  +38.0 C  (high = +80.0 C, hyst = +75.0 C)
> AUX Temp:  +43.5 C  (high = +80.0 C, hyst = +75.0 C)
> 
> coretemp-isa-0000
> Adapter: ISA adapter
> 
> coretemp-isa-0001
> Adapter: ISA adapter

Note: a more recent version of lm-sensors would give you additional
temperature values here.

> 
> (The aux fan and fan5 are not connected.)
> 
> > Your original post suggests that the fan speed is supposed to change
> > depending on the system load? Or temperature? Please describe the
> > mechanism used to achieve this. Could it be that this mechanism isn't
> > working properly, and the reported (low) speeds are actually true?
> 
> The motherboard controls the CPU fan and I believe also the case fan,
> probably based on temperatures.  (The manual is buried somewhere and
> MSI's download site is down right in this moment.)

I would like to know what it is doing exactly, and how. Can this
feature be disabled? If the BIOS accesses the W83627EHG chip in our
back, this can cause lots of trouble, such as what you are seeing.

> The low speeds or the dividers incorrect.  I'll reboot in a minute into
> 2.6.22(-rc5)  and post the "sensors" output.
> 
> > What fan inputs are used by your CPU and system fans? "sensors
> > -c /dev/null" will tell.
> 
> ...
> fan1:      484 RPM  (min = 12053 RPM, div = 16) ALARM
> fan2:       89 RPM  (min =  659 RPM, div = 64) ALARM
> fan3:        0 RPM  (min = 10546 RPM, div = 128) ALARM
> fan5:        0 RPM  (min = 1506 RPM, div = 128) ALARM
> ...

Note: 89 RPM with div=64 corresponds to the same register value as 1424
RPM with div=4 (as shown in your next post). As if the chip and the
driver didn't agree on what the actual divider is.

> Hmm, interesting.  When I now re-run sensors I get
> ...
> Case Fan:  484 RPM  (min = 12053 RPM, div = 16) ALARM
> CPU Fan:    89 RPM  (min =  659 RPM, div = 64) ALARM
> Aux Fan:     0 RPM  (min = 10546 RPM, div = 128) ALARM
> fan5:        0 RPM  (min = 1506 RPM, div = 128) ALARM
> ...
> 
> (I'm still in 2.6.23-rc2.  Ksensors picked the 484 RPM of the case fan
> up too, and that's most certainly the correct speed.  Just the CPU fan's
> speed is still wrong; or rather its divider should be 16 rather than 64.)

Divider should be 4,not 16, methinks.

You can dump the chip registers using the following command:
isadump 0x295 0x296
From now on, whenever you paste the output of "sensors", please dump
the contents of the chip too and include the output.

> > Other than that, I can only ask for the same things Mark already
> > suggested: compile with HWMON debugging and provide the logs (this will
> > show what fan div the driver is trying to select), and try bisecting
> > using git to find out which patch exactly caused the problem.
> 
> How comes the divider of one of the fans changed from one minute to the
> other?

No idea. The new driver can only increase, not decrease, the clock
divider when you poll for a speed value. So the change above (from 128
to 16) is not supposed to happen. However... I also can't explain why
the original reading is 0 (with div=128). A reading of 0 only happens
if the divider is too low (i.e. less than 16.) If the driver increased
the divider to 16, it means that it was previously 8, not 128.

Now, given how dividers are encoded:
128 -> 111b
 64 -> 110b
  8 -> 011b
  4 -> 010b

See the pattern? The case fan's clock divider reads 128 when it is 8,
the CPU fan's clock divider reads 64 when it is 4. In both cases, it is
the most significant bit that is wrong. And it happens that this bit is
in a separate register (VBAT, 0x5d), which happens to be in the banked
register range of the W83627EHG (0x50-0x5f).

So my theory is that something else (BIOS, ACPI?) is changing the bank,
probably to read temperature values which are in banks 1 and 2, causing
the w83627ehf to get a wrong value for the VBAT register. If I am
right, then the attached patch should help. Please give it a try and
report.

Mark: the previous version of the driver was initializing the fan mins.
This wasn't needed, however the bank was reset to 0 as a side effect
when initializing fan5_min. When removing the unneeded code, I caused
the initial bank value to become undefined. This explains in part the
odd behavior reported by Stefan. The fix is to either set the bank to 0
explicitly on driver load, or to stop assuming that bank is 0 by
default. My patch does the latter, as it might also help in case
something is later doing concurrent accesses to the chip.

> FWIW, the ``chip "w83627ehf-*"ยดยด section in Gentoo's /etc/sensors.conf
> provides only labels for fan{1,2,3}. It is titled
> # Winbond W83627EHF configuration originally contributed by Leon Moonen
> # This is for an Asus P5P800, voltages for A8V-E SE.

This is from the standard default sensors configuration file. It is
expected that you tweak the labels, limits etc. to match your own
motherboard.

> Should I hardwire correct dividers or pulse per rev in sensors.conf or
> is the driver supposed to work the correct dividers out --- like it did
> before 2.6.23-rc?

The driver is supposed to pick the best divider if you set the min fan
speed limits properly (which it seems you didn't.) If you don't set the
min limits, all the driver will do is increase the divider as long as
it gets a 0 reading, so the dividers will be good, but not necessarily
optimum.

Thanks,
-- 
Jean Delvare

View attachment "hwmon-w83627ehf-dont-assume-bank-0.patch" of type "text/x-patch" (1463 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ