Date: Tue, 24 Sep 2019 16:30:48 +0200
From: Holger Hoffstätte <holger@...lied-asynchrony.com>
To: Netdev <netdev@...r.kernel.org>, Igor Russkikh <igor.russkikh@...antia.com>
Subject: Re: atlantic: weird hwmon temperature readings with AQC107 NIC (kernel 5.2/5.3)

On 9/24/19 4:16 PM, Holger Hoffstätte wrote:
> Hi,
>
> I recently upgraded my home network with two AQ107-based NICs and a
> multi-speed switch. Everything works great, but I couldn't help but notice
> very weird hwmon temperature output (which I wanted to use for monitoring
> and alerting).
>
> Both cards identify as:
>
> $lspci -v -s 06:00.0
> 06:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
> Subsystem: ASUSTeK Computer Inc. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion]
>
> In one machine lm_sensors says:
>
> eth0-pci-0200
> Adapter: PCI adapter
> PHY Temperature: +315.1°C
>
> This seems quite wrong since the card is only slightly warm to the touch, and
> 315.1 is exactly 255 + 60.1 - the latter value feels more like the actual
> temperature.
>
> On a second machine it says:
>
> eth0-pci-0600
> Adapter: PCI adapter
> PHY Temperature: +6977.0°C
>
> I feel qualified to say that is definitely wrong as well, since the machine is
> currently not melting its way to the earth's core, and also only slightly warm
> to the touch. :)
>
> Both cards also reported wrong values with kernel 5.2, but since I'm on 5.3.1
> I might as well report the current wrongness.
>
> Do we know who's to blame here - motherboards, NICs, driver, kernel, hwmon
> infrastructure? I believe the hwmon patches landed first in 5.2.

Another observation: the hwmon output immediately becomes sane (~58°) when I
down the link with ifconfig. As soon as I bring the link back up, the
temperature jumps from 58° to 6976° in one second.

It seems that the presence of the carrier somehow mangles the sensor readings.

I hope this helps to find the issue.

thanks,
Holger
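
For anyone who wants to watch the readings while toggling the link, here is a minimal Python sketch that reads the raw values from the hwmon sysfs interface (the temp*_input files, which report millidegrees Celsius) and flags the ones that cannot be real. The -40..125 °C window and all function names are my own assumptions for illustration, not anything taken from the atlantic driver.

```python
from pathlib import Path

# Assumed plausibility window in degrees Celsius (illustrative, not from the driver).
PLAUSIBLE_MIN_C = -40.0
PLAUSIBLE_MAX_C = 125.0


def millidegrees_to_celsius(raw: str) -> float:
    """Convert the text of a hwmon temp*_input file (millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0


def is_plausible(temp_c: float) -> bool:
    """Return True if the reading lies inside the assumed plausibility window."""
    return PLAUSIBLE_MIN_C <= temp_c <= PLAUSIBLE_MAX_C


def read_hwmon_temps(root: str = "/sys/class/hwmon") -> dict:
    """Collect every temp*_input reading below root, keyed by '<chip name>/<file>'."""
    readings = {}
    for chip in sorted(Path(root).glob("hwmon*")):
        name_file = chip / "name"
        # Fall back to the directory name if the chip exposes no 'name' file.
        name = name_file.read_text().strip() if name_file.exists() else chip.name
        for temp_file in sorted(chip.glob("temp*_input")):
            try:
                value = millidegrees_to_celsius(temp_file.read_text())
            except (OSError, ValueError):
                continue  # sensor unreadable or not numeric; skip it
            readings[f"{name}/{temp_file.name}"] = value
    return readings


if __name__ == "__main__":
    for sensor, temp_c in read_hwmon_temps().items():
        flag = "" if is_plausible(temp_c) else "  <-- implausible"
        print(f"{sensor}: {temp_c:+.1f} C{flag}")
```

Running it once with the link down and once with the link up should show whether the raw sysfs value itself jumps, or only the lm_sensors presentation of it.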