Message-ID: <de8861e8-f21b-6a66-4f5b-25acc8ff40e2@roeck-us.net>
Date: Sun, 12 Jan 2020 12:08:09 -0800
From: Guenter Roeck <linux@...ck-us.net>
To: Gabriel C <nix.or.die@...il.com>
Cc: Linus Walleij <linus.walleij@...aro.org>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
linux-hwmon@...r.kernel.org, Jean Delvare <jdelvare@...e.com>,
Bart Van Assche <bvanassche@....org>,
Linux Doc Mailing List <linux-doc@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
linux-scsi <linux-scsi@...r.kernel.org>,
"open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers)"
<linux-ide@...r.kernel.org>, Chris Healy <cphealy@...il.com>
Subject: Re: [PATCH v2] hwmon: Driver for temperature sensors on SATA drives
On 1/12/20 10:37 AM, Gabriel C wrote:
> On Sun, Jan 12, 2020 at 4:26 PM Guenter Roeck <linux@...ck-us.net> wrote:
>>
>> On 1/12/20 5:45 AM, Gabriel C wrote:
>>> On Sun, Jan 12, 2020 at 2:07 PM Guenter Roeck <linux@...ck-us.net> wrote:
>>>>
>>>> On 1/12/20 4:07 AM, Linus Walleij wrote:
>>>>> On Sun, Jan 12, 2020 at 1:03 PM Gabriel C <nix.or.die@...il.com> wrote:
>>>>>> On Sun, Jan 12, 2020 at 12:22 PM Linus Walleij
>>>>>> <linus.walleij@...aro.org> wrote:
>>>>>>>
>>>>>>> On Sun, Jan 12, 2020 at 12:18 PM Gabriel C <nix.or.die@...il.com> wrote:
>>>>>>>
>>>>>>>> What I've noticed however is the nvme temperature low/high values on
>>>>>>>> the Sensors X are strange here.
>>>>>>> (...)
>>>>>>>> Sensor 1: +27.9°C (low = -273.1°C, high = +65261.8°C)
>>>>>>>> Sensor 2: +29.9°C (low = -273.1°C, high = +65261.8°C)
>>>>>>> (...)
>>>>>>>> Sensor 1: +23.9°C (low = -273.1°C, high = +65261.8°C)
>>>>>>>> Sensor 2: +25.9°C (low = -273.1°C, high = +65261.8°C)
>>>>>>>
>>>>>>> That doesn't look strange to me. It seems like reasonable defaults
>>>>>>> from the firmware if either it doesn't really log the min/max temperatures
>>>>>>> or hasn't been through a cycle of updating these yet. Just set both
>>>>>>> to absolute min/max temperatures possible.
>>>>>>
>>>>>> Ok I'll check that.
>>>>>>
>>>>>> Do you mean setting the temperatures through an lm-sensors config?
>>>>>> Or is there a way to set these with a nvme command?
>>>>>
>>>>> Not that I know of.
>>>>>
>>>>> The min/max are the minimum and maximum temperatures the
>>>>> device has experienced during this power-on cycle.
>>>>>
>>>>
>>>> No, that would be lowest/highest. The above are (or should be) per-sensor
>>>> setpoints. The default for those is typically the absolute minimum /
>>>> maximum of the supported range.
>>>>
>>>> Some SATA drives report the lowest/highest temperatures experienced
>>>> since power cycle, like here.
>>>>
>>>> drivetemp-scsi-5-0
>>>> Adapter: SCSI adapter
>>>> temp1: +23.0°C (low = +0.0°C, high = +60.0°C)
>>>> (crit low = -41.0°C, crit = +85.0°C)
>>>> (lowest = +20.0°C, highest = +31.0°C)
>>>>
>>>
>>> The SATA temperatures are fine and reported like this here too, just
>>> the nvme ones are strange.
>>>
>>> drivetemp-scsi-4-0
>>> Adapter: SCSI adapter
>>> temp1: +28.0°C (low = +1.0°C, high = +61.0°C)
>>> (crit low = +2.0°C, crit = +60.0°C)
>>> (lowest = +16.0°C, highest = +31.0°C)
>>>
>>> drivetemp-scsi-12-0
>>> Adapter: SCSI adapter
>>> temp1: +29.0°C (low = +1.0°C, high = +61.0°C)
>>> (crit low = +2.0°C, crit = +60.0°C)
>>> (lowest = +18.0°C, highest = +32.0°C)
>>>
>>> and so on.
>>>
>>> Btw, where can I find the code that does these calculations?
>>>
>>
>> Not sure if that is what you are looking for, but the nvme hardware
>> monitoring driver is at drivers/nvme/host/hwmon.c, the SATA hardware
>> monitoring driver is at drivers/hwmon/drivetemp.c.
>>
>
> I'll have a look, thanks.
>
> I'm using your v2 patch for the nvme part since you posted it on 5.4 kernels.
> This is probably why I find the way the temperatures are now reported
> very strange.
>
> The ADATA XPG SX8200 Pro in my laptop seems to work better:
>
> nvme-pci-0200
> Adapter: PCI adapter
> Composite: +37.9°C (low = -0.1°C, high = +74.8°C)
> (crit = +79.8°C)
>
> Low is 0° which is what the spec suggests.
>
>> The limits on nvme drives are configurable.
>
> Yes, I found this out already.
>
>> root@...ver:/sys/class/hwmon# sensors nvme-pci-0100
>> nvme-pci-0100
>> Adapter: PCI adapter
>> Composite: +40.9°C (low = -273.1°C, high = +84.8°C)
>> (crit = +84.8°C)
>> Sensor 1: +40.9°C (low = -273.1°C, high = +65261.8°C)
>> Sensor 2: +43.9°C (low = -273.1°C, high = +65261.8°C)
>>
>> root@...ver:/sys/class/hwmon# echo 0 > hwmon1/temp2_min
>> root@...ver:/sys/class/hwmon# echo 100000 > hwmon1/temp2_max
>
> An lm-sensors configuration will work too.
>
Sure, the above was just an example.
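If you would rather script it, the same writes can be wrapped in a small helper. This is only a sketch: the function names are made up for illustration, the hwmon directory and attribute names have to be supplied by the caller, and the values are in millidegrees Celsius as in the echo commands.

```python
from pathlib import Path


def set_limit_millicelsius(hwmon_dir, attribute, millicelsius):
    """Write a temperature limit (millidegrees Celsius) to a hwmon attribute.

    hwmon_dir is a directory such as /sys/class/hwmon/hwmon1 and attribute
    is a file name such as "temp2_min". Nothing is discovered automatically.
    """
    path = Path(hwmon_dir) / attribute
    # hwmon attributes accept a plain decimal integer followed by a newline.
    path.write_text(f"{int(millicelsius)}\n")
    return path


def read_millicelsius(hwmon_dir, attribute):
    """Read a hwmon temperature attribute back as an integer."""
    return int((Path(hwmon_dir) / attribute).read_text().strip())


# Equivalent of the two echo commands (needs root and a real device):
#   set_limit_millicelsius("/sys/class/hwmon/hwmon1", "temp2_min", 0)
#   set_limit_millicelsius("/sys/class/hwmon/hwmon1", "temp2_max", 100000)
```

Note that the hwmonN index is not stable across boots, so a real script would want to look up the right directory by its name attribute first.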
>> root@...ver:/sys/class/hwmon# sensors nvme-pci-0100
>> nvme-pci-0100
>> Adapter: PCI adapter
>> Composite: +38.9°C (low = -273.1°C, high = +84.8°C)
>> (crit = +84.8°C)
>> Sensor 1: +38.9°C (low = -0.1°C, high = +99.8°C)
>> Sensor 2: +42.9°C (low = -273.1°C, high = +65261.8°C)
>>
>> If you dislike the defaults, just configure whatever you think is
>> appropriate for your system.
>
> It's not about disliking the values. I want to find out if these Samsung models
> don't support that, or it is a bug somewhere in writing/calculating the values.
>
No, this is not a bug. It is perfectly valid for individual sensors to have
uninitialized limits. If I recall correctly, the NVMe specification states
that the default settings for individual sensors shall be those values
(0 and 65535 Kelvin, specifically).
And, yes, I would agree that it is a bit odd that NVMe drives report temperatures
in Kelvin, but such is the world.
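To make the numbers concrete, here is the arithmetic as a small sketch. It assumes whole Kelvin are converted to millidegrees Celsius with a fixed 273.15 offset, and that a written limit is rounded to the nearest whole Kelvin before it is stored. The helper names are made up for illustration; this is not a copy of the code in drivers/nvme/host/hwmon.c.

```python
KELVIN_OFFSET_MILLI = 273150  # 273.15 K expressed in millidegrees


def kelvin_to_millicelsius(kelvin):
    """Convert whole Kelvin to millidegrees Celsius."""
    return kelvin * 1000 - KELVIN_OFFSET_MILLI


def millicelsius_to_kelvin(millicelsius):
    """Convert millidegrees Celsius to the nearest whole Kelvin (assumed rounding)."""
    return (millicelsius + KELVIN_OFFSET_MILLI + 500) // 1000


# Default per-sensor limits: 0 K and 65535 K.
print(kelvin_to_millicelsius(0))      # -273150, shown by sensors as -273.1°C
print(kelvin_to_millicelsius(65535))  # 65261850, shown by sensors as +65261.8°C

# Writing 0 and 100000 millidegrees and reading the limits back.
print(kelvin_to_millicelsius(millicelsius_to_kelvin(0)))       # -150
print(kelvin_to_millicelsius(millicelsius_to_kelvin(100000)))  # 99850
```

The last two lines are why the limits set with echo read back as -0.1°C and +99.8°C rather than 0 and 100: only whole Kelvin can be stored.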
> In case Samsung and others don't support such a thing, wouldn't it be
> better to just ignore the bogus readings altogether?
Again, you can set whatever limits you like. The default limits on many
hardware sensor chips have odd values. Just looking at my system:
nct6797-isa-0a20
Adapter: ISA adapter
in0: +0.48 V (min = +0.00 V, max = +1.74 V)
in1: +1.02 V (min = +0.00 V, max = +0.00 V) ALARM
in2: +3.39 V (min = +0.00 V, max = +0.00 V) ALARM
in3: +3.31 V (min = +0.00 V, max = +0.00 V) ALARM
in4: +1.00 V (min = +0.00 V, max = +0.00 V) ALARM
in5: +0.14 V (min = +0.00 V, max = +0.00 V) ALARM
in6: +0.82 V (min = +0.00 V, max = +0.00 V) ALARM
in7: +3.38 V (min = +0.00 V, max = +0.00 V) ALARM
in8: +3.26 V (min = +0.00 V, max = +0.00 V) ALARM
in9: +1.82 V (min = +0.00 V, max = +0.00 V) ALARM
in10: +0.00 V (min = +0.00 V, max = +0.00 V)
in11: +0.74 V (min = +0.00 V, max = +0.00 V) ALARM
in12: +1.20 V (min = +0.00 V, max = +0.00 V) ALARM
in13: +0.68 V (min = +0.00 V, max = +0.00 V) ALARM
in14: +1.50 V (min = +0.00 V, max = +0.00 V) ALARM
Are you suggesting that we should not support setting min/max values for
all drivers just because they are often not initialized to reasonable values
by default?
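The ALARM flags in the nct6797 output are what you would expect when readings are compared against limits that were never set. A rough sketch of that comparison, assuming a plain out-of-range check (the real flag comes from the chip and its driver, and the function name is made up for illustration):

```python
def out_of_range(value, low, high):
    """Return True if a reading lies outside its [low, high] limits."""
    return value < low or value > high


# (reading, min, max) in volts, taken from three of the lines above.
readings = {
    "in0": (0.48, 0.00, 1.74),   # limits set to something sensible
    "in1": (1.02, 0.00, 0.00),   # limits left at their default of zero
    "in10": (0.00, 0.00, 0.00),  # reading happens to equal the zero limits
}

for name, (value, low, high) in readings.items():
    print(name, "ALARM" if out_of_range(value, low, high) else "ok")
```

With these inputs only in1 is flagged, which matches the listing: the alarm says the limits are unset, not that the voltage is wrong.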
Guenter