Message-ID: <6824f030-92da-4439-af3b-8c2498f4382e@roeck-us.net>
Date: Tue, 28 May 2024 16:29:58 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Thomas Weißschuh <linux@...ssschuh.net>
Cc: Stephen Horvath <s.horvath@...look.com.au>,
 Jean Delvare <jdelvare@...e.com>, Benson Leung <bleung@...omium.org>,
 Lee Jones <lee@...nel.org>, Guenter Roeck <groeck@...omium.org>,
 linux-kernel@...r.kernel.org, linux-hwmon@...r.kernel.org,
 chrome-platform@...ts.linux.dev, Dustin Howett <dustin@...ett.net>,
 Mario Limonciello <mario.limonciello@....com>,
 Moritz Fischer <mdf@...nel.org>
Subject: Re: [PATCH v2 1/2] hwmon: add ChromeOS EC driver

On 5/28/24 09:15, Thomas Weißschuh wrote:
> On 2024-05-28 08:50:49+0000, Guenter Roeck wrote:
>> On 5/27/24 17:15, Stephen Horvath wrote:
>>> On 28/5/24 05:24, Thomas Weißschuh wrote:
>>>> On 2024-05-25 09:13:09+0000, Stephen Horvath wrote:
>>>>> I was the one to implement fan monitoring/control into Dustin's driver, and
>>>>> just had a quick comment for your driver:
>>>>>
>>>>> On 8/5/24 02:29, Thomas Weißschuh wrote:
>>>>>> The ChromeOS Embedded Controller exposes fan speed and temperature
>>>>>> readings.
>>>>>> Expose this data through the hwmon subsystem.
>>>>>>
>>>>>> The driver is designed to be probed via the cros_ec mfd device.
>>>>>>
>>>>>> Signed-off-by: Thomas Weißschuh <linux@...ssschuh.net>
>>>>>> ---
>>>>>>     Documentation/hwmon/cros_ec_hwmon.rst |  26 ++++
>>>>>>     Documentation/hwmon/index.rst         |   1 +
>>>>>>     MAINTAINERS                           |   8 +
>>>>>>     drivers/hwmon/Kconfig                 |  11 ++
>>>>>>     drivers/hwmon/Makefile                |   1 +
>>>>>>     drivers/hwmon/cros_ec_hwmon.c         | 269 ++++++++++++++++++++++++++++++++++
>>>>>>     6 files changed, 316 insertions(+)
>>>>>>
>>>>
>>>> <snip>
>>>>
>>>>>> diff --git a/drivers/hwmon/cros_ec_hwmon.c b/drivers/hwmon/cros_ec_hwmon.c
>>>>>> new file mode 100644
>>>>>> index 000000000000..d59d39df2ac4
>>>>>> --- /dev/null
>>>>>> +++ b/drivers/hwmon/cros_ec_hwmon.c
>>>>>> @@ -0,0 +1,269 @@
>>>>>> +// SPDX-License-Identifier: GPL-2.0-or-later
>>>>>> +/*
>>>>>> + *  ChromeOS EC driver for hwmon
>>>>>> + *
>>>>>> + *  Copyright (C) 2024 Thomas Weißschuh <linux@...ssschuh.net>
>>>>>> + */
>>>>>> +
>>>>>> +#include <linux/device.h>
>>>>>> +#include <linux/hwmon.h>
>>>>>> +#include <linux/kernel.h>
>>>>>> +#include <linux/mod_devicetable.h>
>>>>>> +#include <linux/module.h>
>>>>>> +#include <linux/platform_device.h>
>>>>>> +#include <linux/platform_data/cros_ec_commands.h>
>>>>>> +#include <linux/platform_data/cros_ec_proto.h>
>>>>>> +#include <linux/units.h>
>>>>>> +
>>>>>> +#define DRV_NAME    "cros-ec-hwmon"
>>>>>> +
>>>>>> +struct cros_ec_hwmon_priv {
>>>>>> +    struct cros_ec_device *cros_ec;
>>>>>> +    u8 thermal_version;
>>>>>> +    const char *temp_sensor_names[EC_TEMP_SENSOR_ENTRIES + EC_TEMP_SENSOR_B_ENTRIES];
>>>>>> +};
>>>>>> +
>>>>>> +static int cros_ec_hwmon_read_fan_speed(struct cros_ec_device *cros_ec, u8 index, u16 *speed)
>>>>>> +{
>>>>>> +    u16 data;
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    ret = cros_ec->cmd_readmem(cros_ec, EC_MEMMAP_FAN + index * 2, 2, &data);
>>>>>> +    if (ret < 0)
>>>>>> +        return ret;
>>>>>> +
>>>>>> +    data = le16_to_cpu(data);
>>>>>> +
>>>>>> +    if (data == EC_FAN_SPEED_NOT_PRESENT)
>>>>>> +        return -ENODEV;
>>>>>> +
>>>>>
>>>>> Don't forget it can also return `EC_FAN_SPEED_STALLED`.
>>>>
>>>> Thanks for the hint. I'll need to think about how to handle this better.
>>>>
>>>>> Like Guenter, I also don't like returning `-ENODEV`, but I don't have a
>>>>> problem with checking for `EC_FAN_SPEED_NOT_PRESENT` in case it was removed
>>>>> since init or something.
>>>>
>>
>> That won't happen. Chromebooks are not servers, where one might be able to
>> replace a fan tray while the system is running.
> 
> In one of my test runs this actually happened.
> When running on battery, one specific CPU sensor sporadically
> returned EC_FAN_SPEED_NOT_PRESENT.
> 

Which Chromebook was that? I can't see a code path in the EC source
that would lead there.

>>>> Ok.
>>>>
>>>>> My approach was to return the speed as `0`, since the fan probably isn't
>>>>> spinning, but set HWMON_F_FAULT for `EC_FAN_SPEED_NOT_PRESENT` and
>>>>> HWMON_F_ALARM for `EC_FAN_SPEED_STALLED`.
>>>>> No idea if this is correct though.
>>>>
>>>> I'm not a fan of returning a speed of 0 in case of errors.
>>>> Rather -EIO which can't be mistaken.
>>>> Maybe -EIO for both EC_FAN_SPEED_NOT_PRESENT (which should never happen)
>>>> and also for EC_FAN_SPEED_STALLED.
>>>
>>> Yeah, that's pretty reasonable.
>>>
>>
>> -EIO is an i/o error. I have trouble reconciling that with
>> EC_FAN_SPEED_NOT_PRESENT or EC_FAN_SPEED_STALLED.
>>
>> Looking into the EC source code [1], I see:
>>
>> EC_FAN_SPEED_NOT_PRESENT means that the fan is not present.
>> That should return -ENODEV in the above code, but only for
>> the purpose of making the attribute invisible.
>>
>> EC_FAN_SPEED_STALLED means exactly that, i.e., that the fan
>> is present but not turning. The EC code does not expect that
>> to happen and generates a thermal event in case it does.
>> Given that, it does make sense to set the fault flag.
>> The actual fan speed value should then be reported as 0 or
>> possibly -ENODATA. It should _not_ generate any other error
>> because that would trip up the "sensors" command for no
>> good reason.
> 
> Ack.
> 
> Currently I have the following logic (for both fans and temp):
> 
> if NOT_PRESENT during probing:
>    make the attribute invisible.
> 
> if any error during runtime (including NOT_PRESENT):
>    return -ENODATA and a FAULT
> 
> This should also handle the sporadic NOT_PRESENT failures.
> 
> What do you think?
> 
> Is there any other feedback to this revision or should I send the next?
> 

No, except I'd really like to know which Chromebook randomly generates
an EC_FAN_SPEED_NOT_PRESENT response because that really looks like a bug.
Also, can you reproduce the problem with the ectool command?

Thanks,
Guenter

