lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7c375ccf-509a-aea3-62a3-d4f43d6b6c8a@roeck-us.net>
Date:   Wed, 8 Feb 2023 07:16:52 -0800
From:   Guenter Roeck <linux@...ck-us.net>
To:     Hal Feng <hal.feng@...rfivetech.com>
Cc:     linux-hwmon@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-riscv@...ts.infradead.org, devicetree@...r.kernel.org,
        Jean Delvare <jdelvare@...e.com>,
        Jonathan Corbet <corbet@....net>,
        Conor Dooley <conor@...nel.org>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Albert Ou <aou@...s.berkeley.edu>,
        Rob Herring <robh+dt@...nel.org>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        Emil Renner Berthing <emil.renner.berthing@...onical.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 2/4] hwmon: (sfctemp) Add StarFive JH71x0 temperature
 sensor

On 2/8/23 04:40, Hal Feng wrote:
> On Mon, 6 Feb 2023 11:21:38 -0800, Guenter Roeck wrote:
>> On 2/6/23 09:12, Hal Feng wrote:
>>> On Tue, 3 Jan 2023 14:10:17 -0800, Guenter Roeck wrote:
>>>> On Tue, Jan 03, 2023 at 09:31:43AM +0800, Hal Feng wrote:
> [...]
>>>>> diff --git a/drivers/hwmon/sfctemp.c b/drivers/hwmon/sfctemp.c
>>>>> new file mode 100644
>>>>> index 000000000000..e56716ad9587
>>>>> --- /dev/null
>>>>> +++ b/drivers/hwmon/sfctemp.c
>>>>> @@ -0,0 +1,350 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>> +/*
>>>>> + * Copyright (C) 2021 Emil Renner Berthing <kernel@...il.dk>
>>>>> + * Copyright (C) 2021 Samin Guo <samin.guo@...rfivetech.com>
>>>>> + */
>>>>> +#include <linux/clk.h>
>>>>> +#include <linux/completion.h>
>>>>> +#include <linux/delay.h>
>>>>> +#include <linux/hwmon.h>
>>>>> +#include <linux/interrupt.h>
>>>>> +#include <linux/io.h>
>>>>> +#include <linux/module.h>
>>>>> +#include <linux/mutex.h>
>>>>> +#include <linux/of.h>
>>>>> +#include <linux/platform_device.h>
>>>>> +#include <linux/reset.h>
>>>>> +
>>>>> +/*
>>>>> + * TempSensor reset. The RSTN can be de-asserted once the analog core has
>>>>> + * powered up. Trst(min 100ns)
>>>>> + * 0:reset  1:de-assert
>>>>> + */
>>>>> +#define SFCTEMP_RSTN    BIT(0)
>>>>
>>>> Missing include of linux/bits.h
>>>
>>> Will add it. Thanks.
>>>
>>>>
>>>>> +
>>>>> +/*
>>>>> + * TempSensor analog core power down. The analog core will be powered up
>>>>> + * Tpu(min 50us) after PD is de-asserted. RSTN should be held low until the
>>>>> + * analog core is powered up.
>>>>> + * 0:power up  1:power down
>>>>> + */
>>>>> +#define SFCTEMP_PD    BIT(1)
>>>>> +
>>>>> +/*
>>>>> + * TempSensor start conversion enable.
>>>>> + * 0:disable  1:enable
>>>>> + */
>>>>> +#define SFCTEMP_RUN    BIT(2)
>>>>> +
>>>>> +/*
>>>>> + * TempSensor conversion value output.
>>>>> + * Temp(C)=DOUT*Y/4094 - K
>>>>> + */
>>>>> +#define SFCTEMP_DOUT_POS    16
>>>>> +#define SFCTEMP_DOUT_MSK    GENMASK(27, 16)
>>>>> +
>>>>> +/* DOUT to Celcius conversion constants */
>>>>> +#define SFCTEMP_Y1000    237500L
>>>>> +#define SFCTEMP_Z    4094L
>>>>> +#define SFCTEMP_K1000    81100L
>>>>> +
>>>>> +struct sfctemp {
>>>>> +    /* serialize access to hardware register and enabled below */
>>>>> +    struct mutex lock;
>>>>> +    struct completion conversion_done;
>>>>> +    void __iomem *regs;
>>>>> +    struct clk *clk_sense;
>>>>> +    struct clk *clk_bus;
>>>>> +    struct reset_control *rst_sense;
>>>>> +    struct reset_control *rst_bus;
>>>>> +    bool enabled;
>>>>> +};
>>>>> +
>>>>> +static irqreturn_t sfctemp_isr(int irq, void *data)
>>>>> +{
>>>>> +    struct sfctemp *sfctemp = data;
>>>>> +
>>>>> +    complete(&sfctemp->conversion_done);
>>>>> +    return IRQ_HANDLED;
>>>>> +}
>>>>> +
>>>>> +static void sfctemp_power_up(struct sfctemp *sfctemp)
>>>>> +{
>>>>> +    /* make sure we're powered down first */
>>>>> +    writel(SFCTEMP_PD, sfctemp->regs);
>>>>> +    udelay(1);
>>>>> +
>>>>> +    writel(0, sfctemp->regs);
>>>>> +    /* wait t_pu(50us) + t_rst(100ns) */
>>>>> +    usleep_range(60, 200);
>>>>> +
>>>>> +    /* de-assert reset */
>>>>> +    writel(SFCTEMP_RSTN, sfctemp->regs);
>>>>> +    udelay(1); /* wait t_su(500ps) */
>>>>> +}
>>>>> +
>>>>> +static void sfctemp_power_down(struct sfctemp *sfctemp)
>>>>> +{
>>>>> +    writel(SFCTEMP_PD, sfctemp->regs);
>>>>> +}
>>>>> +
>>>>> +static void sfctemp_run_single(struct sfctemp *sfctemp)
>>>>> +{
>>>>> +    writel(SFCTEMP_RSTN | SFCTEMP_RUN, sfctemp->regs);
>>>>> +    udelay(1);
>>>>> +    writel(SFCTEMP_RSTN, sfctemp->regs);
>>>>
>>>> The datasheet (or, rather, programming manual) does not appear
>>>> to be public, so I have to guess here.
>>>>
>>>> The code suggests that running a single conversion may be a choice,
>>>> not a requirement. If it is indeed a choice, the reasoning needs to be
>>>> explained since it adds a lot of complexity and dependencies to the
>>>> driver (for example, interrupt support is only mandatory or even needed
>>>> due to this choice). It also adds a significant delay to temperature
>>>> read operations, which may have practical impact on thermal control
>>>> software.
>>>>
>>>> If the chip only supports single temperature readings, that needs to be
>>>> explained as well (and why SFCTEMP_RUN has to be reset in that case).
>>>
>>> The chip supports continuous conversion. When you set SFCTEMP_RUN, the
>>> temperature raw data will be generated all the time. However, it will
>>> also generate interrupts all the time when the conversion is finished,
>>> because of the hardware limitation. So in this driver, we just support
>>> the single conversion.
>>>
>>
>> Sorry, I don't follow the logic. The interrupt is, for all practical
>> purposes, useless because there are no limits and exceeding any such
>> limits is therefore not supported. The only reason to have and enable
>> to interrupt is because continuous mode is disabled.
>>
>> The code could be simplified a lot if interrupt support would be
>> dropped and continuous mode would be enabled.
> 
> If we enable continuous mode, which means SFCTEMP_RUN remains asserted,
> the conversion finished interrupt will be raised after each sample
> time (8.192 ms). Within a few minutes, a lot of interrupts are raised,
> as showed below.
> 
> # cat /proc/interrupts
>             CPU0       CPU1       CPU2       CPU3
>    1:          0          0          0          0  SiFive PLIC   1 Edge      ccache_ecc
>    2:          1          0          0          0  SiFive PLIC   3 Edge      ccache_ecc
>    3:          1          0          0          0  SiFive PLIC   4 Edge      ccache_ecc
>    4:          0          0          0          0  SiFive PLIC   2 Edge      ccache_ecc
>    5:       1116       1670        411       1466  RISC-V INTC   5 Edge      riscv-timer
>    6:      32093          0          0          0  SiFive PLIC  81 Edge      120e0000.temperature-sensor
>   10:       1233          0          0          0  SiFive PLIC  32 Edge      ttyS0
> IPI0:       117         62        123        117  Rescheduling interrupts
> IPI1:       278        353        105        273  Function call interrupts
> IPI2:         0          0          0          0  CPU stop interrupts
> IPI3:         0          0          0          0  CPU stop (for crash dump) interrupts
> IPI4:         0          0          0          0  IRQ work interrupts
> IPI5:         0          0          0          0  Timer broadcast interrupts
> 
> If we enable continuous mode and drop the interrupt support in the
> driver, the kernel will not know the interrupts but a lot of interrupts
> are still raised in hardware. Can we do such like that?

Why not ? It just stays raised. That happens a lot.

> Without the interrupt support, the temperature we read may be the value
> generated in the last cycle.

That would be highly unusual and should be documented.


> 
> I think the temperature has its value only when we read it, so we start

"may be" ? "I think" ? That means you don't know ? Maybe test it, or ask
the chip designers.

> conversion only when we read the temperature. Further more, it will
> consume more power if we enable continuous mode.
> 

Usually that is not a concern, much less so than delaying each reader.

Ultimately, sure, you can do whatever you want. I'll still accept the driver.
I do expect you to explain your reasons (all of them) in the driver, though.

If you don't _know_ if the temperature is updated in continuous mode,
please state exactly that in the comments. Also explain how much power
is saved by not running in continuous mode. I don't want anyone to come
back later on and change the code because they don't know the reasons
why it doesn't use continuous mode.

Thanks,
Guenter

> Best regards,
> Hal

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ