[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a144a969-fa29-bc05-3daf-c6346dae644c@roeck-us.net>
Date: Thu, 15 Jun 2023 19:21:35 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Yazen Ghannam <yazen.ghannam@....com>, x86@...nel.org
Cc: linux-kernel@...r.kernel.org, platform-driver-x86@...r.kernel.org,
markgross@...nel.org, hdegoede@...hat.com,
Shyam-sundar.S-k@....com, linux-edac@...r.kernel.org,
clemens@...isch.de, jdelvare@...e.com, linux-hwmon@...r.kernel.org,
mario.limonciello@....com, babu.moger@....com
Subject: Re: [PATCH v2 3/6] hwmon: (k10temp) Check return value of
amd_smn_read()
On 6/15/23 09:03, Yazen Ghannam wrote:
> Check the return value of amd_smn_read() before saving a value. This
> ensures invalid values aren't saved or used.
>
> There are three cases here with slightly different behavior.
>
> 1) read_tempreg_nb_zen():
> This is a function pointer which does not include a return code.
> In this case, set the register value to 0 on failure. This
> enforces Read-as-Zero behavior.
>
> 2) k10temp_read_temp():
> This function does have return codes, so return the error code
> from the failed register read. Continued operation is not
> necessary, since there is no valid data from the register.
> Furthermore, if the register value was set to 0, then the
> following operation would underflow.
>
> 3) k10temp_get_ccd_support():
> This function reads the same register from multiple CCD
> instances in a loop. And a bitmask is formed if a specific bit
> is set in each register instance. The loop should continue on a
> failed register read, skipping the bit check.
>
> Furthermore, the __must_check attribute will be added to amd_smn_read().
> Therefore, this change is required to avoid compile-time warnings.
>
> Signed-off-by: Yazen Ghannam <yazen.ghannam@....com>
> Cc: stable@...r.kernel.org
Acked-by: Guenter Roeck <linux@...ck-us.net>
> ---
> Link:
> https://lore.kernel.org/r/20230516202430.4157216-4-yazen.ghannam@amd.com
>
> v1->v2:
> * Address comments from Guenter.
>
> drivers/hwmon/k10temp.c | 36 +++++++++++++++++++++++++++---------
> 1 file changed, 27 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c
> index 7b177b9fbb09..70f7b77e6ece 100644
> --- a/drivers/hwmon/k10temp.c
> +++ b/drivers/hwmon/k10temp.c
> @@ -145,8 +145,9 @@ static void read_tempreg_nb_f15(struct pci_dev *pdev, u32 *regval)
>
> static void read_tempreg_nb_zen(struct pci_dev *pdev, u32 *regval)
> {
> - amd_smn_read(amd_pci_dev_to_node_id(pdev),
> - ZEN_REPORTED_TEMP_CTRL_BASE, regval);
> + if (amd_smn_read(amd_pci_dev_to_node_id(pdev),
> + ZEN_REPORTED_TEMP_CTRL_BASE, regval))
> + *regval = 0;
> }
>
> static long get_raw_temp(struct k10temp_data *data)
> @@ -197,6 +198,7 @@ static int k10temp_read_temp(struct device *dev, u32 attr, int channel,
> long *val)
> {
> struct k10temp_data *data = dev_get_drvdata(dev);
> + int ret = -EOPNOTSUPP;
> u32 regval;
>
> switch (attr) {
> @@ -213,13 +215,17 @@ static int k10temp_read_temp(struct device *dev, u32 attr, int channel,
> *val = 0;
> break;
> case 2 ... 13: /* Tccd{1-12} */
> - amd_smn_read(amd_pci_dev_to_node_id(data->pdev),
> - ZEN_CCD_TEMP(data->ccd_offset, channel - 2),
> - ®val);
> + ret = amd_smn_read(amd_pci_dev_to_node_id(data->pdev),
> + ZEN_CCD_TEMP(data->ccd_offset, channel - 2),
> + ®val);
> +
> + if (ret)
> + return ret;
> +
> *val = (regval & ZEN_CCD_TEMP_MASK) * 125 - 49000;
> break;
> default:
> - return -EOPNOTSUPP;
> + return ret;
> }
> break;
> case hwmon_temp_max:
> @@ -235,7 +241,7 @@ static int k10temp_read_temp(struct device *dev, u32 attr, int channel,
> - ((regval >> 24) & 0xf)) * 500 + 52000;
> break;
> default:
> - return -EOPNOTSUPP;
> + return ret;
> }
> return 0;
> }
> @@ -373,8 +379,20 @@ static void k10temp_get_ccd_support(struct pci_dev *pdev,
> int i;
>
> for (i = 0; i < limit; i++) {
> - amd_smn_read(amd_pci_dev_to_node_id(pdev),
> - ZEN_CCD_TEMP(data->ccd_offset, i), ®val);
> + /*
> + * Ignore inaccessible CCDs.
> + *
> + * Some systems will return a register value of 0, and the TEMP_VALID
> + * bit check below will naturally fail.
> + *
> + * Other systems will return a PCI_ERROR_RESPONSE (0xFFFFFFFF) for
> + * the register value. And this will incorrectly pass the TEMP_VALID
> + * bit check.
> + */
> + if (amd_smn_read(amd_pci_dev_to_node_id(pdev),
> + ZEN_CCD_TEMP(data->ccd_offset, i), ®val))
> + continue;
> +
> if (regval & ZEN_CCD_TEMP_VALID)
> data->show_temp |= BIT(TCCD_BIT(i));
> }
Powered by blists - more mailing lists