Message-ID: <1a107aaf-c1c1-46af-98a8-03eae1bb9db2@gmx.de>
Date: Mon, 12 Jan 2026 18:41:45 +0100
From: Armin Wolf <W_Armin@....de>
To: TINSAE TADESSE <tinsaetadesse2015@...il.com>
Cc: linux@...ck-us.net, linux-hwmon@...r.kernel.org,
 linux-kernel@...r.kernel.org, bhelgaas@...gle.com, linux-pci@...r.kernel.org
Subject: Re: [PATCH 1/3] hwmon: spd5118: Do not fail resume on temporary I2C
 errors

On 12.01.26 at 12:48, TINSAE TADESSE wrote:

> On Sun, Jan 11, 2026 at 1:27 AM Armin Wolf <W_Armin@....de> wrote:
>> On 10.01.26 at 18:19, Tinsae Tadesse wrote:
>>
>>> SPD5118 DDR5 temperature sensors may be temporarily unavailable
>>> during s2idle resume. Ignore temporary -ENXIO and -EIO errors during resume and allow
>>> register synchronization to be retried later.
>> Hi,
>>
>> do you know if the error is caused by the SPD5118 device itself or by the underlying
>> i2c controller? Please also share the output of "acpidump" and the name of the i2c
>> controller used to communicate with the SPD5118.
>>
>>> Signed-off-by: Tinsae Tadesse <tinsaetadesse2015@...il.com>
>>> ---
>>>    drivers/hwmon/spd5118.c | 8 +++++++-
>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/hwmon/spd5118.c b/drivers/hwmon/spd5118.c
>>> index 5da44571b6a0..ec9f14f6e0df 100644
>>> --- a/drivers/hwmon/spd5118.c
>>> +++ b/drivers/hwmon/spd5118.c
>>> @@ -512,9 +512,15 @@ static int spd5118_resume(struct device *dev)
>>>    {
>>>        struct spd5118_data *data = dev_get_drvdata(dev);
>>>        struct regmap *regmap = data->regmap;
>>> +     int ret;
>>>
>>>        regcache_cache_only(regmap, false);
>>> -     return regcache_sync(regmap);
>>> +     ret = regcache_sync(regmap);
>>> +     if (ret == -ENXIO || ret == -EIO) {
>>> +             dev_warn(dev, "SPD hub not responding on resume (%d), deferring init\n", ret);
>>> +             return 0;
>>> +     }
>> The specification says that the SPD5118 might take up to 10ms to initialize its i2c interface
>> after power on. Can you test if simply waiting for 10ms before syncing the regcache solves this
>> issue?
>>
>> Thanks,
>> Armin Wolf
>>
>>> +     return ret;
>>>    }
>>>
>>>    static DEFINE_SIMPLE_DEV_PM_OPS(spd5118_pm_ops, spd5118_suspend, spd5118_resume);
> Hi Armin,
>
>> Do you know if the error is caused by the SPD5118 device itself or by the underlying i2c controller?
> The error appears to be caused by the underlying I2C controller and platform
> power sequencing rather than by the SPD5118 device itself.
>
> The failure manifests as a temporary -ENXIO occurring only during s2idle
> resume. The SPD5118 temperature sensor works correctly before suspend and
> after resume once the bus becomes available again. This indicates that the
> driver’s resume callback may be invoked before the I2C controller or firmware
> has fully re-enabled access to the SPD hub.
>
>> Please also share the output of "acpidump" and the name of the i2c
>> controller used to communicate with the SPD5118.
>
> I have attached the output of acpidump as requested.
> The SPD5118 is connected via I2C bus 14 and accessed through the Intel
> I801 SMBus controller (0000:00:1f.4), which is ACPI-managed.

Interesting, the ACPI code seems to do two things when the i2c controller suspends (aka is put into D3):

1. An unknown register 0x84 ("PMEC") is modified
2. The PCI BAR of the i2c controller is disabled

Since the PCI BAR is not re-enabled during resume, I suspect that either the firmware
is buggy or that the firmware relies on the operating system to restore any BAR settings
when resuming.

I do not know how the PCI core handles suspend, so I CCed the associated maintainers.
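
One way to check this from userspace would be to compare a snapshot of the controller's
PCI configuration space (for example the sysfs "config" file of 0000:00:1f.4) taken before
suspend with one taken after resume. Below is a minimal sketch in plain C, not kernel code.
The helper names are hypothetical; only the register offsets and COMMAND bits are the
standard PCI ones. It assumes the snapshot covers at least the first 64 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI configuration-space offsets and COMMAND register bits. */
#define PCI_COMMAND        0x04
#define PCI_COMMAND_IO     0x1  /* device responds to I/O space accesses */
#define PCI_COMMAND_MEMORY 0x2  /* device responds to memory space accesses */
#define PCI_BASE_ADDRESS_0 0x10

/* Read little-endian values from a config-space snapshot. */
static uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static uint32_t cfg_read32(const uint8_t *cfg, size_t off)
{
	return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Hypothetical helper: address programmed into a 32-bit BAR, flag bits masked off. */
static uint32_t bar_address(const uint8_t *cfg, unsigned int bar)
{
	uint32_t val;

	assert(cfg != NULL && bar < 6);
	val = cfg_read32(cfg, PCI_BASE_ADDRESS_0 + 4 * bar);
	/* Bit 0 set means I/O BAR (2 flag bits), otherwise memory BAR (4 flag bits). */
	return (val & 0x1) ? (val & ~(uint32_t)0x3) : (val & ~(uint32_t)0xf);
}

/* Hypothetical helper: is decoding enabled for the address space this BAR uses? */
static int bar_decoding_enabled(const uint8_t *cfg, unsigned int bar)
{
	uint32_t val;
	uint16_t cmd;

	assert(cfg != NULL && bar < 6);
	val = cfg_read32(cfg, PCI_BASE_ADDRESS_0 + 4 * bar);
	cmd = cfg_read16(cfg, PCI_COMMAND);
	if (val & 0x1)
		return (cmd & PCI_COMMAND_IO) != 0;
	return (cmd & PCI_COMMAND_MEMORY) != 0;
}
```

If bar_decoding_enabled() or bar_address() gives a different result after resume than
before suspend, that would suggest nobody restored the BAR settings.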

>> Can you test if simply waiting for 10ms before syncing the regcache solves this
>> issue?
>
> I tested adding an explicit msleep(10) in spd5118_resume() before calling
> regcache_sync(), for the I2C interface to become ready after power-on.
> With this delay in place, the resume failures (-ENXIO during regcache sync)
> no longer occur, and repeated suspend/resume cycles are completed successfully.
>
> However, relying on a fixed delay in the resume path is not robust and would
> not be suitable across platforms with different firmware and power sequencing.
> It also still performs hardware I/O during PM resume.

In this case the 10 ms delay is OK since the specification of the SPD5118 device explicitly
states that the device needs those 10 ms to become operational after losing power.
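
To illustrate the ordering I have in mind, here is a small userspace mock in C. It is only
a sketch and not the driver code: struct mock_spd, mock_msleep(), mock_regcache_sync() and
mock_resume() are hypothetical stand-ins for the device, msleep(), regcache_sync() and
spd5118_resume(), and time is simulated instead of measured.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the device: its I2C interface only answers
 * once ready_after_ms of simulated time have passed since power-on.
 */
struct mock_spd {
	unsigned int now_ms;         /* simulated time since power was restored */
	unsigned int ready_after_ms; /* initialization time of the I2C interface */
	int synced;                  /* set once the register cache was written back */
};

/* Advance the simulated clock instead of really sleeping. */
static void mock_msleep(struct mock_spd *dev, unsigned int ms)
{
	dev->now_ms += ms;
}

/* Fails with -ENXIO while the device is still initializing. */
static int mock_regcache_sync(struct mock_spd *dev)
{
	if (dev->now_ms < dev->ready_after_ms)
		return -ENXIO;
	dev->synced = 1;
	return 0;
}

/* Resume path: wait delay_ms after power-on, then sync the cache. */
static int mock_resume(struct mock_spd *dev, unsigned int delay_ms)
{
	assert(dev != NULL);
	dev->now_ms = 0;  /* power has just been restored */
	dev->synced = 0;
	mock_msleep(dev, delay_ms);
	return mock_regcache_sync(dev);
}
```

With ready_after_ms set to 10, mock_resume(&dev, 0) returns -ENXIO while
mock_resume(&dev, 10) succeeds, which matches what you observed with msleep(10).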

> Additional evidence comes from running sensors, where all the temperature
> limit and alarm attributes fail with “Can’t read” and temp1 reports N/A,
> after adding msleep(10). All hwmon attributes (temperature input,
> limits, and alarms) fail uniformly, which points to a bus-level access
> failure rather than an issue with specific SPD5118 registers.

Strange, what kind of error is reported when accessing those sysfs attributes? Can you still
access the nvmem part of the SPD5118 device?

Can you also check if accessing tempX_enable works? If yes then please try to set this
attribute to "1" if it is still set to "0".

Additionally, please use "i2cdump" or "i2cdetect" to check if other i2c devices on the same
bus are also affected by this.

> This supports deferring regcache synchronization and avoiding I2C transactions
> in the resume callback, since userspace may attempt to access hwmon
> attributes before the bus or device is ready.

As already stated by Guenter, the root cause might be the i2c controller itself. Having
this deferred regcache sync only acts as a workaround, but we strongly prefer having a
real solution.

Thanks,
Armin Wolf

