Message-ID: <b1093de5-9f62-6714-0063-7c719dc4f6ca@i2se.com>
Date: Wed, 26 Apr 2023 15:39:15 +0200
From: Stefan Wahren <stefan.wahren@...e.com>
To: Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>,
Akira Shimahara <akira215corp@...il.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Stefan Wahren <stefan.wahren@...rgebyte.com>,
regressions@...ts.linux.dev
Subject: Regression: w1_therm: sysfs w1_slave sometimes report 85 degrees
Celsius
Hi,
we recently switched our Tarragon board (i.MX6ULL) to Linux 6.1 and
noticed that the connected 1-wire temperature sensors
(w1_therm.w1_strong_pull=0) sometimes (roughly 1 in 20 reads) report 85
degrees Celsius, which is AFAIK the only way to report errors to the
1-wire master:
/sys/bus/w1/devices/28-04168158faff# cat w1_slave
50 05 4b 46 7f ff 0c 10 1c : crc=1c YES
50 05 4b 46 7f ff 0c 10 1c t=85000
I wasn't able to reproduce this issue with the old 4.9 kernel.
I then bisected the issue to this commit:
67b392f7b8ed ("w1_therm: optimizing temperature read timings")
Unfortunately this commit contains a lot of independent changes, which
makes it hard to figure out the cause of the issue. So I split the
patch into seven independent changes [1]. With that I was able to
bisect the cause further to this change [2], which seems to rework the
pullup handling within read_therm().
Looking closer at the code change and verifying it with some debug
messages, I found that the change inverted the locking behavior
(before: no pullup -> keep lock, after: no pullup -> release lock
during sleep).
Before:
	if (external_power) {
		mutex_unlock(&dev_master->bus_mutex);
		sleep_rem = msleep_interruptible(tm);
		if (sleep_rem != 0) {
			ret = -EINTR;
			goto dec_refcnt;
		}
		ret = mutex_lock_interruptible(&dev_master->bus_mutex);
		if (ret != 0)
			goto dec_refcnt;
	} else if (!w1_strong_pullup) {
		sleep_rem = msleep_interruptible(tm);
		if (sleep_rem != 0) {
			ret = -EINTR;
			goto mt_unlock;
		}
	}
After:
	if (strong_pullup) { /*some device need pullup */
		sleep_rem = msleep_interruptible(tm);
		if (sleep_rem != 0) {
			ret = -EINTR;
			goto mt_unlock;
		}
	} else { /*no device need pullup */
		mutex_unlock(&dev_master->bus_mutex);
		sleep_rem = msleep_interruptible(tm);
		if (sleep_rem != 0) {
			ret = -EINTR;
			goto dec_refcnt;
		}
		ret = mutex_lock_interruptible(&dev_master->bus_mutex);
		if (ret != 0)
			goto dec_refcnt;
	}
I don't believe this is intended. After inverting the strong_pullup
check, the issue was no longer reproducible on our platform. But I'm
not sure this is a clean fix.
Best regards
#regzbot introduced: 67b392f7b8ed
[1] - https://github.com/chargebyte/linux/commits/v6.1-tarragon_w1
[2] - https://github.com/chargebyte/linux/commit/17ca863a32a6a1bdd376959f05c954bef12fc1b5