Message-ID: <20250929142856.540590-1-a.shimko.dev@gmail.com>
Date: Mon, 29 Sep 2025 17:28:55 +0300
From: Artem Shimko <artyom.shimko@...il.com>
To: Sudeep Holla <sudeep.holla@....com>,
Cristian Marussi <cristian.marussi@....com>
Cc: a.shimko.dev@...il.com,
arm-scmi@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: [PATCH] drivers: scmi: Add completion timeout handling for raw mode transfers
Fix a race condition in the SCMI raw mode implementation by waiting for
transfer completion with a timeout. Multiple tests [1] in the SCMI test
suite were failing because the SCMI_XFER_FLAG_IS_RAW flag was cleared
too early in scmi_xfer_raw_put().
TRANS=raw
PROTOCOLS=base,clock,power_domain,performance,system_power,sensor,
voltage,reset,powercap,pin_control VERBOSE=5
The root cause:
Tests were failing on poll() because scmi_raw_message_report() returned
early on this condition:
if (!raw || (idx == SCMI_RAW_REPLY_QUEUE && !SCMI_XFER_IS_RAW(xfer)))
return;
The SCMI_XFER_FLAG_IS_RAW flag was cleared before the transfer
completion had been reported, so the reply was dropped, poll() returned
only on timeout, and the tests failed.
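To make the failure mode concrete, here is a minimal userspace sketch of
that guard. It is not kernel code: the model_* names and the constant
values are hypothetical stand-ins for the SCMI definitions, used only to
show that a reply reported after the flag has been cleared is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel definitions; values are illustrative. */
enum { MODEL_RAW_REPLY_QUEUE = 0, MODEL_RAW_NOTIF_QUEUE = 1 };
#define MODEL_XFER_FLAG_IS_RAW (1u << 0)

struct model_xfer {
	unsigned int flags;
};

/*
 * Mirrors the quoted guard: returns true when the report would return
 * early, meaning nothing is queued for the poll() waiter.
 */
static bool model_report_is_dropped(bool have_raw, int idx,
				    const struct model_xfer *xfer)
{
	assert(xfer != NULL);
	return !have_raw || (idx == MODEL_RAW_REPLY_QUEUE &&
			     !(xfer->flags & MODEL_XFER_FLAG_IS_RAW));
}
```

With the flag set, a report on the reply queue is delivered. Once the
flag is cleared, the same report returns early, so the waiter only sees
a timeout.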
Changes implemented:
1. Add completion wait with timeout in scmi_xfer_raw_worker()
2. Signal completion in scmi_raw_message_report()
This ensures:
- Proper synchronization between transfer completion and flag clearing
- Prevention of indefinite blocking with timeout safety mechanism
- Stable test execution by maintaining correct flag states
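The wait/signal pairing can be sketched in userspace with a mutex and a
condition variable. This is an illustrative model only, not the kernel's
implementation: the model_* names are hypothetical, and the model takes
its timeout in milliseconds, whereas the kernel's
wait_for_completion_timeout() takes jiffies.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace analogue of the kernel's struct completion. */
struct model_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned int done;
};

static void model_init_completion(struct model_completion *c)
{
	int rc;

	rc = pthread_mutex_init(&c->lock, NULL);
	assert(rc == 0);
	rc = pthread_cond_init(&c->cond, NULL);
	assert(rc == 0);
	(void)rc;
	c->done = 0;
}

/* Analogue of complete(): record one completion and wake one waiter. */
static void model_complete(struct model_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/*
 * Analogue of wait_for_completion_timeout(), but in milliseconds.
 * Returns true if a completion was consumed, false on timeout.
 */
static bool model_wait_for_completion_timeout(struct model_completion *c,
					      long timeout_ms)
{
	struct timespec deadline;
	bool ok;
	int err = 0;

	/* Build an absolute deadline, normalising the nanosecond field. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&c->lock);
	/* Loop to absorb spurious wakeups; stop on timeout or error. */
	while (!c->done && err == 0)
		err = pthread_cond_timedwait(&c->cond, &c->lock, &deadline);
	ok = c->done > 0;
	if (ok)
		c->done--;
	pthread_mutex_unlock(&c->lock);
	return ok;
}
```

In this model the worker side would call
model_wait_for_completion_timeout() before releasing the transfer, and
the report side would call model_complete() once the reply is queued. A
wait with no matching signal returns false after the timeout instead of
blocking forever.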
An example of a random test failure:
817: Voltage get ext name for invalid domain
[Check 1] Get extended name for invalid domain
MSG HDR : 0x04585c09
NUM PARAM : 1
PARAMETER[00] : 0x0000000c
CHECK STATUS : PASSED [SCMI_NOT_FOUND_ERR]
CHECK HEADER : PASSED [0x04585c09]
RETURN COUNT : 0
NUM DOMAINS : 11
VOLTAGE DOMAIN : 0
[Check 2] Get extended name for unsupp. domain
MSG HDR : 0x045c5c09
NUM PARAM : 1
PARAMETER[00] : 0x00000000
CHECK STATUS : FAILED
EXPECTED : SCMI_NOT_FOUND_ERR
RECEIVED : SCMI_GENERIC_ERROR : NON CONFORMANT
With these changes applied, the tests no longer fail:
mount -t debugfs none /sys/kernel/debug
scmi_test_agent
[ 127.865032] arm-scmi arm-scmi.1.auto: Resetting SCMI Raw stack.
[ 128.360503] arm-scmi arm-scmi.1.auto: Using Base channel for protocol 0x12
tail -n 6 arm_scmi_test_log.txt
****************************************************
TOTAL TESTS: 167 PASSED: 120 FAILED: 0 SKIPPED: 47
****************************************************
Link [1] https://gitlab.arm.com/tests/scmi-tests/-/releases
Signed-off-by: Artem Shimko <a.shimko.dev@...il.com>
---
Hello maintainers and reviewers,
This patch addresses a race condition in the SCMI raw mode implementation
that was causing multiple test failures in the SCMI test suite.
The issue manifested as poll() timeouts in tests when using raw mode
transfers. The root cause was premature completion signaling and
SCMI_XFER_FLAG_IS_RAW flag clearing before transfers were fully
acknowledged.
Thank you for your consideration.
Best regards,
Artem Shimko
drivers/firmware/arm_scmi/raw_mode.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/firmware/arm_scmi/raw_mode.c b/drivers/firmware/arm_scmi/raw_mode.c
index 73db5492ab44..130d45192beb 100644
--- a/drivers/firmware/arm_scmi/raw_mode.c
+++ b/drivers/firmware/arm_scmi/raw_mode.c
@@ -468,6 +468,14 @@ static void scmi_xfer_raw_worker(struct work_struct *work)
 		ret = scmi_xfer_raw_wait_for_message_response(cinfo, xfer,
 							      timeout_ms);
+		if (!ret && !wait_for_completion_timeout(&xfer->done,
+					msecs_to_jiffies(timeout_ms))) {
+			dev_err(dev,
+				"timed out in RAW resp - HDR:%08X\n",
+				pack_scmi_header(&xfer->hdr));
+			ret = -ETIMEDOUT;
+		}
+
 		if (!ret && xfer->hdr.status)
 			ret = scmi_to_linux_errno(xfer->hdr.status);
@@ -1381,6 +1389,8 @@ void scmi_raw_message_report(void *r, struct scmi_xfer *xfer,
 	if (!raw || (idx == SCMI_RAW_REPLY_QUEUE && !SCMI_XFER_IS_RAW(xfer)))
 		return;
+	complete(&xfer->done);
+
 	dev = raw->handle->dev;
 	q = scmi_raw_queue_select(raw, idx,
 				  SCMI_XFER_IS_CHAN_SET(xfer) ? chan_id : 0);
--
2.43.0