[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250625-icc-bw-lockdep-v3-1-2b8f8b8987c4@gmail.com>
Date: Wed, 25 Jun 2025 13:25:04 +0200
From: Gabor Juhos <j4g8y7@...il.com>
To: Georgi Djakov <djakov@...nel.org>,
Raviteja Laggyshetty <quic_rlaggysh@...cinc.com>,
Johan Hovold <johan+linaro@...nel.org>,
Bryan O'Donoghue <bryan.odonoghue@...aro.org>
Cc: linux-pm@...r.kernel.org, linux-arm-msm@...r.kernel.org,
linux-kernel@...r.kernel.org, Gabor Juhos <j4g8y7@...il.com>
Subject: [PATCH v3] interconnect: avoid memory allocation when
'icc_bw_lock' is held
The 'icc_bw_lock' mutex is introduced in commit af42269c3523
("interconnect: Fix locking for runpm vs reclaim") in order
to decouple serialization of bw aggregation from codepaths
that require memory allocation.
However commit d30f83d278a9 ("interconnect: core: Add dynamic
id allocation support") added a devm_kasprintf() call into a
path protected by the 'icc_bw_lock' which causes this lockdep
warning (at least on the IPQ9574 platform):
======================================================
WARNING: possible circular locking dependency detected
6.15.0-next-20250529 #0 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
ffffffc081df57d8 (icc_bw_lock){+.+.}-{4:4}, at: icc_init+0x8/0x108
but task is already holding lock:
ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (fs_reclaim){+.+.}-{0:0}:
fs_reclaim_acquire+0x7c/0xb8
slab_alloc_node.isra.0+0x48/0x188
__kmalloc_node_track_caller_noprof+0xa4/0x2b8
devm_kmalloc+0x5c/0x138
devm_kvasprintf+0x6c/0xb8
devm_kasprintf+0x50/0x68
icc_node_add+0xbc/0x160
icc_clk_register+0x15c/0x230
devm_icc_clk_register+0x20/0x90
qcom_cc_really_probe+0x320/0x338
nss_cc_ipq9574_probe+0xac/0x1e8
platform_probe+0x70/0xd0
really_probe+0xdc/0x3b8
__driver_probe_device+0x94/0x178
driver_probe_device+0x48/0xf0
__driver_attach+0x13c/0x208
bus_for_each_dev+0x6c/0xb8
driver_attach+0x2c/0x40
bus_add_driver+0x100/0x250
driver_register+0x68/0x138
__platform_driver_register+0x2c/0x40
nss_cc_ipq9574_driver_init+0x24/0x38
do_one_initcall+0x88/0x340
kernel_init_freeable+0x2ac/0x4f8
kernel_init+0x28/0x1e8
ret_from_fork+0x10/0x20
-> #0 (icc_bw_lock){+.+.}-{4:4}:
__lock_acquire+0x1348/0x2090
lock_acquire+0x108/0x2d8
icc_init+0x50/0x108
do_one_initcall+0x88/0x340
kernel_init_freeable+0x2ac/0x4f8
kernel_init+0x28/0x1e8
ret_from_fork+0x10/0x20
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(icc_bw_lock);
lock(fs_reclaim);
lock(icc_bw_lock);
*** DEADLOCK ***
1 lock held by swapper/0/1:
#0: ffffffc081d7db10 (fs_reclaim){+.+.}-{0:0}, at: icc_init+0x28/0x108
stack backtrace:
CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.15.0-next-20250529 #0 NONE
Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
Call trace:
show_stack+0x20/0x38 (C)
dump_stack_lvl+0x90/0xd0
dump_stack+0x18/0x28
print_circular_bug+0x334/0x448
check_noncircular+0x12c/0x140
__lock_acquire+0x1348/0x2090
lock_acquire+0x108/0x2d8
icc_init+0x50/0x108
do_one_initcall+0x88/0x340
kernel_init_freeable+0x2ac/0x4f8
kernel_init+0x28/0x1e8
ret_from_fork+0x10/0x20
The icc_node_add() functions is not designed to fail, and as such it
should not do any memory allocation. In order to avoid this, move the
name generation directly into the functions which are using the dynamic
id feature.
The change in the icc core has been tested on the IPQ9574 platform,
where it eliminates the lockdep splat indicated above. The changes in
the 'icc-rpmh' and 'osm-l3' drivers are compile tested only due to lack
of suitable hardware.
Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support")
Signed-off-by: Gabor Juhos <j4g8y7@...il.com>
---
Changes in v3:
- move memory allocation out from the icc_node_add() function directly into
the users of the dynamic id feature
- adjust commit description according to the changes
- Link to v2: https://lore.kernel.org/r/20250618-icc-bw-lockdep-v2-1-3223da346765@gmail.com
Changes in v2:
- move memory allocation outside of icc_lock
- issue a warning and return without modifying the node name in case of
memory allocation failure, and adjust the commit description
- remove offered tags from Johan and Bryan
Note: since I was not sure that that the added WARN_ON() is a substantial
change or not, I have removed the offered tags intentionally to be on the
safe side
- Link to v1: https://lore.kernel.org/r/20250529-icc-bw-lockdep-v1-1-3d714b6a9374@gmail.com
---
drivers/interconnect/core.c | 4 ----
drivers/interconnect/qcom/icc-rpmh.c | 20 ++++++++++++++++++--
drivers/interconnect/qcom/osm-l3.c | 10 +++++++++-
3 files changed, 27 insertions(+), 7 deletions(-)
diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index 1a41e59c77f85a811f78986e98401625f4cadfa3..f0bdcdcf222af133fcb0a346fa347c5a829e62e6 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -1038,10 +1038,6 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
node->avg_bw = node->init_avg;
node->peak_bw = node->init_peak;
- if (node->id >= ICC_DYN_ID_START)
- node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
- node->name, dev_name(provider->dev));
-
if (node->avg_bw || node->peak_bw) {
if (provider->pre_aggregate)
provider->pre_aggregate(node);
diff --git a/drivers/interconnect/qcom/icc-rpmh.c b/drivers/interconnect/qcom/icc-rpmh.c
index 41bfc6e7ee1d53d34b919dd8afa97698bc69d79c..fa4ef78678eff10e83557035ba572010b51ff50c 100644
--- a/drivers/interconnect/qcom/icc-rpmh.c
+++ b/drivers/interconnect/qcom/icc-rpmh.c
@@ -276,13 +276,17 @@ int qcom_icc_rpmh_probe(struct platform_device *pdev)
qcom_icc_bcm_init(qp->bcms[i], dev);
for (i = 0; i < num_nodes; i++) {
+ bool is_dyn_node = false;
+
qn = qnodes[i];
if (!qn)
continue;
if (desc->alloc_dyn_id) {
- if (!qn->node)
+ if (!qn->node) {
qn->node = icc_node_create_dyn();
+ is_dyn_node = true;
+ }
node = qn->node;
} else {
node = icc_node_create(qn->id);
@@ -293,7 +297,19 @@ int qcom_icc_rpmh_probe(struct platform_device *pdev)
goto err_remove_nodes;
}
- node->name = qn->name;
+ if (is_dyn_node) {
+ node->name = devm_kasprintf(provider->dev, GFP_KERNEL,
+ "%s@%s", qn->name,
+ dev_name(provider->dev));
+ if (!node->name) {
+ icc_node_destroy(node->id);
+ ret = -ENOMEM;
+ goto err_remove_nodes;
+ }
+ } else {
+ node->name = qn->name;
+ }
+
node->data = qn;
icc_node_add(node, provider);
diff --git a/drivers/interconnect/qcom/osm-l3.c b/drivers/interconnect/qcom/osm-l3.c
index baecbf2533f76cbf92bb2451979c4db57f8e4a2b..ed59fa73f0d70a5dfdb658ff606ef82977f04bcb 100644
--- a/drivers/interconnect/qcom/osm-l3.c
+++ b/drivers/interconnect/qcom/osm-l3.c
@@ -236,7 +236,15 @@ static int qcom_osm_l3_probe(struct platform_device *pdev)
goto err;
}
- node->name = qnodes[i]->name;
+ node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
+ qnodes[i]->name,
+ dev_name(provider->dev));
+ if (!node->name) {
+ icc_node_destroy(node->id);
+ ret = -ENOMEM;
+ goto err;
+ }
+
/* Cast away const and add it back in qcom_osm_l3_set() */
node->data = (void *)qnodes[i];
icc_node_add(node, provider);
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250529-icc-bw-lockdep-ed030d892a19
Best regards,
--
Gabor Juhos <j4g8y7@...il.com>
Powered by blists - more mailing lists