Message-ID: <04ab699e-b344-4ba1-9ca1-04b6e50beefe@nxsw.ie>
Date: Tue, 03 Jun 2025 10:01:31 +0000
From: Bryan O'Donoghue <bod.linux@...w.ie>
To: Gabor Juhos <j4g8y7@...il.com>, Bryan O'Donoghue <bryan.odonoghue@...aro.org>, Georgi Djakov <djakov@...nel.org>, Raviteja Laggyshetty <quic_rlaggysh@...cinc.com>
Cc: linux-pm@...r.kernel.org, linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held

On 03/06/2025 10:15, Gabor Juhos wrote:
> Hello Bryan,
> 
> Sorry for the late reply, I missed your mail.
> 
> On 2025. 05. 30. 11:16, Bryan O'Donoghue wrote:
>> On 29/05/2025 15:46, Gabor Juhos wrote:
>>> The 'icc_bw_lock' mutex is introduced in commit af42269c3523
>>> ("interconnect: Fix locking for runpm vs reclaim") in order
>>> to decouple serialization of bw aggregation from codepaths
>>> that require memory allocation.
>>>
>>> However commit d30f83d278a9 ("interconnect: core: Add dynamic
>>> id allocation support") added a devm_kasprintf() call into a
>>> path protected by the 'icc_bw_lock' which causes this lockdep
>>> warning (at least on the IPQ9574 platform):
>>
>> Missing a Fixes tag.
> 
> Erm, it is before my s-o-b tag.

Great, thank you, I see that.

> ...
> 
>>> Move the memory allocation part of the code outside of the protected
>>> path to eliminate the warning. Also add a note about why it is moved
>>> there.
>>>
>>> Fixes: d30f83d278a9 ("interconnect: core: Add dynamic id allocation support")
>>> Signed-off-by: Gabor Juhos <j4g8y7@...il.com>
>>> ---
>>>    drivers/interconnect/core.c | 14 ++++++++++----
>>>    1 file changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
>>> index 1a41e59c77f85a811f78986e98401625f4cadfa3..acdb3b8f1e54942dbb1b71ec2b170b08ad709e6b 100644
>>> --- a/drivers/interconnect/core.c
>>> +++ b/drivers/interconnect/core.c
>>> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
>>>            return;
>>>
>>>        mutex_lock(&icc_lock);
>>> +
>>> +    if (node->id >= ICC_DYN_ID_START) {
>>> +        /*
>>> +         * Memory allocation must be done outside of codepaths
>>> +         * protected by icc_bw_lock.
>>> +         */
>>> +        node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
>>> +                        node->name, dev_name(provider->dev));
>>> +    }
>>> +
>>>        mutex_lock(&icc_bw_lock);
>>>
>>>        node->provider = provider;
>>> @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
>>>        node->avg_bw = node->init_avg;
>>>        node->peak_bw = node->init_peak;
>>>
>>> -    if (node->id >= ICC_DYN_ID_START)
>>> -        node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
>>> -                        node->name, dev_name(provider->dev));
>>> -
>>>        if (node->avg_bw || node->peak_bw) {
>>>            if (provider->pre_aggregate)
>>>                provider->pre_aggregate(node);
>>>
>>> ---
>>> base-commit: 5fed7fe33c2cd7104fc87b7bc699a7be892befa2
>>> change-id: 20250529-icc-bw-lockdep-ed030d892a19
>>>
>>> Best regards,
>>> --
>>> Gabor Juhos <j4g8y7@...il.com>
>>>
>>>
>>
>> The locking in this code is a mess.
>>
>> Which data structures does icc_lock protect (node* pointers, I think), and
>> which data structures does icc_bw_lock protect ("bw" data structures)?
>>
>> Hmm.
>>
>> Looking at this code I'm not sure at all what icc_lock was introduced to do.
> 
> Initially, only the 'icc_lock' mutex was here, and that protected 'everything'.
> The 'icc_bw_lock' has been introduced later by commit af42269c3523
> ("interconnect: Fix locking for runpm vs reclaim") as part of the
> "drm/msm+PM+icc: Make job_run() reclaim-safe" series [1].
> 
> Here is the reason copied from the original commit message:
> 
>      "For cases where icc_bw_set() can be called in callbaths that could
>      deadlock against shrinker/reclaim, such as runpm resume, we need to
>      decouple the icc locking.  Introduce a new icc_bw_lock for cases where
>      we need to serialize bw aggregation and update to decouple that from
>      paths that require memory allocation such as node/link creation/
>      destruction."

Right, but reading this code:

icc_set_bw():
icc_bw_lock - locks, protects struct icc_node *

icc_put():
icc_lock - locks
icc_bw_lock - locks directly after, protects struct icc_node *

icc_node_add(), current:
icc_lock - locks
icc_bw_lock - locks
     node->name = devm_kasprintf();

After your change:

icc_node_add(), patched:
icc_lock - locks
     node->name = devm_kasprintf();
icc_bw_lock - locks
     owns node->provider - or whatever

And this is what is prompting my question: which locks own which data here?

I think we should sort that out, either by removing one of the locks or 
by at the very least documenting beside the mutex declarations which 
locks protect what.
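
To make the documentation idea concrete, here is a rough userspace sketch
of what I mean by stating ownership beside the mutex declarations. It is
only an analogy: outer_lock, bw_lock, struct demo_node and demo_node_add()
are hypothetical names using pthreads, not the interconnect code, and the
ownership split in the comments is an assumption for illustration, not a
statement of what icc_lock and icc_bw_lock actually protect.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Userspace analogy only: all names here are hypothetical stand-ins, and
 * the ownership comments describe an assumed split, not the kernel's rules.
 */

/* outer_lock: serializes node registration; allocation is allowed under it. */
static pthread_mutex_t outer_lock = PTHREAD_MUTEX_INITIALIZER;

/* bw_lock: protects provider, avg_bw and peak_bw; never allocate under it. */
static pthread_mutex_t bw_lock = PTHREAD_MUTEX_INITIALIZER;

struct demo_node {
	char *name;            /* written under outer_lock only */
	const char *provider;  /* written under bw_lock */
	unsigned int avg_bw;   /* written under bw_lock */
	unsigned int peak_bw;  /* written under bw_lock */
};

/* Returns 0 on success, -1 if building the name fails. */
static int demo_node_add(struct demo_node *node, const char *base,
			 const char *provider)
{
	int len, ret = 0;
	char *name;

	pthread_mutex_lock(&outer_lock);

	/* Allocate while holding only outer_lock, before bw_lock is taken. */
	len = snprintf(NULL, 0, "%s@%s", base, provider);
	if (len < 0) {
		ret = -1;
		goto out;
	}
	name = malloc((size_t)len + 1);
	if (!name) {
		ret = -1;
		goto out;
	}
	snprintf(name, (size_t)len + 1, "%s@%s", base, provider);
	node->name = name;

	/* Lock order: outer_lock first, then bw_lock. */
	pthread_mutex_lock(&bw_lock);
	node->provider = provider;
	node->avg_bw = 0;
	node->peak_bw = 0;
	pthread_mutex_unlock(&bw_lock);
out:
	pthread_mutex_unlock(&outer_lock);
	return ret;
}
```

A caller would zero-initialize a struct demo_node, call
demo_node_add(&node, "base", "provider"), and free(node.name) when done.
The point is only that each field names the lock that owns it, so a
reviewer can check an allocation against the comment instead of guessing.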

---
bod



>> Can we not just drop it entirely ?
> 
> I'm not an expert in locking, but I doubt that we can easily drop any of the two
> mutexes without reintroducing the problem fixed by the change mentioned above.
> 
> [1] https://lore.kernel.org/all/20230807171148.210181-1-robdclark@gmail.com/
> 
> Regards,
> Gabor
> 
> 

Right - if this were a struct we would declare what these individual 
mutexes lock.

That's my question here as I review this code: which mutex protects what?

I don't think that is particularly clear.

