lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aFK2Kl2I46dTYBI1@hovoldconsulting.com>
Date: Wed, 18 Jun 2025 14:50:50 +0200
From: Johan Hovold <johan@...nel.org>
To: Bryan O'Donoghue <bod.linux@...w.ie>,
	Rob Clark <rob.clark@....qualcomm.com>
Cc: Gabor Juhos <j4g8y7@...il.com>,
	Bryan O'Donoghue <bryan.odonoghue@...aro.org>,
	Georgi Djakov <djakov@...nel.org>,
	Raviteja Laggyshetty <quic_rlaggysh@...cinc.com>,
	linux-pm@...r.kernel.org, linux-arm-msm@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock'
 is held

[ +CC: Rob ]

On Tue, Jun 03, 2025 at 10:01:31AM +0000, Bryan O'Donoghue wrote:
> On 03/06/2025 10:15, Gabor Juhos wrote:

> > 2025. 05. 30. 11:16 keltezéssel, Bryan O'Donoghue írta:
> >> On 29/05/2025 15:46, Gabor Juhos wrote:
> >>> The 'icc_bw_lock' mutex is introduced in commit af42269c3523
> >>> ("interconnect: Fix locking for runpm vs reclaim") in order
> >>> to decouple serialization of bw aggregation from codepaths
> >>> that require memory allocation.
> >>>
> >>> However commit d30f83d278a9 ("interconnect: core: Add dynamic
> >>> id allocation support") added a devm_kasprintf() call into a
> >>> path protected by the 'icc_bw_lock' which causes this lockdep
> >>> warning (at least on the IPQ9574 platform):

> >>> Move the memory allocation part of the code outside of the protected
> >>> path to eliminate the warning. Also add a note about why it is moved
> >>> to there,

> >>> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct
> >>> icc_provider *provider)
> >>>            return;
> >>>
> >>>        mutex_lock(&icc_lock);
> >>> +
> >>> +    if (node->id >= ICC_DYN_ID_START) {
> >>> +        /*
> >>> +         * Memory allocation must be done outside of codepaths
> >>> +         * protected by icc_bw_lock.
> >>> +         */
> >>> +        node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
> >>> +                        node->name, dev_name(provider->dev));
> >>> +    }
> >>> +
> >>>        mutex_lock(&icc_bw_lock);
> >>>
> >>>        node->provider = provider;
> >>> @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct
> >>> icc_provider *provider)
> >>>        node->avg_bw = node->init_avg;
> >>>        node->peak_bw = node->init_peak;
> >>>
> >>> -    if (node->id >= ICC_DYN_ID_START)
> >>> -        node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
> >>> -                        node->name, dev_name(provider->dev));
> >>> -
> >>>        if (node->avg_bw || node->peak_bw) {
> >>>            if (provider->pre_aggregate)
> >>>                provider->pre_aggregate(node);

> >> The locking in this code is a mess.
> >>
> >> Which data-structures does icc_lock protect node* pointers I think and which
> >> data-structures does icc_bw_lock protect - "bw" data structures ?
> >>
> >> Hmm.
> >>
> >> Looking at this code I'm not sure at all what icc_lock was introduced to do.
> > 
> > Initially, only the 'icc_lock' mutex was here, and that protected 'everything'.
> > The 'icc_bw_lock' has been introduced later by commit af42269c3523
> > ("interconnect: Fix locking for runpm vs reclaim") as part of the
> > "drm/msm+PM+icc: Make job_run() reclaim-safe" series [1].
> > 
> > Here is the reason copied from the original commit message:
> > 
> >      "For cases where icc_bw_set() can be called in callbaths that could
> >      deadlock against shrinker/reclaim, such as runpm resume, we need to
> >      decouple the icc locking.  Introduce a new icc_bw_lock for cases where
> >      we need to serialize bw aggregation and update to decouple that from
> >      paths that require memory allocation such as node/link creation/
> >      destruction."
> 
> Right but reading this code.
> 
> icc_set_bw();
> icc_lock_bw - protects struct icc_node *
> 
> icc_put();
> icc_lock - locks
> icc_lock_bw -locks directly after protects struct icc_node *
> 
> icc_node_add current:
> icc_lock - locks
> icc_lock_bw - locks
>      node->name = devm_kasprintf();
> 
> After your change
> 
> icc_node_add current:
> icc_lock - locks
>      node->name = devm_kasprintf();
> icc_lock_bw - locks
>      owns node->provider - or whatever
> 
> And this is what is prompting my question. Which locks own which data here ?
> 
> I think we should sort that out, either by removing one of the locks or 
> by at the very least documenting beside the mutex declarations which 
> locks protect what.

Feel free to discuss that with Rob who added the icc_lock_bw, but it's
unrelated to the regression at hand (and should not block fixing it).

Allocations cannot be done while holding the icc_lock_bw, and this fix
is correct in moving the allocation (also note that the node has not
been added yet).

Johan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ