linux-kernel - Re: regulator: deadlock vs memory reclaim


Message-ID: <08f030a2-3a6f-3ab4-1855-3016884db79d@gmail.com>
Date:   Mon, 10 Aug 2020 22:41:54 +0300
From:   Dmitry Osipenko <digetx@...il.com>
To:     Michał Mirosław <mirq-linux@...e.qmqm.pl>,
        Mark Brown <broonie@...nel.org>
Cc:     Liam Girdwood <lgirdwood@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: regulator: deadlock vs memory reclaim

10.08.2020 22:25, Michał Mirosław wrote:
>>>>> regulator_lock_dependent() starts by taking regulator_list_mutex. The
>>>>> same mutex covers eg. regulator initialization, including memory allocations
>>>>> that happen there. This will deadlock when you have filesystem on eg. eMMC
>>>>> (which uses a regulator to control module voltages) and you register
>>>>> a new regulator (hotplug a device?) when under memory pressure.
>>>> OK, that's very much a corner case, it only applies in the case of
>>>> coupled regulators.  The most obvious thing here would be to move the
>>>> allocations on registration out of the locked region, we really only
>>>> need this in the regulator_find_coupler() call I think.  If the
>>>> regulator isn't coupled we don't need to take the lock at all.
>>> Currently, regulator_lock_dependent() is called by eg. regulator_enable() and
>>> regulator_get_voltage(), so actually any regulator can deadlock this way.
>> The initialization cases that are the trigger are only done for coupled
>> regulators though AFAICT, otherwise we're not doing allocations with the
>> lock held and should be able to progress.
> 
> I caught a few lockdep complaints that suggest otherwise, but I'm still
> looking into that.

The problem looks obvious to me. regulator_init_coupling() is called
with regulator_list_mutex held, and regulator_lock_dependent() also
takes regulator_list_mutex. Hence, if reclaim is triggered from within
init_coupling(), there is a lockup:

1. mutex_lock(&regulator_list_mutex);

2. regulator_init_coupling()

3. kzalloc()

4. reclaim ...

5. regulator_get_voltage()

6. regulator_lock_dependent()

7. mutex_lock(&regulator_list_mutex);

It should be enough to keep only regulator_find_coupler() under the
lock, or even to remove the locking around init_coupling() completely.
I think it is better to keep find_coupler() protected.

Michał, does this fix your problem?

--- >8 ---
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 75ff7c563c5d..513f95c6f837 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -5040,7 +5040,10 @@ static int regulator_init_coupling(struct regulator_dev *rdev)
 	if (!of_check_coupling_data(rdev))
 		return -EPERM;

+	mutex_lock(&regulator_list_mutex);
 	rdev->coupling_desc.coupler = regulator_find_coupler(rdev);
+	mutex_unlock(&regulator_list_mutex);
+
 	if (IS_ERR(rdev->coupling_desc.coupler)) {
 		err = PTR_ERR(rdev->coupling_desc.coupler);
 		rdev_err(rdev, "failed to get coupler: %d\n", err);
@@ -5248,9 +5251,7 @@ regulator_register(const struct regulator_desc *regulator_desc,
 	if (ret < 0)
 		goto wash;

-	mutex_lock(&regulator_list_mutex);
 	ret = regulator_init_coupling(rdev);
-	mutex_unlock(&regulator_list_mutex);
 	if (ret < 0)
 		goto wash;


--- >8 ---