linux-kernel - Re: Regulator regression in next-20180305

Message-ID: <20180306163035.GE13586@sirena.org.uk>
Date:   Tue, 6 Mar 2018 16:30:35 +0000
From:   Mark Brown <broonie@...nel.org>
To:     Fabio Estevam <festevam@...il.com>
Cc:     Tony Lindgren <tony@...mide.com>,
        Maciej Purski <m.purski@...sung.com>,
        linux-omap@...r.kernel.org,
        linux-kernel <linux-kernel@...r.kernel.org>,
        "moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE" 
        <linux-arm-kernel@...ts.infradead.org>
Subject: Re: Regulator regression in next-20180305

On Mon, Mar 05, 2018 at 08:22:26PM -0300, Fabio Estevam wrote:
> On Mon, Mar 5, 2018 at 8:12 PM, Tony Lindgren <tony@...mide.com> wrote:

> > Looks like with next-20180305 there's a regulator regression
> > where mmc0 won't show any cards or produces errors:

> > mmcblk0: error -110 requesting status
> > mmc1: new high speed SDIO card at address 0001
> > mmcblk0: error -110 requesting status
> > mmcblk0: recovery failed!
> > print_req_error: I/O error, dev mmcblk0, sector 0
> > Buffer I/O error on dev mmcblk0, logical block 0, async page read
> > mmcblk0: error -110 requesting status
> > mmcblk0: recovery failed!

No other error messages?  That seems like there's something going on
that's very different to what Fabio was reporting...  I'm guessing some
voltage application didn't go through but it's hard to tell with so
little data.  dra7 does seem to have what Fabio had though so there's
definitely some effect on the OMAP platforms.

> I have also seen regulator issues due to this series:
> https://lkml.org/lkml/2018/3/5/731

Looking at your stuff I'm having trouble figuring out what's going on -
we're getting double locking of a parent regulator during enable
according to your backtraces but it's not clear to me what took that
lock already.  regulator_enable() walks the supplies before it takes
the lock on the regulator it's immediately being called on, not holding
any locks on supplies while enabling.  regulator_balance_voltage() then
tries to lock the supplies again but lockdep says the lock is already
held by regulator_enable().  It's also weird that this doesn't seem to
be showing up on other boards in kernelci, the regulator setup on those
i.MX boards looks to be quite simple so I'd expect a much wider impact.

I'm wondering if your case is more pain from mutex_lock_nested(): both
regulator_lock_coupled() and regulator_lock_supply() will end up using
indexes starting at 0 for the locking classes.  That doesn't smell right,
but in case my straw clutching works, there's a test patch below.

If we can't figure it out I'll just drop the series but I'd prefer to at
least understand what's going on.

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index e685f8b94acf..2c5b20a97f51 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -159,7 +159,7 @@ static void regulator_lock_supply(struct regulator_dev *rdev)
 {
 	int i;

-	for (i = 0; rdev; rdev = rdev_get_supply(rdev), i++)
+	for (i = 1000; rdev; rdev = rdev_get_supply(rdev), i++)
 		mutex_lock_nested(&rdev->mutex, i);
 }
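
[Editor's illustration, not part of the original message.]  The theory
behind the patch above can be modelled in userspace: if the supply walk and
the coupled walk both count their mutex_lock_nested() index up from 0, the
two walks hand lockdep the same index values for different mutexes.  The
sketch below is a hypothetical model only.  struct rdev_model,
supply_subclasses(), coupled_subclasses() and shared_subclasses() are
invented names, and the shape of the coupled walk is assumed from the
description in the message rather than taken from drivers/regulator/core.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, NOT kernel code: each node stands in for a
 * struct regulator_dev, and "supply" for what rdev_get_supply() returns.
 */
struct rdev_model {
	const char *name;
	struct rdev_model *supply;
};

/*
 * Mirror the loop in regulator_lock_supply(): walk up the supply chain and
 * record the index each mutex_lock_nested() call would be given.
 * Returns the number of entries written to out[].
 */
static size_t supply_subclasses(const struct rdev_model *rdev, int base,
				int *out, size_t max)
{
	size_t n = 0;
	int i;

	assert(out != NULL);
	for (i = base; rdev && n < max; rdev = rdev->supply, i++)
		out[n++] = i;
	return n;
}

/*
 * Assumed shape of the coupled walk: one index per coupled regulator,
 * counting up from 0, as the message describes.
 */
static size_t coupled_subclasses(size_t n_coupled, int *out, size_t max)
{
	size_t n;

	assert(out != NULL);
	for (n = 0; n < n_coupled && n < max; n++)
		out[n] = (int)n;
	return n;
}

/* Count the index values that both walks would hand to lockdep. */
static size_t shared_subclasses(const int *a, size_t na,
				const int *b, size_t nb)
{
	size_t i, j, shared = 0;

	for (i = 0; i < na; i++) {
		for (j = 0; j < nb; j++) {
			if (a[i] == b[j]) {
				shared++;
				break;
			}
		}
	}
	return shared;
}
```

For a three-deep supply chain and two coupled regulators, calling
supply_subclasses() with base 0 and then shared_subclasses() reports two
shared index values; with base 1000, as in the test patch, it reports none.
This only shows that the index ranges stop overlapping, not that the
overlap is the cause of the reported lockdep splat.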
