lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 4 Oct 2021 11:58:17 +0200
From:   Boris Brezillon <boris.brezillon@...labora.com>
To:     Sean Nyekjaer <sean@...nix.com>
Cc:     Miquel Raynal <miquel.raynal@...tlin.com>,
        Richard Weinberger <richard@....at>,
        Vignesh Raghavendra <vigneshr@...com>,
        Boris Brezillon <bbrezillon@...nel.org>,
        linux-mtd@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] mtd: rawnand: use mutex to protect access while in
 suspend

On Mon, 4 Oct 2021 10:55:09 +0200
Sean Nyekjaer <sean@...nix.com> wrote:

> On Mon, Oct 04, 2021 at 10:41:47AM +0200, Boris Brezillon wrote:
> > On Mon,  4 Oct 2021 08:56:09 +0200
> > Sean Nyekjaer <sean@...nix.com> wrote:
> >   
> > > This will prevent nand_get_device() from returning -EBUSY.
> > > It will force mtd_write()/mtd_read() to wait for the nand_resume() to unlock
> > > access to the mtd device.
> > > 
> > > Then we avoid -EBUSY is returned to ubifsi via mtd_write()/mtd_read(),
> > > that will in turn hard error on every error returened.
> > > We have seen during ubifs tries to call mtd_write before the mtd device
> > > is resumed.  
> > 
> > I think the problem is here. Why would UBIFS/UBI try to write something
> > to a device that's not resumed yet (or has been suspended already, if
> > you hit this in the suspend path).
> >   
> > > 
> > > Exec_op[0] speed things up, so we see this race when the device is
> > > resuming. But it's actually "mtd: rawnand: Simplify the locking" that
> > > allows it to return -EBUSY, before that commit it would have waited for
> > > the mtd device to resume.  
> > 
> > Uh, wait. If nand_resume() was called before any writes/reads this
> > wouldn't happen. IMHO, the problem is not that we return -EBUSY without
> > blocking, the problem is that someone issues a write/read before calling
> > mtd_resume().
> >   
> 
> The commit msg from "mtd: rawnand: Simplify the locking" states this clearly.
> 
> """
> Last important change to mention: we now return -EBUSY when someone
> tries to access a device that as been suspended, and propagate this
> error to the upper layer.
> """
> 
> IMHO "mtd: rawnand: Simplify the locking" should never had been merged
> before the upper layers was fixed to handle -EBUSY. ;)
> Which they still not are...

That's not really the problem here. Upper layers should never get
-EBUSY in the first place if the MTD device was resumed before the UBI
device. Looks like we have a missing UBI -> MTD parenting link, which
would explain why things don't get resumed in the right order. Can you
try with the following diff applied?

---
diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
index f399edc82191..1981ce8f3a26 100644
--- a/drivers/mtd/ubi/build.c
+++ b/drivers/mtd/ubi/build.c
@@ -905,6 +905,7 @@ int ubi_attach_mtd_dev(struct mtd_info *mtd, int
ubi_num, ubi->dev.release = dev_release;
        ubi->dev.class = &ubi_class;
        ubi->dev.groups = ubi_dev_groups;
+       ubi->dev.parent = &mtd->dev;
 
        ubi->mtd = mtd;
        ubi->ubi_num = ubi_num;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ