lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211004085509.iikxtdvxpt6bri5c@skn-laptop>
Date:   Mon, 4 Oct 2021 10:55:09 +0200
From:   Sean Nyekjaer <sean@...nix.com>
To:     Boris Brezillon <boris.brezillon@...labora.com>
Cc:     Miquel Raynal <miquel.raynal@...tlin.com>,
        Richard Weinberger <richard@....at>,
        Vignesh Raghavendra <vigneshr@...com>,
        Boris Brezillon <bbrezillon@...nel.org>,
        linux-mtd@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] mtd: rawnand: use mutex to protect access while in
 suspend

On Mon, Oct 04, 2021 at 10:41:47AM +0200, Boris Brezillon wrote:
> On Mon,  4 Oct 2021 08:56:09 +0200
> Sean Nyekjaer <sean@...nix.com> wrote:
> 
> > This will prevent nand_get_device() from returning -EBUSY.
> > It will force mtd_write()/mtd_read() to wait for the nand_resume() to unlock
> > access to the mtd device.
> > 
> > Then we avoid -EBUSY is returned to ubifsi via mtd_write()/mtd_read(),
> > that will in turn hard error on every error returened.
> > We have seen during ubifs tries to call mtd_write before the mtd device
> > is resumed.
> 
> I think the problem is here. Why would UBIFS/UBI try to write something
> to a device that's not resumed yet (or has been suspended already, if
> you hit this in the suspend path).
> 
> > 
> > Exec_op[0] speed things up, so we see this race when the device is
> > resuming. But it's actually "mtd: rawnand: Simplify the locking" that
> > allows it to return -EBUSY, before that commit it would have waited for
> > the mtd device to resume.
> 
> Uh, wait. If nand_resume() was called before any writes/reads this
> wouldn't happen. IMHO, the problem is not that we return -EBUSY without
> blocking, the problem is that someone issues a write/read before calling
> mtd_resume().
> 

The commit msg from "mtd: rawnand: Simplify the locking" states this clearly.

"""
Last important change to mention: we now return -EBUSY when someone
tries to access a device that as been suspended, and propagate this
error to the upper layer.
"""

IMHO "mtd: rawnand: Simplify the locking" should never had been merged
before the upper layers was fixed to handle -EBUSY. ;)
Which they still not are...

Yes, guess there is data in the ubifs queue when going into suspend,
then the ubifs kthread is starting writing when the cpu resumes.
Before mtd_resume() and other pm_resume() handles are called.

How would you have ubifs to wait for mtd_resume()? If you don't like
this mutex solution?

> > 
> > Tested on a iMX6ULL.
> > 
> > [0]:
> > ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op")
> > 
> > Fixes: 013e6292aaf5 ("mtd: rawnand: Simplify the locking")
> > Signed-off-by: Sean Nyekjaer <sean@...nix.com>
> > ---
> > 
> > I did this a RFC as we probably will need to remove the suspended
> > variable as it's kinda made obsolute by this change.
> > Should we introduce a new mutex? Or maybe a spin_lock?
> > 
> >  drivers/mtd/nand/raw/nand_base.c | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
> > index 3d6c6e880520..0ea343404cac 100644
> > --- a/drivers/mtd/nand/raw/nand_base.c
> > +++ b/drivers/mtd/nand/raw/nand_base.c
> > @@ -4567,7 +4567,6 @@ static int nand_suspend(struct mtd_info *mtd)
> >  		ret = chip->ops.suspend(chip);
> >  	if (!ret)
> >  		chip->suspended = 1;
> > -	mutex_unlock(&chip->lock);
> 
> Hm, I'm not sure keeping the lock when you're in a suspended state
> is a good idea. It just papers over another bug IMO (see above).
> 
> >  
> >  	return ret;
> >  }
> > @@ -4580,7 +4579,6 @@ static void nand_resume(struct mtd_info *mtd)
> >  {
> >  	struct nand_chip *chip = mtd_to_nand(mtd);
> >  
> > -	mutex_lock(&chip->lock);
> >  	if (chip->suspended) {
> >  		if (chip->ops.resume)
> >  			chip->ops.resume(chip);
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ