Date: Wed, 9 Sep 2015 16:49:36 -0700
From: Brian Norris <computersforpeace@...il.com>
To: Alex Smith <alex.smith@...tec.com>
Cc: linux-mtd@...ts.infradead.org, Alex Smith <alex@...x-smith.me.uk>,
	Zubair Lutfullah Kakakhel <Zubair.Kakakhel@...tec.com>,
	David Woodhouse <dwmw2@...radead.org>, linux-kernel@...r.kernel.org,
	Niklas Cassel <niklass@...s.com>
Subject: Re: [PATCH v5 1/4] mtd: nand: increase ready wait timeout and report timeouts

+ Niklas

On Tue, Sep 08, 2015 at 10:10:50AM +0100, Alex Smith wrote:
> If nand_wait_ready() times out, this is silently ignored, and its
> caller will then proceed to read from/write to the chip before it is
> ready. This can potentially result in corruption with no indication as
> to why.
>
> While a 20ms timeout seems like it should be plenty enough, certain
> behaviour can cause it to timeout much earlier than expected. The
> situation which prompted this change was that CPU 0, which is
> responsible for updating jiffies, was holding interrupts disabled
> for a fairly long time while writing to the console during a printk,
> causing several jiffies updates to be delayed. If CPU 1 happens to
> enter the timeout loop in nand_wait_ready() just before CPU 0 re-
> enables interrupts and updates jiffies, CPU 1 will immediately time
> out when the delayed jiffies updates are made. The result of this is
> that nand_wait_ready() actually waits less time than the NAND chip
> would normally take to be ready, and then read_page() proceeds to
> read out bad data from the chip.
>
> The situation described above may seem unlikely, but in fact it can be
> reproduced almost every boot on the MIPS Creator Ci20.
>
> Debugging this was made more difficult by the misleading comment above
> nand_wait_ready() stating "The timeout is caught later" - no timeout
> was ever reported, leading me away from the real source of the problem.
>
> Therefore, this patch increases the timeout to 200ms. This should be
> enough to cover cases where jiffies updates get delayed. Additionally,
> add a pr_warn() when a timeout does occur so that it is easier to
> pinpoint any problems in future caused by the chip not becoming ready.

Did you examine other solutions? I've seen patches for hrtimer support
previously:

http://patchwork.ozlabs.org/patch/160333/
http://patchwork.ozlabs.org/patch/431066/

A few things have been cleaned up since then, so some of the initial
objections to the hrtimer patch don't make sense anymore, I believe.

Anyway, I think just increasing the timeout looks OK to me (as long as
we never have a 200ms jiffies jump... can this happen??), so hrtimer may
be over-engineering. I just want to make sure both options have been
considered before officially choosing one over the other.

Brian

> Signed-off-by: Alex Smith <alex.smith@...tec.com>
> Reviewed-by: Ezequiel Garcia <ezequiel@...guardiasur.com.ar>
> Cc: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@...tec.com>
> Cc: David Woodhouse <dwmw2@...radead.org>
> Cc: Brian Norris <computersforpeace@...il.com>
> Cc: linux-mtd@...ts.infradead.org
> Cc: linux-kernel@...r.kernel.org
> ---
> v4 -> v5:
>  - Remove spurious change.
>  - Add Ezequiel's Reviewed-by.
>
> v3 -> v4:
>  - New patch to fix issue encountered in external Ci20 3.18 kernel
>    branch which also applies upstream.
> ---
>  drivers/mtd/nand/nand_base.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
> index ceb68ca8277a..07b831b94e5c 100644
> --- a/drivers/mtd/nand/nand_base.c
> +++ b/drivers/mtd/nand/nand_base.c
> @@ -543,11 +543,16 @@ static void panic_nand_wait_ready(struct mtd_info *mtd, unsigned long timeo)
>  	}
>  }
>
> -/* Wait for the ready pin, after a command. The timeout is caught later. */
> +/**
> + * nand_wait_ready - [GENERIC] Wait for the ready pin after commands.
> + * @mtd: MTD device structure
> + *
> + * Wait for the ready pin after a command, and warn if a timeout occurs.
> + */
>  void nand_wait_ready(struct mtd_info *mtd)
>  {
>  	struct nand_chip *chip = mtd->priv;
> -	unsigned long timeo = jiffies + msecs_to_jiffies(20);
> +	unsigned long timeo = jiffies + msecs_to_jiffies(200);
>
>  	/* 400ms timeout */
>  	if (in_interrupt() || oops_in_progress)
> @@ -557,9 +562,12 @@ void nand_wait_ready(struct mtd_info *mtd)
>  	/* Wait until command is processed or timeout occurs */
>  	do {
>  		if (chip->dev_ready(mtd))
> -			break;
> +			goto out;
>  		touch_softlockup_watchdog();
>  	} while (time_before(jiffies, timeo));
> +
> +	pr_warn("timeout while waiting for chip to become ready\n");
> +out:
>  	led_trigger_event(nand_led_trigger, LED_OFF);
>  }
>  EXPORT_SYMBOL_GPL(nand_wait_ready);
> --
> 2.5.0
>
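
The early-timeout race described in the commit message can be illustrated
with a small userspace model. This is only a sketch, not kernel code: struct
sim, sim_tick() and sim_wait_ready() are hypothetical names invented here,
and the model assumes 1 jiffy == 1 ms (HZ=1000), a chip that needs 3 ms to
become ready, and a CPU 0 stall that ends 1 ms after the waiter starts. The
loop mirrors the do/while shape of nand_wait_ready() and open-codes the
wrap-safe comparison that time_before() performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; 1 jiffy == 1 ms of true time is an assumption. */
struct sim {
	unsigned long now_ms;       /* true elapsed time */
	unsigned long jiffies;      /* tick counter as seen by the waiting CPU */
	unsigned long stall_end_ms; /* CPU 0 keeps interrupts off until then */
};

/*
 * Advance true time by 1 ms. While CPU 0 is stalled, jiffies stays stale;
 * once the stall ends, all delayed updates land at once.
 */
static void sim_tick(struct sim *s)
{
	s->now_ms++;
	if (s->now_ms >= s->stall_end_ms)
		s->jiffies = s->now_ms;
}

/*
 * Model of the wait loop: poll until the chip is ready or the jiffies-based
 * deadline passes. Returns true if the chip became ready, and stores the
 * true time spent waiting in *waited_ms.
 */
static bool sim_wait_ready(struct sim *s, unsigned long timeout_jiffies,
			   unsigned long ready_at_ms, unsigned long *waited_ms)
{
	unsigned long start, timeo;
	bool ready = false;

	assert(s != NULL && waited_ms != NULL);
	start = s->now_ms;
	timeo = s->jiffies + timeout_jiffies; /* computed from stale jiffies */

	do {
		if (s->now_ms >= ready_at_ms) {
			ready = true;
			break;
		}
		sim_tick(s);
	} while ((long)(s->jiffies - timeo) < 0); /* like time_before() */

	*waited_ms = s->now_ms - start;
	return ready;
}
```

Starting the model at now_ms = 29 with jiffies still 0 and the stall ending
at 30 ms, a 20-jiffy timeout gives up after 1 ms of true time without the
chip being ready, because jiffies jumps from 0 to 30 in a single step. With
a 200-jiffy timeout the same scenario waits the full 3 ms and sees the chip
ready. As Brian's question implies, the longer timeout only helps while the
jiffies jump stays below 200 ms.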