lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAC=U0a3RXScu12LkU+hCv_5Lp_he92ExRFSgqLkwx40D6Xtrag@mail.gmail.com>
Date:   Thu, 11 Jun 2020 12:04:29 -0400
From:   Kamal Dasu <kdasu.kdev@...il.com>
To:     Miquel Raynal <miquel.raynal@...tlin.com>
Cc:     Brian Norris <computersforpeace@...il.com>,
        Richard Weinberger <richard@....at>,
        Vignesh Raghavendra <vigneshr@...com>,
        MTD Maling List <linux-mtd@...ts.infradead.org>,
        bcm-kernel-feedback-list <bcm-kernel-feedback-list@...adcom.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] mtd: rawnand: brcmnand: Ecc error handling on EDU transfers

On Thu, Jun 11, 2020 at 3:27 AM Miquel Raynal <miquel.raynal@...tlin.com> wrote:
>
> Hi Kamal,
>
> Kamal Dasu <kdasu.kdev@...il.com> wrote on Thu, 11 Jun 2020 01:44:54
> -0400:
>
> > Implemented ECC correctable and uncorrectable error handling for EDU
>
> Implement?
>
> > reads. If ECC correctable bitflips are encountered  on EDU transfer,
>
> extra space                                         ^
>
> > read page again using pio, This is needed due to a controller lmitation
>
> s/pio/PIO/
>
> > where read and corrected data is not transferred to the DMA buffer on ECC
> > errors. This holds true for ECC correctable errors beyond set threshold.
>
> error.
>
> Not sure what the last sentence means?
>

NAND controller allows for setting a correctable  ECC threshold number
of bits beyond which it will actually report the error to the driver.
e.g. for BCH-4 the threshold is 3, so 3-bit and 4-bit errors will
generate correctable ECC interrupt however 1-bit and 2-bit errors will
be corrected silently.
>From the above example EDU hardware will not transfer corrected data
to the DMA buffer for 3-bit and 4-bit errors that get reported. So
once we detect
the error duing EDU we read the page again using pio.

> >
> > Fixes: a5d53ad26a8b ("mtd: rawnand: brcmnand: Add support for flash-edu for dma transfers")
> > Signed-off-by: Kamal Dasu <kdasu.kdev@...il.com>
> > ---
>
> Minor nits below :)
>
> >  drivers/mtd/nand/raw/brcmnand/brcmnand.c | 26 ++++++++++++++++++++++++
> >  1 file changed, 26 insertions(+)
> >
> > diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > index 0c1d6e543586..d7daa83c8a58 100644
> > --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
> > @@ -1855,6 +1855,22 @@ static int brcmnand_edu_trans(struct brcmnand_host *host, u64 addr, u32 *buf,
> >       edu_writel(ctrl, EDU_STOP, 0); /* force stop */
> >       edu_readl(ctrl, EDU_STOP);
> >
> > +     if (ret == 0 && edu_cmd == EDU_CMD_READ) {
>
> !ret
>
> > +             u64 err_addr = 0;
> > +
> > +             /*
> > +              * check for ecc errors here, subpage ecc erros are
> > +              * retained in ecc error addr register
>
> s/ecc/ECC/
> s/erros/errors/
> s/addr/address/
>
> > +              */
> > +             err_addr = brcmnand_get_uncorrecc_addr(ctrl);
> > +             if (!err_addr) {
> > +                     err_addr = brcmnand_get_correcc_addr(ctrl);
> > +                     if (err_addr)
> > +                             ret = -EUCLEAN;
> > +             } else
> > +                     ret = -EBADMSG;
>
> I don't like very much to see these values being used within NAND
> controller drivers but I see it's already the cause, so I guess I can
> live with that.
>
> > +     }
> > +
> >       return ret;
> >  }
> >
> > @@ -2058,6 +2074,7 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip,
> >       u64 err_addr = 0;
> >       int err;
> >       bool retry = true;
> > +     bool edu_read = false;
> >
> >       dev_dbg(ctrl->dev, "read %llx -> %p\n", (unsigned long long)addr, buf);
> >
> > @@ -2075,6 +2092,10 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip,
> >                       else
> >                               return -EIO;
> >               }
> > +
> > +             if (has_edu(ctrl))
> > +                     edu_read = true;
>
> You don't need this extra value, you already have the cmd parameter
> which tells you if it is a read or a write. You might even want to
> create a if block so set dir and edu_cmd and eventually a local
> edu_read if you think it still makes sense.
>

I needed the value since dma and edu read has multiple conditions like
oob is not included, buffer is aligned, virtual address is good. This
indicates to
the if (mtd_is_bitflip(err))  block that the error was from an edu
transaction that happened.This way all ecc error handling for dma,
edu, pio is in one place.
Also there is more controller version specific logic for read error
handling in there and this allows us to maintain the hierarchy how we
handle both correctable
and uncorrectable error.

> > +
> >       } else {
> >               if (oob)
> >                       memset(oob, 0x99, mtd->oobsize);
> > @@ -2122,6 +2143,11 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip,
> >       if (mtd_is_bitflip(err)) {
> >               unsigned int corrected = brcmnand_count_corrected(ctrl);
> >
> > +             /* in case of edu correctable error we read again using pio */
>
> s/edu/EDU/ ?
> s/pio/PIO/
>
> > +             if (edu_read)
> > +                     err = brcmnand_read_by_pio(mtd, chip, addr, trans, buf,
> > +                                                oob, &err_addr);
> > +
> >               dev_dbg(ctrl->dev, "corrected error at 0x%llx\n",
> >                       (unsigned long long)err_addr);
> >               mtd->ecc_stats.corrected += corrected;
>

Will fix all the other typos.

> Thanks,
> Miquèl

Thanks
Kamal

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ