linux-kernel - Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND Flash Controller


Message-ID: <20181120133624.3fa4742d@xps13>
Date:   Tue, 20 Nov 2018 13:36:24 +0100
From:   Miquel Raynal <miquel.raynal@...tlin.com>
To:     Boris Brezillon <boris.brezillon@...tlin.com>
Cc:     Naga Sureshkumar Relli <nagasure@...inx.com>,
        "richard@....at" <richard@....at>,
        "dwmw2@...radead.org" <dwmw2@...radead.org>,
        "computersforpeace@...il.com" <computersforpeace@...il.com>,
        "marek.vasut@...il.com" <marek.vasut@...il.com>,
        "linux-mtd@...ts.infradead.org" <linux-mtd@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "nagasuresh12@...il.com" <nagasuresh12@...il.com>,
        "robh@...nel.org" <robh@...nel.org>,
        Michal Simek <michals@...inx.com>
Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for
 Arasan NAND Flash Controller

Hi Naga,

Boris Brezillon <boris.brezillon@...tlin.com> wrote on Tue, 20 Nov 2018
12:02:44 +0100:

> On Tue, 20 Nov 2018 07:02:08 +0000
> Naga Sureshkumar Relli <nagasure@...inx.com> wrote:
> 
> 
> > > 
> > > Can you please run nandbiterrs (available in mtd-utils). I fear your
> > > device won't pass the test.    
> > Yes, nandbiterror test is passing till 24bit, after that it is failing.  
> 
> Can you paste the output of nandbiterrs please?

Apparently 'nandbiterrs -i' just crashes the kernel because of a
segmentation fault. Please run this test (from the mtd-utils package)
and fix this issue. Then we would like to see the output.

> 
> > >     
> > > > But we are hitting this because of erased page reading (needed in case of UBIFS).
> > > >    
> > > > >
> > > > > Don't you have a bit (or several bits) reporting when the ECC engine was not able to correct
> > > > > data? If you do, you should base the "detect bitflips in erased pages" logic on this information.
> > > > Bit reporting for several bit errors is there only for Hamming (1-bit correction and 2-bit detection) but not in BCH.
> > > >    
> > > 
> > > Then I tend to agree with Miquel: your ECC engine is broken, and I'm
> > > not even sure how to deal with that yet.    
> > So as per Miquel's suggestion, can I proceed to add the below one?
> > "you should re-read the page in raw mode and check for the number of bitflips manually (thanks to the helpers in the core). Again, if the number of BF is above 16, we can assume the page is bad and increment ->ecc.failed accordingly."  
> 
> But that's just partially fixing the problem. And you didn't answer my
> previous question: what happens when you configure the ECC engine in,
> say 12bit/1024 and you end up with uncorrectable errors (more than 12
> bitflips in a 1k block). What's the number reported in ECC_ERR_CNT? Is it
> set to 13?
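
For reference, the "re-read in raw mode and count bitflips manually" idea
quoted above can be sketched in plain C as follows. This is only a
standalone illustration, not driver code: the function names are
hypothetical, and in the kernel the core helper
nand_check_erased_ecc_chunk() would be the natural thing to call. The
threshold of 16 used in the discussion is passed in as a parameter.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * An erased NAND chunk should read back as all 0xFF, so every bit that
 * reads as 0 is counted as one bitflip.
 * Hypothetical helper name, for illustration only.
 */
static int count_erased_bitflips(const uint8_t *buf, size_t len)
{
	int flips = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t v = (uint8_t)~buf[i];	/* set bits are the flipped ones */

		while (v) {
			v &= (uint8_t)(v - 1);	/* clear the lowest set bit */
			flips++;
		}
	}

	return flips;
}

/*
 * Returns the number of bitflips if the raw chunk can still be treated
 * as erased (flips <= threshold), or -1 if it should be counted as an
 * uncorrectable chunk (the caller would then bump its failure counter).
 */
static int check_erased_chunk(const uint8_t *buf, size_t len, int threshold)
{
	int flips = count_erased_bitflips(buf, len);

	return flips <= threshold ? flips : -1;
}
```

As noted in the quoted reply, this only covers erased pages; it does not
tell you whether a programmed page was corrected properly.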

Please dump this register and, if possible, report the value of the
Packet_bound_Err_count field ([0:7]) for each iteration of nandbiterrs -i.
If there is no way, when the status bit is set, to tell whether the
data is reliable or was not corrected at all, it is going to be a real
issue and I don't think we want to support such an engine.
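
To make the request concrete, here is a minimal sketch of how the dumped
value could be decoded. It assumes the field occupies the low 8 bits of
ECC_ERR_CNT, as the [0:7] notation suggests. The helper names are
hypothetical, and the "count above the configured strength means
uncorrectable" rule is exactly the open question of this thread, not
documented hardware behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: Packet_bound_Err_count sits in bits [7:0] of ECC_ERR_CNT. */
#define PACKET_BOUND_ERR_CNT_MASK	0xFFu

/* Extract the per-packet error count from a raw ECC_ERR_CNT value. */
static unsigned int packet_bound_err_count(uint32_t ecc_err_cnt)
{
	return ecc_err_cnt & PACKET_BOUND_ERR_CNT_MASK;
}

/*
 * HYPOTHETICAL rule: if the engine reported strength + 1 (e.g. 13 in
 * 12bit/1024 mode) on uncorrectable chunks, the driver could detect
 * failures this way. Whether the hardware does so is unknown.
 */
static bool looks_uncorrectable(uint32_t ecc_err_cnt, unsigned int strength)
{
	return packet_bound_err_count(ecc_err_cnt) > strength;
}
```

Printing packet_bound_err_count() after each read during nandbiterrs -i
would show whether the count ever goes past the configured strength.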


Thanks,
Miquèl