Message-ID: <4952bf96840ae5b0caba7b8f472e2b1b@agner.ch>
Date:   Mon, 05 Feb 2018 23:16:57 +0100
From:   stefan@...er.ch
To:     Boris Brezillon <boris.brezillon@...e-electrons.com>,
        shijie.huang@....com
Cc:     han.xu@....com, max.oss.09@...il.com, richard@....at,
        linux-kernel@...r.kernel.org, marek.vasut@...il.com,
        linux-mtd@...ts.infradead.org, cyrille.pitchen@...ev4u.fr,
        dwmw2@...radead.org
Subject: Re: [PATCH] mtd: nand: gpmi: fall back to legacy mode if no ECC
 information present

Hi Boris,

[Also adding Huang]

On 31.01.2018 22:18, stefan@...er.ch wrote:
> I accidentally removed ML/cc before, re-adding.
> 
> On 31.01.2018 10:57, Boris Brezillon wrote:
>> On Wed, 31 Jan 2018 10:19:05 +0100
>> stefan@...er.ch wrote:
>>
>>> On 30.01.2018 14:23, Boris Brezillon wrote:
>>> > Hi Stefan,
>>> >
>>> > On Mon, 29 Jan 2018 15:44:40 +0100
>>> > Stefan Agner <stefan@...er.ch> wrote:
>>> >
>>> >> In case fsl,use-minimum-ecc is set, the driver tries to determine
>>> >> ECC layout by using the ECC information provided by the MTD stack.
>>> >> However, in case the NAND chip does not provide any information,
>>> >> the driver currently fails with:
>>> >>   nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xf1
>>> >>   nand: Macronix NAND 128MiB 3,3V 8-bit
>>> >>   nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>>> >>   gpmi-nand 1806000.gpmi-nand: Error setting BCH geometry : 1
>>> >>   gpmi-nand: probe of 1806000.gpmi-nand failed with error 1
>>> >>
>>> >> Fall back to implementation specific default mode if no ECC
>>> >> information is provided by the NAND chip and fsl,use-minimum-ecc
>>> >> is specified.
>>> >
>>> > Hm, this sounds a bit fragile: if we ever fix the Macronix driver
>>> > (which should be done BTW) to set the appropriate ECC requirements, it
>>> > will break all platforms that were relying on this 'fall back to legacy
>>> > logic'.
>>>
>>> I see. It is just that downstream behaves that way, hence we sell
>>> modules which use minimal ECC on ONFI enabled chips and legacy (maximum
>>> ECC which fits into OOB) on modules with non-ONFI chips.
>>
>> And I guess you use the same DT for both variants of the board :-/
>>
> 
> Actually we only have two SKUs, and they differ also otherwise so I have
> two DTs anyway.
> 
>>>
>>> We have been operating the above Macronix chip with 8-bit ECC for
>>> quite a while.
>>
>> Honestly, I don't see a good solution here except adding an extra DT or
>> live-patching it from the bootloader, because, even if this hack works
>> for you now, it might not in the future.
> 
> Extra DT is fine for Linux.
> 
> The problem is more with U-Boot, where I tried to add minimal ECC
> support via Kconfig symbol and align with Linux behavior. For U-Boot I
> would really prefer to have a single binary for all SKUs...
> 
> I already sent a first patchset
> https://patchwork.ozlabs.org/patch/867180/
> 
> I guess it should be somehow possible to do a board specific selection
> of ECC. But this is a discussion for another thread.
> 
>>
>> In the future, if you plan to have boards with different variants of
>> NANDs, I recommend that you always maximize ECC, this way you won't
>> have this kind of issue.
> 
> Makes sense. Unfortunately, for those products we already ship, changing
> would be rather painful.
> 
>>
>>>
>>> > So, if what you really want is legacy_set_geometry(), don't specify
>>> > "fsl,use-minimum-ecc" in your DT and you should be good. Otherwise, fix
>>> > the Macronix driver to initialize ->ecc_{strength,step_size}_ds
>>> > appropriately.
>>>
>>> The datasheet says:
>>> • High Reliability
>>> - Endurance: 100K cycles (with 1-bit ECC per 528-byte)
>>>
>>> So we would set ecc_strength to 1?
>>
>> If the datasheet says so, then yes, you should have
>> ->ecc_strength_ds = 1 and ->ecc_step_size_ds = 512.
>>
>>> But then there is almost no room for
>>> wear leveling. I remember that I dumped the fixed bits once on such a
>>> chip, and there were several blocks from factory which needed one bit
>>> fixed...
>>
>> Well, that's a different issue. You might want to maximize the ECC
>> strength for your specific board. In this case, you should not specify
>> "fsl,use-minimum-ecc" in your DT, or, if the driver supports it (but I
>> doubt it does), you should add "nand-ecc-maximize". Alternatively, if
>> you want to keep some of the OOB space, you can ask for a specific ECC
>> config with the "nand-ecc-strength" and "nand-ecc-step-size" properties.
> 
> Different issue, but in the end all I care about is whether wear
> leveling works properly.
> 
> The NAND chip documentation also mentions that typical access is per
> page (2K), I guess if one uses a single ECC across the complete page
> then 4-bits are available, which should allow a somewhat decent wear
> leveling.
> 
> I guess we can go with nand-ecc-strength/nand-ecc-step-size for that
> chip for now.

It looks like nand-ecc-strength/nand-ecc-step-size have no effect with
the driver in question. gpmi_nand_init calls:
nand_scan_ident -> nand_dt_init (which fills
chip->ecc.strength/chip->ecc.size)

then

gpmi_init_last -> gpmi_set_geometry -> bch_set_geometry ->
legacy_set_geometry/set_geometry_by_ecc_info

In both cases struct bch_geometry is calculated and then overwrites
ecc.strength/ecc.size without taking the values from the device tree
into account (set_geometry_by_ecc_info does consider
ecc_strength_ds/ecc_step_ds, though).
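
To make the difference between the two paths concrete, here is a rough
standalone sketch of the arithmetic as I understand it. It is not the
driver code: the function names are made up for illustration, and the
10 bytes of metadata, the Galois field length of 13 for 512-byte chunks
and the rounding to an even strength are assumptions based on my reading
of the driver.

```c
#include <stdio.h>

/*
 * Legacy-style path (illustrative): largest even ECC strength whose
 * parity bits still fit into the OOB area after reserving metadata.
 * Returns 0 if the parameters make no sense.
 */
static inline unsigned int max_ecc_strength(unsigned int oob_size,
					    unsigned int page_size,
					    unsigned int chunk_size,
					    unsigned int gf_len,
					    unsigned int metadata_size)
{
	unsigned int chunks, avail_bits;

	if (chunk_size == 0 || gf_len == 0 || page_size < chunk_size)
		return 0;
	if (oob_size <= metadata_size)
		return 0;

	chunks = page_size / chunk_size;
	avail_bits = (oob_size - metadata_size) * 8;

	/* Round down to an even number (assumed hardware constraint). */
	return (avail_bits / (gf_len * chunks)) & ~1u;
}

/*
 * Minimum-ECC path (illustrative): take the strength required by the
 * chip (ecc_strength_ds) and round it up to an even number.
 */
static inline unsigned int min_ecc_strength(unsigned int strength_ds)
{
	return (strength_ds + 1) & ~1u;
}
```

For the chip above (2048-byte page, 64-byte OOB, 512-byte chunks),
max_ecc_strength(64, 2048, 512, 13, 10) gives 8, which matches the
8-bit ECC we use today, while a datasheet requirement of 1 bit would
end up as min_ecc_strength(1) = 2.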

I guess we would have to add a third option in case device tree
specifies strength/size, and validate whether it can be reasonably
fulfilled?

E.g. extend common_nfc_set_geometry:


 int common_nfc_set_geometry(struct gpmi_nand_data *this)
 {
+	struct nand_chip *chip = &this->nand;
+
+	if (chip->ecc.strength > 0 && chip->ecc.size > 0)
+		return set_geometry_by_ecc_dt_info(this);
+
 	if ((of_property_read_bool(this->dev->of_node, "fsl,use-minimum-ecc"))
 				|| legacy_set_geometry(this))
 		return set_geometry_by_ecc_info(this);
 
 	return 0;
 }
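
A minimal sketch of what the validation inside such a
set_geometry_by_ecc_dt_info() could look like, written as a standalone
helper. Note that set_geometry_by_ecc_dt_info does not exist yet; the
helper name ecc_request_fits, the restriction to 512 and 1024 byte
steps and the 10 bytes of metadata are assumptions on my side, and
controller limits on the maximum strength are not checked here.

```c
#include <stdbool.h>

#define METADATA_BYTES 10	/* assumed metadata size in OOB */

/*
 * Hypothetical check: can `strength` correctable bits per `step`-byte
 * chunk be stored in the OOB area of the given page geometry?
 */
static inline bool ecc_request_fits(unsigned int strength,
				    unsigned int step,
				    unsigned int page_size,
				    unsigned int oob_size)
{
	unsigned int gf_len, chunks, parity_bits;

	/* Assumed mapping from step size to Galois field length. */
	switch (step) {
	case 512:
		gf_len = 13;
		break;
	case 1024:
		gf_len = 14;
		break;
	default:
		return false;	/* unsupported step size */
	}

	if (strength == 0 || page_size < step)
		return false;

	chunks = page_size / step;
	parity_bits = strength * gf_len * chunks;

	/* Parity plus metadata must fit into the OOB area. */
	return parity_bits + METADATA_BYTES * 8 <= oob_size * 8;
}
```

With a 2048-byte page and 64 bytes of OOB, a request for 8 bits per
512 bytes would pass this check, while 10 bits per 512 bytes or a
2048-byte step would be rejected.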

--
Stefan


> 
> However, in Linux we should at least fix the device tree bindings
> documentation for "fsl,use-minimum-ecc" then.
> 
> --
> Stefan
