lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230712154142.3c49d782@xps-13>
Date:   Wed, 12 Jul 2023 15:57:40 +0200
From:   Miquel Raynal <miquel.raynal@...tlin.com>
To:     Johan Jonker <jbx6244@...il.com>
Cc:     richard@....at, vigneshr@...com, heiko@...ech.de,
        linux-mtd@...ts.infradead.org, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        linux-rockchip@...ts.infradead.org, yifeng.zhao@...k-chips.com
Subject: Re: [PATCH v3 3/3] mtd: rawnand: rockchip-nand-controller: add
 skipbbt option

Hi Johan,

Richard, a question for you below :-)

jbx6244@...il.com wrote on Fri, 7 Jul 2023 17:27:56 +0200:

> Hi Miquel,
> 
> Some comments/questions below,

There are a lot, I've selected the ones which appear the most relevant
to me right now.

> On 7/4/23 17:13, Miquel Raynal wrote:
> > Hi Johan,
> > 
> > jbx6244@...il.com wrote on Thu, 15 Jun 2023 19:34:26 +0200:
> >   
> >> On Rockchip SoCs the first boot stages are written on NAND  
> 
> >> with help of manufacturer software that uses a different format
> >> then the MTD framework. Skip the automatic BBT scan with the  
> 
> Not only about the boot blocks.
> The rest of the Nand is formatted as well by closed source.
> It formats the location at the new BBT pages as well.
> 
> >> NAND_SKIP_BBTSCAN option so that the original content is unchanged
> >> during the driver probe.
> >> The NAND_NO_BBM_QUIRK option allows us to erase bad blocks with
> >> the nand_erase_nand() function and the flash_erase command.
> >> With these options the user has the "freedom of choice" by neutral
> >> access mode to read and write in whatever format is needed.  
> >   
> 
> > Can the boot_rom_mode thing help?  
> 
> This boot_rom_mode thing is for switching ECC mode in boot block pages and is not related to BBT pages.
> 
> > 
> > For writing this part you can disable the bad block protection using
> > debugfs and then write an externally processed file in raw mode I guess.  
> 
> The use of debugfs only might make sense when the driver can pass the probe function without errors.
> It does however not save guard existing data when it creates and writes to a new BBT page location.
> When the data at the new BBT pages previously is written with a different ECC or data/OOB format the hardware driver probe fails.
> 
> Your suggestion does not work for the Rockchip case.
> Please advise what to do with this patch.
> 
> ===
> 
> [ 2148.026403] calling  rk_nfc_driver_init+0x0/0x1000 [rockchip_nand_controller] @ 2062
> [ 2148.039156] rockchip-nfc 10500000.nand-controller: timing 11e1
> [ 2148.056891] nand: device found, Manufacturer ID: 0xad, Chip ID: 0xde
> [ 2148.067700] nand: Hynix H27UCG8T2ATR-BC 64G 3.3V 8-bit
> [ 2148.076717] nand: 8192 MiB, MLC, erase size: 2048 KiB, page size: 8192, OOB size: 640
> [ 2148.089022] rockchip-nfc 10500000.nand-controller: hynix_nand_init
> [ 2148.099317] rockchip-nfc 10500000.nand-controller: h27ucg8t2atrbc_choose_interface_config
> [ 2148.111779] rockchip-nfc 10500000.nand-controller: timing 11e1
> [ 2148.129836] rockchip-nfc 10500000.nand-controller: rd hw page: 1048575
> [ 2148.149364] rockchip-nfc 10500000.nand-controller: read page: fffff ecc error!
> [ 2148.160742] rockchip-nfc 10500000.nand-controller: rd hw page: 1048319
> [ 2148.180164] rockchip-nfc 10500000.nand-controller: read page: ffeff ecc error!
> [ 2148.191413] rockchip-nfc 10500000.nand-controller: rd hw page: 1048063
> 
> [..]
> 
> Check for existing BBT pages.
> 
> [ 2148.339836] rockchip-nfc 10500000.nand-controller: read page: ffdff ecc error!
> [ 2148.350864] rockchip-nfc 10500000.nand-controller: rd hw page: 1047807
> [ 2148.369835] rockchip-nfc 10500000.nand-controller: read page: ffcff ecc error!
> [ 2148.380597] Bad block table not found for chip 0
> [ 2148.388479] Scanning device for bad blocks
> [ 2148.395591] rockchip-nfc 10500000.nand-controller: rd hw page: 255
> [ 2148.414087] rockchip-nfc 10500000.nand-controller: read page: max_bitflips: 0
> 
> [..]
> 
> Check all pages to create a new BBT.
> 
> [ 2258.693279] rockchip-nfc 10500000.nand-controller: rd hw page: 1030143
> [ 2258.710970] rockchip-nfc 10500000.nand-controller: read page: max_bitflips: 0
> [ 2258.720804] rockchip-nfc 10500000.nand-controller: rd hw page: 1030399
> [ 2258.738394] rockchip-nfc 10500000.nand-controller: read page: fb8ff ecc error!
> [ 2258.748229] Bad eraseblock 4024 at 0x0001f7000000
> 
> [..]
> 
> All BBT pages are marked bad, so it thinks there's no space left
> 
> [ 2261.403708] rockchip-nfc 10500000.nand-controller: rd hw page: 1048575
> [ 2261.422750] rockchip-nfc 10500000.nand-controller: read page: fffff ecc error!
> [ 2261.433634] Bad eraseblock 4095 at 0x0001ffe00000
> 
> [ 2261.441723] No space left to write bad block table
> 
> This is not done.
> A Nand disk should not be altered if it's formatted in a different format unless an explicit command is given.
> 
> [ 2261.449860] nand_bbt: error while writing bad block table -28
> [ 2261.467156] rockchip-nfc 10500000.nand-controller: failed to init NAND chips
> [ 2261.477917] rockchip-nfc: probe of 10500000.nand-controller failed with error -28
> [ 2261.497011] probe of 10500000.nand-controller returned 28 after 113456649 usecs
> [ 2261.508273] initcall rk_nfc_driver_init+0x0/0x1000 [rockchip_nand_controller] returned 0 after 113468040 usecs
> 
> Probe failed, but there nothing wrong with hardware.
> MTDx partitions are not created.
> User space commands that rely strings like "/dev/mtd0" don't work.
> 
> ===
> 
> Application
> Kernel
> Drivers
> Hardware
> 
> Why do we fail a hardware driver probe when the level above doesn't deal with all real world situations.
> Shouldn't that be more separated in MTD levels.
> 
> ===
> 
> > 
> > For the boot I think I proposed a DT property. I don't remember how far
> > the discussion went.  
> 
> Is there a web link of that discussion?

https://lore.kernel.org/linux-mtd/20230609104426.3901df54@xps-13/

Maybe the term "DT property" did not appear but that's what I had in
mind :-) I don't know if it must be a chip or a partition property.

Richard, here is where I would like your feedback, I am kind of opposed
to a "skip_bbt" module parameter. It's not a strong opinion, if you
think it's fine I'm open.

I would rather prefer a DT property which says "do not use the
standard pattern".

Johan, how are bad blocks supposed to be tracked if the bad block
tables are written in a way which does not allow Linux to read it?

> How do I link 'compatible' to '/dev/mtd0' for example.
> In a '/proc/mtd' entry?
> See /drivers/mtd/spi-nor/debugfs.c
> 
> 
> In order to write boot blocks in a pattern it needs to know the chip ID as used in nand_flash_ids[].
> How can we export this ID to userspace?

There was an attempt a few days back, it's not upstream, I can't find
it back ATM. I'll send it once I find it back.

> Also in a '/proc/mtd' entry?
> 
> ===
> 
> 		partitions {
> 			compatible = "fixed-partitions";
> 			#address-cells = <2>;
> 			#size-cells = <2>;
> 
> 			partition@0 {
> 				reg = <0x0 0x0 0x0 0x02000000>;
> 				compatible = "rockchip,boot-blk";
> 				label = "boot-blk";
> 			};
> 
> 			partition@...0000 {
> 				reg = <0x0 0x02000000 0x1 0xfa000000>;
> 				label = "rootfs";
> 			};
> 
> 			partition@...000000 {
> 				reg = <0x1 0xfc000000 0x0 0x04000000>;
> 				compatible = "mtd,bbt"
> 				label = "bbt";

This does not work, it would take me hours to explain why, it's all
about:
- stability
- how mtd has been implemented

> 			};
> 		};
> 
> How many erase blocks is MTD using for BBT pages? 4 or 8 or ?

As many blocks are needed to cover the size of the device, times 2.

> BBT pages are not clearly defined in the DT.

No, because this is software, not hardware. MTD is a Linux thing, which
gives you access to an mtd device through a number of operations. It
handles bad blocks.

> They are somehow marked bad in the overlapping partition.

Not sure what "overlapping partition" means.

> How are suppose to erase BBT pages in a automated way?
> Is there a need for a new BBT compatible?

No, Linux has run with this standard pattern for 20 years, downstream
kernels sometimes make silly design decisions, we do not want to
support them.

> What is your idea about compatibles suggestion:
> rockchip,boot-blk
> mtd,bbt
> 
> ===
> 
> Johan
> 
> >   
> >>
> >> Signed-off-by: Johan Jonker <jbx6244@...il.com>
> >> ---
> >>
> >> Changes V3:
> >>   Change prefixes
> >>
> >> Changed V2:
> >>   reword
> >>
> >> ---
> >>
> >> I'm aware that the maintainer finds it "awful",
> >> but it's absolute necessary to:
> >> 1: read/write boot blocks in user space without touching original content
> >> 2: format a NAND for MTD either with built in or external driver module
> >>
> >> So we keep it include in this serie.
> >> ---
> >>  drivers/mtd/nand/raw/rockchip-nand-controller.c | 7 +++++++
> >>  1 file changed, 7 insertions(+)
> >>
> >> diff --git a/drivers/mtd/nand/raw/rockchip-nand-controller.c b/drivers/mtd/nand/raw/rockchip-nand-controller.c
> >> index 5a0468034..fcda4c760 100644
> >> --- a/drivers/mtd/nand/raw/rockchip-nand-controller.c
> >> +++ b/drivers/mtd/nand/raw/rockchip-nand-controller.c
> >> @@ -188,6 +188,10 @@ struct rk_nfc {
> >>  	unsigned long assigned_cs;
> >>  };
> >>
> >> +static int skipbbt;
> >> +module_param(skipbbt, int, 0644);
> >> +MODULE_PARM_DESC(skipbbt, "Skip BBT scan if data on the NAND chip is not in MTD format.");
> >> +
> >>  static inline struct rk_nfc_nand_chip *rk_nfc_to_rknand(struct nand_chip *chip)
> >>  {
> >>  	return container_of(chip, struct rk_nfc_nand_chip, chip);
> >> @@ -1153,6 +1157,9 @@ static int rk_nfc_nand_chip_init(struct device *dev, struct rk_nfc *nfc,
> >>
> >>  	nand_set_controller_data(chip, nfc);
> >>
> >> +	if (skipbbt)
> >> +		chip->options |= NAND_SKIP_BBTSCAN | NAND_NO_BBM_QUIRK;
> >> +
> >>  	chip->options |= NAND_USES_DMA | NAND_NO_SUBPAGE_WRITE;
> >>  	chip->bbt_options = NAND_BBT_USE_FLASH | NAND_BBT_NO_OOB;
> >>
> >> --
> >> 2.30.2
> >>  
> > 
> > 
> > Thanks,
> > Miquèl  


Thanks,
Miquèl

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ