Message-ID: <a5010dcf-a8ce-f144-949c-687548cefce7@amlogic.com>
Date: Wed, 26 Apr 2023 21:51:33 +0800
From: Liang Yang <liang.yang@...ogic.com>
To: Arseniy Krasnov <avkrasnov@...rdevices.ru>,
Miquel Raynal <miquel.raynal@...tlin.com>
Cc: Richard Weinberger <richard@....at>,
Vignesh Raghavendra <vigneshr@...com>,
Neil Armstrong <neil.armstrong@...aro.org>,
Kevin Hilman <khilman@...libre.com>,
Jerome Brunet <jbrunet@...libre.com>,
Martin Blumenstingl <martin.blumenstingl@...glemail.com>,
Jianxin Pan <jianxin.pan@...ogic.com>,
Yixun Lan <yixun.lan@...ogic.com>, oxffffaa@...il.com,
kernel@...rdevices.ru, linux-mtd@...ts.infradead.org,
linux-arm-kernel@...ts.infradead.org,
linux-amlogic@...ts.infradead.org, linux-kernel@...r.kernel.org,
"yonghui.yu" <yonghui.yu@...ogic.com>
Subject: Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

Hi Arseniy,

On 2023/4/20 17:37, Arseniy Krasnov wrote:
> On 19.04.2023 09:41, Arseniy Krasnov wrote:
>>
>>
>> On 19.04.2023 06:05, Liang Yang wrote:
>>> Hi Arseniy,
>>>
>>> On 2023/4/18 22:57, Arseniy Krasnov wrote:
>>>>
>>>> On 18.04.2023 16:25, Miquel Raynal wrote:
>>>>> Hi Arseniy,
>>>>>
>>>>>>>> Hello again @Liang @Miquel!
>>>>>>>>
>>>>>>>> One more question about OOB access, as I can see current driver uses the following
>>>>>>>> callbacks:
>>>>>>>>
>>>>>>>> nand->ecc.write_oob_raw = nand_write_oob_std;
>>>>>>>> nand->ecc.write_oob = nand_write_oob_std;
>>>>>>>>
>>>>>>>>
>>>>>>>> Function 'nand_write_oob_std()' writes data to the end of the page. But as I
>>>>>>>> can see by dumping 'data_buf' during read, physical layout of each page is the
>>>>>>>> following (1KB ECC):
>>>>>>>>
>>>>>>>> 0x000: [ 1 KB of data ]
>>>>>>>> 0x400: [ 2B user data] [ 14B ECC code]
>>>>>>>> 0x410: [ 1 KB of data ] (A)
>>>>>>>> 0x810: [ 2B user data] [ 14B ECC code]
>>>>>>>> 0x820: [ 32B unused ]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> So, after 'nand_write_oob_std()' (let data be sequence from [0x0 ... 0x3f]),
>>>>>>>> page will look like this:
>>>>>>>>
>>>>>>>> 0x000: [ 0xFF ]
>>>>>>>> 0x400: [ ........ ]
>>>>>>>> 0x7f0: [ 0xFF ]
>>>>>>>> 0x800: [ 00 ....................... ]
>>>>>>>> 0x830: [ ........................ 3f ]
>>>>>>>>
>>>>>>>> Here we have two problems:
>>>>>>>> 1) An attempt to display raw data with the 'nanddump' utility produces slightly
>>>>>>>> invalid output, as the driver relies on layout (A) from above, i.e. OOB data
>>>>>>>> is at 0x400 and 0x810. Here is an example (attempt to write 0x11 0x22 0x33 0x44):
>>>>>>>>
>>>>>>>> 0x000007f0: 11 22 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |."..............|
>>>>>>>> OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
>>>>>>>> OOB Data: 33 44 ff ff ff ff ff ff ff ff ff ff ff ff ff ff |3D..............|
>>>>>>>> OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
>>>>>>>> OOB Data: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
>>>>>>>>
>>>>>>> Hi Arseniy,
>>>>>>>
>>>>>>> I realized that write_oob_raw() and write_oob() are wrong in meson_nand.c. I suggest reworking both of them to follow the format of the meson NAND controller, i.e. first format the data in layout (A) and then write it. Reading means first reading the data in layout (A) and then composing layout (B).
>>>>>>
>>>>>> IIUC, after writing only OOB (e.g. user bytes) according to layout (A), the hw will also write ECC codes, so
>>>>>> it will be impossible to write data to this page later, because we cannot update the ECC codes properly for the newly
>>>>>> written data (we can't update bits from 0 to 1).
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> 2) Attempt to read data in ECC mode will fail, because IIUC page is in dirty
>>>>>>>> state (I mean was written at least once) and NAND controller tries to use
>>>>>>>> ECC codes at 0x400 and 0x810, which are obviously broken in this case. Thus
>>>>>>>
>>>>>>> As I said above, write_oob_raw() and write_oob() should be reworked.
>>>>>>> I don't know what you mean by "page was written at least once". Anyway, the page should be written once, even if only by write_oob_raw().
>>>>>>
>>>>>> Sorry, do you mean that after an OOB write, we cannot write to the data area (e.g. 0x0 .. 0x810) until the page is erased? For example,
>>>>>> JFFS2 writes its own markers to OOB, then it tries to write to the data area of such a page.
>>>>
>>>> @Liang, I'll describe the current test case in detail:
>>>> 1) I have an erased page, I can read it in both raw and ECC modes - no problem (it is full of 0xFF).
>>>> 2) I (JFFS2 for example) want to write only OOB - let it be clean markers.
>>>> 3) I use raw write to the needed page (please correct me if I'm wrong). Four bytes
>>>> at 0x400 and 0x810 are updated. All other bytes are still 0xff.
>>>> 4) Now, when I try to read this page in ECC mode, I get ECC errors: IIUC this
>>>> happens because, from the controller's point of view, the ECC codes are invalid for the current
>>>> data (all ECCs are 0xff). Is this behaviour ok?
>>>
>>> Yes, ECC errors are exactly what is reported.
>>
>> I see, so if we write OOB (e.g. using raw mode), there is no way to read this page in ECC mode later? And the
Of course, there are no ECC parity bytes in it. The alternative is to raw
write the data together with the ECC parity bytes, per the layout (A) you
describe above.
>> only way to make it readable is to write it in ECC mode, but before this write, we need to read its
>> user bytes (from the previous OOB write) in raw mode, put them into the info buf (as user bytes) and write this page. In this
>> case the NAND controller will generate ECC codes including the user bytes and the page becomes readable in ECC mode
>> again.
yes, you are right.
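
A minimal sketch of the layout (A) arithmetic discussed above, assuming the
geometry from the quoted dump (two 1 KB ECC steps, 2 user bytes and 14 ECC
bytes per step, 2048+64 byte raw page). The names user_offset(),
gather_user_bytes() and scatter_user_bytes() are hypothetical helpers for
illustration only; they are not functions from meson_nand.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Geometry assumed from the layout (A) dump quoted in this thread. */
enum {
    STEP_DATA = 1024,                              /* data bytes per ECC step   */
    STEP_USER = 2,                                 /* user bytes per ECC step   */
    STEP_ECC  = 14,                                /* parity bytes per ECC step */
    STEP_SIZE = STEP_DATA + STEP_USER + STEP_ECC,  /* 0x410                     */
    NUM_STEPS = 2,                                 /* ECC steps per page        */
    RAW_SIZE  = 2048 + 64                          /* page data plus OOB        */
};

/* Raw offset of the user bytes of one ECC step: 0x400 for step 0, 0x810 for step 1. */
static size_t user_offset(int step)
{
    assert(step >= 0 && step < NUM_STEPS);
    return (size_t)step * STEP_SIZE + STEP_DATA;
}

/* Collect the user bytes of every step from a raw page into one contiguous buffer. */
static void gather_user_bytes(const uint8_t *raw, uint8_t *user)
{
    for (int s = 0; s < NUM_STEPS; s++)
        memcpy(user + s * STEP_USER, raw + user_offset(s), STEP_USER);
}

/* Inverse: place contiguous user bytes at their layout (A) positions in a raw page. */
static void scatter_user_bytes(uint8_t *raw, const uint8_t *user)
{
    for (int s = 0; s < NUM_STEPS; s++)
        memcpy(raw + user_offset(s), user + s * STEP_USER, STEP_USER);
}
```

With a raw buffer of RAW_SIZE bytes filled with 0xff, scattering the four
cleanmarker bytes 0x85 0x19 0x03 0x20 puts 85 19 at 0x400 and 03 20 at 0x810,
which matches the "four bytes at 0x400 and 0x810" observation. The ECC bytes
that follow stay 0xff, which is why a later ECC-mode read fails.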
>>
>>>
>>>> 5) Ok, never mind these ECC errors, let's go further.
>>>> 6) I'm going to write the same page in ECC mode - how to do it correctly? There are already
>>>> 4 OOB bytes, considered to be covered by ECC (but in fact the ECC area is now FFed).
>>>
>>> If step 4 has executed the "program" command on the page (nand_write_oob_std() does), it can't be written again before erasing the page (block). So we have to read the whole page into DDR, change the content, erase the block, and write it again.
>>>
>>> I don't think JFFS2 goes through the same steps (1-6) as you described above. Are you sure that happens on JFFS2, or is it just an example?
>
>
>>
>> I just checked JFFS2 mount/umount again, here is what i see:
>> 0) First attempt to mount JFFS2.
>> 1) It writes OOB to page N (i'm using raw write). It is cleanmarker value 0x85 0x19 0x03 0x20. Mount is done.
>> 2) Umount JFFS2. Done.
>> 3) Second attempt to mount JFFS2.
>> 4) It reads OOB from page N (i'm using raw read). Value is 0x85 0x19 0x03 0x20. Done.
>> 5) It reads page N in ECC mode, and i get:
>> jffs2: mtd->read(0x100 bytes from N) returned ECC error
>> 6) Mount failed.
>>
>> We already had a problem which looks like this on another device. The solution was to use an OOB area which is
>> not covered by ECC for JFFS2 cleanmarkers.
OK, so there are no ECC parity bytes and mtd->read() returns an ECC error.
Does it have to use raw write/read in steps 1) and 4)?
>>
>> Thanks, Arseniy
>>
>
> @Liang,
>
> Small addition: if I implement OOB read/write in ECC mode, then step 5) will succeed,
> but later JFFS2 tries to write this page (in ECC mode of course), and in this case the ECC codes will
> be broken, because we can't update them properly without erasing the whole page.
>
> Please take a look at this patch from my colleagues:
> https://lore.kernel.org/all/20230329114240.378722-1-mmkurbanov@sberdevices.ru
> It is related to the earlier remark about a similar problem on another device:
> in 'f50l1g41lb_ooblayout_free()' we reserve 2 bytes in non-ECC area for bad block markers.
That patch is about SPI NAND, which uses another controller called spifc.
The meson NFC, however, depends heavily on the pre-created BBT in the NAND
device.
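
As a rough illustration of the constraint raised above (ECC codes cannot be
updated without an erase, because programming cannot turn a 0 bit back into
1), here is a small simulation. It only models the bitwise behaviour of NAND
cells; sim_erase(), sim_program() and sim_program_ok() are made-up names and
nothing here talks to real hardware or to the MTD API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Erase sets every bit of the simulated cells to 1. */
static void sim_erase(uint8_t *cells, size_t len)
{
    memset(cells, 0xff, len);
}

/* Program can only clear bits: each cell ends up as (old AND new). */
static void sim_program(uint8_t *cells, const uint8_t *data, size_t len)
{
    assert(cells != NULL && data != NULL);
    for (size_t i = 0; i < len; i++)
        cells[i] &= data[i];
}

/* Program, then report whether the cells hold exactly the requested data. */
static int sim_program_ok(uint8_t *cells, const uint8_t *data, size_t len)
{
    sim_program(cells, data, len);
    return memcmp(cells, data, len) == 0;
}
```

On freshly erased cells the first program succeeds for any value. A second
program with different bytes generally fails the comparison, since bits that
were already cleared stay cleared. The same applies to ECC bytes that the
controller wrote during an earlier ECC-mode OOB write.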
>
> Thanks, Arseniy
>
>
>>>
>>>>
>>>> That's why I asked your opinion about moving OOB data to an area not protected by ECC (and
>>>> leaving the user bytes untouched). In this case OOB access is free and not linked with the ECC
>>>> codes which also cover data.
>>>>
>>>> Thanks, Arseniy
>>>>
>>>>>
>>>>> A page is written after two steps:
>>>>> - loading the data into the NAND chip cache (that's when you use the
>>>>> bus)
>>>>> - programming the NAND array with the data loaded in cache (that's when
>>>>> you wait)
>>>>>
>>>>> In theory it does not matter where you write in the cache, it's regular
>>>>> DRAM, you can make random writes there with the appropriate NAND
>>>>> commands. Of course when using embedded hardware ECC engines, the
>>>>> controllers usually expect to be fed in a certain way in order to
>>>>> produce the ECC bytes and put them at the right location in cache.
>>>>>
>>>>> And then, when you actually send the "program" command, the NAND cells
>>>>> actually get programmed based on what has been loaded in cache.
>>>>
>>>> Thanks for these details! Very interesting!
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Miquèl
>>>>