linux-kernel - Re: [PATCH] spi-nor: Verify written data in paranoid mode

Message-ID: <c56c52c0-a824-4ad7-9847-e0e973f811ed@prolan.hu>
Date: Wed, 16 Apr 2025 16:44:59 +0200
From: Csókás Bence <csokas.bence@...lan.hu>
To: Richard Weinberger <richard@....at>
CC: Michael Walle <mwalle@...nel.org>, linux-mtd
	<linux-mtd@...ts.infradead.org>, linux-kernel <linux-kernel@...r.kernel.org>,
	Szentendrei, Tamás <szentendrei.tamas@...lan.hu>, "Tudor
 Ambarus" <tudor.ambarus@...aro.org>, pratyush <pratyush@...nel.org>, "Miquel
 Raynal" <miquel.raynal@...tlin.com>, Vignesh Raghavendra <vigneshr@...com>
Subject: Re: [PATCH] spi-nor: Verify written data in paranoid mode

Hi,

On 2025. 04. 16. 15:09, Richard Weinberger wrote:
> ----- Original Message -----
>> From: "Csókás Bence" <csokas.bence@...lan.hu>
>>>> Add MTD_SPI_NOR_PARANOID config option for verifying all written data to
>>>> prevent silent bit errors from going undetected, at the cost of halving SPI
>>>> bandwidth.
>>>
>>> What is the use case for this? Why is it specific to SPI-NOR
>>> flashes? Or should it rather be an MTD "feature". I'm not sure
>>> whether this is the right way to do it, thus I'd love to hear more
>>> about the background story to this.
>>
>> Well, our case is quite specific, but we wanted to provide a general
>> solution for upstream. In our case we have a component in the data path
>> that can cause a burst bit error, on average after about a hundred
>> megabytes written.
> 
> Hmm. So, there is a severe hardware issue you're working around.
> 
>> We _could_ make it MTD-wide, in our case we only have a NOR Flash
>> onboard so this is where we added it. If it were in the MTD core, where
>> would it make sense?
> 
> I'm not so sure whether it makes sense at all.
> In its current form, there is no recovery. So anything non-trivial
> on top of the MTD will just see an -EIO and has to give up.
> E.g. a filesystem will remount read-only.

In our case, we use UBIFS on top of UBI, which in this case chooses 
another eraseblock to hold the data instead, then re-tests (erase+write 
cycles) the one which gave -EIO. Since the bus error is only transient, 
it goes away by this time, and thus UBIFS will recover from this cleanly.
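
To illustrate the recovery flow described above, here is a minimal
user-space sketch. It is not UBI's actual code: struct sim_dev,
block_write(), write_with_relocation() and retest_blocks() are
hypothetical names, and the "device" is just a counter of upcoming
transient faults. It only shows the idea of moving the data to another
eraseblock on -EIO and re-testing the failed one later.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NUM_BLOCKS 4

/* Hypothetical stand-in for a flash device with a flaky data path. */
struct sim_dev {
	int transient_faults;          /* upcoming writes that will fail */
	int in_use[NUM_BLOCKS];        /* block currently holds data */
	int needs_retest[NUM_BLOCKS];  /* block returned -EIO earlier */
};

/* Stand-in for a verified write: -EIO while transient faults remain. */
static int block_write(struct sim_dev *d, int blk)
{
	assert(blk >= 0 && blk < NUM_BLOCKS);
	if (d->transient_faults > 0) {
		d->transient_faults--;
		return -EIO;
	}
	return 0;
}

/*
 * Try free blocks in order. A block that fails with -EIO is queued for
 * re-testing and the next one is tried. Returns the index of the block
 * that took the data, or -EIO if no block could be written.
 */
static int write_with_relocation(struct sim_dev *d)
{
	int blk, ret;

	for (blk = 0; blk < NUM_BLOCKS; blk++) {
		if (d->in_use[blk] || d->needs_retest[blk])
			continue;
		ret = block_write(d, blk);
		if (ret == 0) {
			d->in_use[blk] = 1;
			return blk;
		}
		d->needs_retest[blk] = 1;
	}
	return -EIO;
}

/*
 * Re-test queued blocks (stands in for the erase+write cycles). Blocks
 * that now pass go back to the free pool. Returns how many recovered.
 */
static int retest_blocks(struct sim_dev *d)
{
	int blk, recovered = 0;

	for (blk = 0; blk < NUM_BLOCKS; blk++) {
		if (!d->needs_retest[blk])
			continue;
		if (block_write(d, blk) == 0) {
			d->needs_retest[blk] = 0;
			recovered++;
		}
	}
	return recovered;
}
```

With a single transient fault pending, the first block fails, the data
lands in the second one, and the later re-test returns the first block
to the free pool.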

So yes, it is up to the FS/upper layers to handle the error. If it can't 
recover from this, then yes, it will give up and enter some 'safe mode' 
(e.g. remount ro). But at least it *does* get notified that something
is wrong, and has a chance to react. Previously, it simply assumed that
everything had been written without errors, and the data corruption
would only surface *on the next read*.
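
The difference between the two behaviours can be sketched in plain
user-space C. This is not the patch itself: struct sim_flash,
sim_write(), sim_read() and sim_write_verified() are hypothetical names,
the flash is an in-memory array, and the bus fault is modelled as a
single flipped bit (real NOR programming can only clear bits, which this
sketch ignores).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLASH_SIZE 4096

/* Hypothetical in-memory stand-in for a NOR flash behind a flaky bus. */
struct sim_flash {
	uint8_t cells[FLASH_SIZE];
	int corrupt_next_write;	/* nonzero: next write gets one bit flipped */
};

static void sim_flash_init(struct sim_flash *f)
{
	memset(f->cells, 0xff, sizeof(f->cells));	/* erased state */
	f->corrupt_next_write = 0;
}

/* Raw write: reports success even if the data was corrupted in transit. */
static int sim_write(struct sim_flash *f, size_t off,
		     const uint8_t *buf, size_t len)
{
	assert(off <= FLASH_SIZE && len <= FLASH_SIZE - off);
	memcpy(&f->cells[off], buf, len);
	if (f->corrupt_next_write && len > 0) {
		f->cells[off] ^= 0x01;	/* simulated transient bit error */
		f->corrupt_next_write = 0;
	}
	return 0;
}

static int sim_read(struct sim_flash *f, size_t off,
		    uint8_t *buf, size_t len)
{
	assert(off <= FLASH_SIZE && len <= FLASH_SIZE - off);
	memcpy(buf, &f->cells[off], len);
	return 0;
}

/*
 * "Paranoid" write: write, read back in chunks, compare. A mismatch is
 * reported as -EIO so the caller can react. The read-back is the extra
 * bus traffic that costs bandwidth.
 */
static int sim_write_verified(struct sim_flash *f, size_t off,
			      const uint8_t *buf, size_t len)
{
	uint8_t readback[256];
	size_t done = 0, chunk;
	int ret;

	ret = sim_write(f, off, buf, len);
	if (ret)
		return ret;

	while (done < len) {
		chunk = len - done;
		if (chunk > sizeof(readback))
			chunk = sizeof(readback);
		ret = sim_read(f, off + done, readback, chunk);
		if (ret)
			return ret;
		if (memcmp(readback, buf + done, chunk))
			return -EIO;
		done += chunk;
	}
	return 0;
}
```

With the fault armed, sim_write() still returns 0 and the corruption
would only be noticed on a later read, while sim_write_verified()
returns -EIO right away.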

> Thanks,
> //richard

Bence