linux-kernel - Re: [PATCH v3] mtd: Verify written data in paranoid mode


Message-ID: <87r00ugcat.fsf@bootlin.com>
Date: Mon, 12 May 2025 15:59:22 +0200
From: Miquel Raynal <miquel.raynal@...tlin.com>
To: Csókás Bence <csokas.bence@...lan.hu>
Cc: Richard Weinberger <richard@....at>,  linux-mtd
 <linux-mtd@...ts.infradead.org>,  linux-kernel
 <linux-kernel@...r.kernel.org>,  Vignesh Raghavendra <vigneshr@...com>
Subject: Re: [PATCH v3] mtd: Verify written data in paranoid mode

Hello,

On 12/05/2025 at 15:13:20 +02, Csókás Bence <csokas.bence@...lan.hu> wrote:

> Hi,
>
> On 2025. 05. 12. 14:47, Richard Weinberger wrote:
>> ----- Original Message -----
>>> From: "Csókás Bence" <csokas.bence@...lan.hu>
>>> Well, yes, in our case. But the point is, we have a strict requirement
>>> for data integrity, which is not unique to us I believe. I would think
>>> there are other industrial control applications like ours, which dictate
>>> a high data integrity.
>> In your last patch set you said your hardware has an issue that every
>> now and then data is not properly written.
>> Now you talk about data integrity requirements. I'm confused.
>
> The two problems are not too dissimilar: in one case you have a random,
> and _very_ low chance of data corruption, e.g. because of noise, aging
> hardware, power supply ripple etc. But you still need to make sure that
> the written data is absolutely correct; or if it is not, the system will
> immediately enter some fail-safe mode. This is the problem we want to
> solve, for everybody using Linux in high reliability environments.
>
> The problem we _have_ though happens to be a bit different: here we are
> blursed with a system that corrupts data at a noticeable
> probability. But the model is the same: a stochastic process introducing
> bit errors on write. But I sincerely hope no one else has this problem,
> and this is *not* the primary aim of this patch; it just happens to
> solve our issue as well. But I intend it to be useful for the larger
> Linux community, thus the primary goal is to solve the first issue.

I don't have a strong opinion here, but I don't dislike the idea, because
it might also help with troubleshooting errors. It is very hard to
understand issues that are discovered long after they were generated
(typically during a read, much later than the "faulty" write). Having
this paranoid option would give a more synchronous approach, which is
sometimes easier to work with.
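
As a rough illustration of the write-then-read-back idea discussed in
this thread, here is a minimal standalone sketch. It is not the code
from the patch and does not use any MTD API: struct dev_ops,
write_verified() and the RAM-backed fake_flash device are hypothetical
names made up for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device abstraction for illustration; not an MTD API. */
struct dev_ops {
	int (*write)(void *ctx, size_t off, const unsigned char *buf, size_t len);
	int (*read)(void *ctx, size_t off, unsigned char *buf, size_t len);
};

/*
 * Write buf, then read it back in small chunks and compare.
 * Returns 0 on success, the device's negative errno on I/O failure,
 * or -EIO if the data read back differs from what was written.
 */
static int write_verified(const struct dev_ops *ops, void *ctx, size_t off,
			  const unsigned char *buf, size_t len)
{
	unsigned char tmp[64];
	size_t done = 0;
	int ret;

	ret = ops->write(ctx, off, buf, len);
	if (ret)
		return ret;

	while (done < len) {
		size_t n = len - done < sizeof(tmp) ? len - done : sizeof(tmp);

		ret = ops->read(ctx, off + done, tmp, n);
		if (ret)
			return ret;
		if (memcmp(tmp, buf + done, n) != 0)
			return -EIO;	/* corruption surfaces at write time */
		done += n;
	}
	return 0;
}

/* RAM-backed stand-in for a flash chip that can corrupt one bit per write. */
struct fake_flash {
	unsigned char mem[256];
	int flip_bit;		/* non-zero: flip a bit in the first byte written */
};

static int fake_write(void *ctx, size_t off, const unsigned char *buf, size_t len)
{
	struct fake_flash *f = ctx;

	if (off > sizeof(f->mem) || len > sizeof(f->mem) - off)
		return -EINVAL;
	memcpy(f->mem + off, buf, len);
	if (f->flip_bit && len)
		f->mem[off] ^= 0x01;
	return 0;
}

static int fake_read(void *ctx, size_t off, unsigned char *buf, size_t len)
{
	struct fake_flash *f = ctx;

	if (off > sizeof(f->mem) || len > sizeof(f->mem) - off)
		return -EINVAL;
	memcpy(buf, f->mem + off, len);
	return 0;
}
```

With flip_bit set, write_verified() fails with -EIO straight away
instead of the corruption being noticed on some later read, which is
the synchronous behaviour described above.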

Cheers,
Miquèl