linux-kernel - Re: [PATCH 0/3] mm: Swap checksum

Date:	Mon, 24 May 2010 07:50:31 -0300
From:	Cesar Eduardo Barros <cesarb@...arb.net>
To:	Minchan Kim <minchan.kim@...il.com>
CC:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH 0/3] mm: Swap checksum

On 23-05-2010 23:05, Minchan Kim wrote:
> On Mon, May 24, 2010 at 9:57 AM, Cesar Eduardo Barros<cesarb@...arb.net>  wrote:
>> The internal ECC of the disk will not save you - a quick Google search found
>> an instance of someone with silent data corruption caused by a faulty *power
>> supply*.[1]
>>
>> And if it is silent corruption, without the checksums you will not notice it
>> - it will just be dismissed as "oh, Firefox just crashed again" or similar
>> (the same as bit flips on RAM without ECC).
>
> When I read your comment, suddenly some thought occurred to me.
> If we can't believe ECC of the disk, why do we separate error
> detection logic between file system and swap disk?
>
> I mean it make sense that put crc detection into block layer?
> It can make sure any block I/O.

There are differences as to where the checksum will be stored.

For the filesystem (and for the software suspend image), you have to 
store the checksums in the disk itself, and the correct place (and 
ordering requirements) depends on the filesystem. Also, most filesystems 
do not currently have on-disk checksums.

For the swap, it is much simpler; the checksums can be stored in memory 
(since they do not matter after a reboot; the swap contents are simply 
discarded). This also gives better performance, since the checksums do 
not have to be separately written, and more flexibility, since the 
kernel can use whatever kind of checksum it wants and can store the 
checksum in whatever data structure it chooses, without worrying about 
compatibility.
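
As a rough illustration of the idea (a user-space toy, not the code from 
the patch), the checksums can live in a plain in-memory table indexed by 
swap slot. The names SwapArea, swap_out and swap_in are made up for this 
sketch, and zlib.crc32 (plain CRC-32) is only a stand-in for whatever 
checksum the kernel would actually use:

```python
import zlib

PAGE_SIZE = 4096  # assumed page size for this sketch


class SwapArea:
    """Toy model of a swap device plus an in-memory checksum table."""

    def __init__(self, nr_slots):
        # Stands in for the swap partition on disk.
        self.disk = [bytes(PAGE_SIZE)] * nr_slots
        # Lives only in RAM; it is simply lost on reboot, like the swap contents.
        self.checksums = [None] * nr_slots

    def swap_out(self, slot, page):
        if len(page) != PAGE_SIZE:
            raise ValueError("page must be exactly PAGE_SIZE bytes")
        # The checksum is recorded in memory; nothing extra is written to disk.
        self.checksums[slot] = zlib.crc32(page)
        self.disk[slot] = bytes(page)

    def swap_in(self, slot):
        expected = self.checksums[slot]
        if expected is None:
            raise ValueError("slot %d was never swapped out" % slot)
        page = self.disk[slot]
        # Verify on read: a mismatch means the page changed while on disk.
        if zlib.crc32(page) != expected:
            raise OSError("checksum mismatch on swap slot %d" % slot)
        return page
```

Because the table never reaches the disk, its layout and the checksum 
algorithm could be changed at will without any on-disk format concerns.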

And, in fact, there is already CRC code in the block layer; it is 
CONFIG_BLK_DEV_INTEGRITY. However, it is not a generic solution; it 
has extra prerequisites (like a disk low-level formatted with 
sectors larger than 512 bytes).

> And what's BER of disk?
> Is it usual to meet the problem?

It is unusual enough that most people who meet it will not notice.

And the filesystem developers (who understand these things better 
than I do) seem to be trending towards adding checksums to their 
filesystems. The point of this patch is to meet the same level of safety 
as btrfs (thus the choice of crc32c, which is what btrfs uses).
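
For reference, crc32c is the CRC-32 variant using the Castagnoli 
polynomial. A minimal table-driven sketch follows; it uses the 
conventional all-ones initial value and final inversion, which is an 
assumption about the usual definition and not a statement of how the 
patch or btrfs seeds the CRC. The helper names are made up for the 
example:

```python
# Castagnoli polynomial in reflected (LSB-first) form.
CRC32C_POLY_REFLECTED = 0x82F63B78


def _make_table():
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data, crc=0):
    """Return the CRC-32C of data; pass a previous result as crc to continue."""
    crc ^= 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


# Standard check value for the ASCII string "123456789".
assert crc32c(b"123456789") == 0xE3069283
```

A single flipped bit in a page always changes this value, which is what 
lets a corrupted swap page be noticed on read.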

> In normal desktop, some app killed are not critical. If the
> application is critical, maybe app have to logic fault handling.
> Firefox has session restore feature and Office program has temporal
> save feature.

In fact, crashing is the "best" outcome here. The worst outcome is your 
application silently corrupting the data you then save to disk.

> On the other hand, in server, does it designed well to use swap disk
> until we meet bit error of disk?

Servers are the ones which would benefit the most, as their RAM is 
usually very reliable (they tend to use ECC memory). Their disk 
subsystems, however, are also more reliable.

Desktop systems (especially "no-name" brand ones) do not gain as much, 
since their RAM is usually unprotected; however, they are also the ones 
most likely to have low-quality power supplies, uncommon 
storage media (a USB flash drive as the root disk is one example), and 
problematic I/O subsystems.

> My feel is that it seem to be rather overkill.

Yes, it is a bit overkill, except when you are using software suspend. 
While the software suspend image is not protected by this patch (I am 
already thinking of a separate patch to add checksums to it), the 
swapped out pages are (software suspend uses both a memory image saved 
to the swap partition and the normal swapped out pages).

But you do not have to use it if you think it is overkill - I even added 
a kernel parameter to easily disable it.

>> The swap checksum only protects the page against being silently corrupted
>> while on the disk and at least to some degree on the I/O path between the
>> memory and the disk. It does not protect against broken kernel-mode code
>> writing to the wrong address, nor against broken hardware (or hardware
>> misconfigured by broken drivers) doing DMA to wrong addresses. It also does
>> not protect against hardware errors in the RAM itself (you have ECC memory
>> for that).
>>
>> That is, the code assumes the memory containing the checksums will not be
>> corrupted, because if it is, you have worse problems (and the CRC error here
>> would be a *good* thing, since it would make you notice something is not
>> quite right).
>>
>
> Which is high between BER of RAM and disk?
> It's a just question. :)

I have no idea.

However, we can do little in software against RAM errors; any such check 
would kill performance. Against disk errors, however, we can do a lot 
(software RAID-1 is just one of the simplest examples).

-- 
Cesar Eduardo Barros
cesarb@...arb.net
cesar.barros@...il.com