linux-kernel - Re: mkfs.ext2 triggerd RAM corruption

Message-ID: <20070505013819.GB23803@lanczos.q-leap.de>
Date:	Sat, 5 May 2007 03:38:19 +0200
From:	Bernd Schubert <bs@...eap.de>
To:	linux-kernel@...r.kernel.org
Cc:	bernd-schubert@....de
Subject: Re: mkfs.ext2 triggerd RAM corruption

Jan-Benedict Glaw wrote:

> On Fri, 2007-05-04 16:59:51 +0200, Bernd Schubert <bs@...eap.de>
> wrote:
>> To see what's going on, I copied the entire / (so the initrd) into a
>> tmpfs root, chrooted into it, also bind mounted the main / into this
>> chroot, and compared the chroot's /bin with the bind-mounted /bin
>> several times while the mkfs.ext2 command was running.
>> 
>> beo-05:/# diff -r /bin /oldroot/bin/
>> beo-05:/# diff -r /bin /oldroot/bin/
>> beo-05:/# diff -r /bin /oldroot/bin/
>> Binary files /bin/sleep and /oldroot/bin/sleep differ
>> beo-05:/# diff -r /bin /oldroot/bin/
>> Binary files /bin/bsd-csh and /oldroot/bin/bsd-csh differ
>> Binary files /bin/cat and /oldroot/bin/cat differ
>> ...
>> 
>> Also tested different schedulers; it happens at least with deadline
>> and anticipatory.
>>
>> The corruption does NOT happen when running the mkfs command on
>> /dev/sda1, but happens with sda2, sda3 and sda3. Also doesn't happen
>> with extended partitions of sda1.
> 
> Is sda2 the largest filesystem out of sda2, sda3 (and the logical
> partitions within the extended sda1, if these get mkfs'ed, too)?

I tested it this way:

- test on sda1, no further partitions
- test on sda2, sda1: ~2MB, everything else for sda2
- test on sda3, sda1: ~2MB, sda2: ~2MB, everything else for sda3
...
- test on sda5, sda1: the partition that holds the extended partition,
  everything in sda5
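
For anyone who wants to repeat the comparison quoted above, it can be
sketched as a small loop. This is only an illustration: the function
name watch_trees and its PASSES/DELAY parameters are made up here and
are not part of the original test, which was simply diff -r run by hand.

```shell
#!/bin/sh
# Hypothetical helper (not from the original test setup): compare two
# directory trees repeatedly and stop at the first pass where they differ.
# Usage: watch_trees REF_DIR LIVE_DIR [PASSES] [DELAY_SECONDS]
watch_trees() {
    ref=$1
    live=$2
    passes=${3:-10}   # assumed default: 10 comparison passes
    delay=${4:-1}     # assumed default: 1 second between passes
    i=1
    while [ "$i" -le "$passes" ]; do
        # diff -r exits 0 when the trees are identical and non-zero
        # when they differ (or when diff itself ran into trouble).
        if ! diff -r "$ref" "$live"; then
            echo "pass $i: trees differ"
            return 1
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "no differences in $passes passes"
    return 0
}
```

Inside the tmpfs chroot this would be called as, e.g.,
"watch_trees /bin /oldroot/bin 100 1" while mkfs.ext2 runs in another
shell.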

> 
> I'm not too sure that this is a kernel bug, but probably a bad RAM
> chip. Did you run memtest86 for a while? ...and can you reproduce this
> problem on different machines?

Reproducible on 4 test systems (2 with identical hardware, so 2 + 1 + 1,
i.e. three entirely different hardware combinations), all with ECC memory
that is monitored by EDAC. Memory, CPU, etc. have already been
stress-tested under real-life load with several applications, e.g. linpack.
Though I don't entirely agree, my colleagues in this group always tell me
that their real-life stress tests show more memory corruption than
memtest does. As soon as I have physical access again, I can also do a
memtest86 run (I would like to do it over the weekend, but don't know how
to convince stupid rembo to boot memtest).
Anyway, memory corruption is more than unlikely on these systems, for
several reasons.


Thanks,
Bernd