Message-ID: <CA+55aFwf3iOwyDDxjKbJ0fs=fy+-8mkoFHdgsEq32nAkerXS4g@mail.gmail.com>
Date:	Wed, 28 Sep 2011 08:47:15 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Arnaud Lacombe <lacombar@...il.com>
Cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 3.1-rc7

On Tue, Sep 27, 2011 at 10:34 PM, Arnaud Lacombe <lacombar@...il.com> wrote:
>
> <off-topic>
> Speaking of corruption, I'm encountering another set on an external
> hard-drive, connected through USB.

I don't think it's unrelated or off-topic.

>     The same corruption pops up (at least in those text files): a
> sequence of 4 bytes is replaced by 0x000000E0 at offset 0x1E4 from the
> start of the file for some of them, and at 0x3E4 for two others (same
> corruption though). Locating the corruption will be trickier in
> binary files.

So it's possible that it's some rogue kernel pointer. We've certainly
had those before. Constant offsets like that happen with some
structure allocation that just happens to be, say, 1kB in size, and the
rogue kernel pointer writes at a fixed offset into something that has
already been freed.
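
To find other affected files, the reported pattern can be searched for
mechanically. The following is a minimal sketch, not a definitive
tool. It assumes the corruption recurs every 0x200 bytes (inferred only
from the two reported offsets, 0x1E4 and 0x3E4), and since the report
does not give the byte order of 0x000000E0, it checks both layouts. The
name find_suspect_offsets is made up for illustration.

```python
import sys

# Assumption: the report gives the value as 0x000000E0 without a byte
# order, so both the big-endian and little-endian layouts are checked.
PATTERNS = (bytes.fromhex("000000e0"), bytes.fromhex("e0000000"))
STRIDE = 0x200  # assumption: 0x1E4 and 0x3E4 differ by 0x200
OFFSET = 0x1E4  # first reported offset from the start of the file


def find_suspect_offsets(data, stride=STRIDE, offset=OFFSET, patterns=PATTERNS):
    """Return offsets (offset, offset + stride, ...) that hold a pattern."""
    hits = []
    for pos in range(offset, len(data) - 3, stride):
        if data[pos:pos + 4] in patterns:
            hits.append(pos)
    return hits


if __name__ == "__main__":
    # Usage: python scan.py FILE [FILE ...]
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            hits = find_suspect_offsets(fh.read())
        if hits:
            print(path, [hex(h) for h in hits])
```

A hit is only a hint: a file can legitimately contain those four bytes
at one of these offsets, so each match still needs to be compared with
a known-good copy.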

You might want to try to compile the kernel with SLUB_DEBUG_ON set,
and possibly also DEBUG_PAGEALLOC.
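
Before rebuilding, it can help to confirm that both options actually
ended up enabled. A small sketch, assuming the usual .config file in
the build tree, where enabled boolean options appear as CONFIG_<NAME>=y;
the helper name missing_options is hypothetical:

```python
import re
import sys

# The two options named above, with the CONFIG_ prefix used in .config.
WANTED = ("CONFIG_SLUB_DEBUG_ON", "CONFIG_DEBUG_PAGEALLOC")


def missing_options(config_text, wanted=WANTED):
    """Return the wanted options that are not set to 'y' in config_text."""
    enabled = set(re.findall(r"^(CONFIG_\w+)=y$", config_text, flags=re.MULTILINE))
    return [opt for opt in wanted if opt not in enabled]


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python check_config.py /path/to/.config
    with open(sys.argv[1]) as fh:
        for opt in missing_options(fh.read()):
            print("not enabled:", opt)
```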

HOWEVER. It's quite possible that it's hardware too.

> I may not trust the drive, but the fact that only known offsets are
> corrupted (in text files), in the exact same way, sounds like too much
> of a coincidence. Anyway, I started a long SMART self test to see if it
> catches anything, as there was no DMA transfer error[0].

It *could* be the disk, but it's much more likely to be something like
memory or a bad cable. Which wouldn't show up with SMART, since that
just tests internal disk issues.
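
Since SMART won't see a memory or cable problem, one way to narrow
things down is to compare a file on the USB drive with a known-good
copy and look at where they differ. A rough sketch, assuming both
copies fit in memory; diff_offsets is a name made up for this example:

```python
import sys


def diff_offsets(good, suspect):
    """Return offsets where two byte strings differ; a length mismatch counts."""
    length = max(len(good), len(suspect))
    return [i for i in range(length) if good[i:i + 1] != suspect[i:i + 1]]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python diff_offsets.py GOOD_COPY SUSPECT_COPY
    with open(sys.argv[1], "rb") as good, open(sys.argv[2], "rb") as suspect:
        for off in diff_offsets(good.read(), suspect.read()):
            print(hex(off))
```

Reading a file back right after writing it may be served from the page
cache rather than from the drive, so re-read after unplugging and
replugging the disk. Repeating the copy several times shows whether the
differing offsets stay fixed or move around.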

Do you get some occasional random SIGSEGVs too?

                        Linus
