Message-ID: <50A2C60B.1040405@redhat.com>
Date:	Tue, 13 Nov 2012 16:13:31 -0600
From:	Eric Sandeen <sandeen@...hat.com>
To:	linux-ext4@...r.kernel.org
Subject: Re: Weird filesystem corruption from wayland / radeon / chromium

On 11/13/12 12:28 PM, Eric Sandeen wrote:
> On 11/2/12 1:55 PM, Tim Landscheidt wrote:

...

>>> What does
>>
>>> # debugfs -R "dump_extents <274258>" /dev/dm-4
>>
>>> show? (or whatever the appropriate device node path is)
>>
>> See attachment.
> 
> Level Entries       Logical          Physical Length Flags
>  0/ 1   1/  2     0 -  3665 1114157             3666
>  1/ 1   1/ 59     0 -   132  510721 -  510853    133 
>  1/ 1   2/ 59   133 -   139  511415 -  511421      7 
> ...
>  1/ 1  58/ 59  3039 -  3664  573440 -  574065    626 
>  1/ 1  59/ 59  3665 -  4092  574066 -  574493    428 
>  0/ 1   2/  2  3666 -  9217  395702             5552
>  1/ 1   1/307  4093 -  4093  574494 -  574494      1 
>  1/ 1   2/307  4094 -  4095  395758 -  395759      2 
> ...
> 
> Ok, so the first top-level record says it covers logical 0->3665,
> but the last extent actually goes from 3665->4092.
> 
> Then the next top level extent says it covers 3666->9217,
> but that overlaps w/ the last real extent just prior, and
> the first allocated extent under it actually starts at 4093.
> 
> so,
> a) how'd it get into this state, and
> b) why doesn't fsck care ...
> 
> Looking into that . . .

So this is pre-existing corruption somehow; that 2nd 0-level
record's first logical block should match the first logical block
of the first 1st-level extent under it.  I was hoping you had just
run into some sort of extent tree traversal bug when looking
up this block, but I think you have an actual corruption in the
extent tree already.
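
For what it's worth, the consistency rule above can be written down as
a small check.  This is only a rough Python sketch, not what e2fsck or
the kernel actually does: the function name and the (level, start, end)
tuple layout are made up for illustration, it assumes a tree with a
single index level as in this dump, and the rows are just the ones
visible in the output above (the elided "..." rows are left out).

```python
def check_extent_tree(rows):
    """Check index records against the leaf extents listed under them.

    rows: (level, logical_start, logical_end) tuples in dump_extents
    order, where level 0 is an index record and level 1 is a leaf extent.
    Returns a list of human-readable problem descriptions.
    """
    problems = []
    parent = None        # most recent level-0 (index) record
    first_child = True   # is the next leaf the first one under parent?
    for level, start, end in rows:
        if level == 0:
            parent = (start, end)
            first_child = True
            continue
        if parent is None:
            continue  # leaf with no index record above it; nothing to compare
        p_start, p_end = parent
        # The index record should start where its first extent starts.
        if first_child and start != p_start:
            problems.append(
                f"index {p_start}-{p_end}: first extent starts at {start}, "
                f"expected {p_start}"
            )
        first_child = False
        # No extent should run past the range its index record claims.
        if end > p_end:
            problems.append(
                f"index {p_start}-{p_end}: extent {start}-{end} runs past {p_end}"
            )
    return problems


# Only the rows shown in the dump above; elided rows are omitted.
rows = [
    (0, 0, 3665),
    (1, 0, 132),
    (1, 133, 139),
    (1, 3039, 3664),
    (1, 3665, 4092),
    (0, 3666, 9217),
    (1, 4093, 4093),
    (1, 4094, 4095),
]

problems = check_extent_tree(rows)
for p in problems:
    print(p)
```

On these rows it flags both things noted earlier: the extent 3665-4092
running past the end of the first index record, and the second index
record starting at 3666 while its first extent starts at 4093.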

You could work around this by just copying the file and then renaming
the copy back over the original, to get a different (presumably
correct) extent tree.  But it'll be hard to work out how it got into
this state; I don't yet see how this can happen.  :(
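
The copy-and-rename workaround might look roughly like this in Python.
Again just a sketch: the helper name and the ".copy" suffix are made up,
it assumes nothing is writing to the file while it is copied, and it
assumes the copy lands on the same filesystem so the rename is a plain
replace.  Note that shutil.copy2 keeps permissions and timestamps but
not ownership.

```python
import os
import shutil


def rewrite_file(path):
    """Copy a file and rename the copy over the original.

    The copy gets a new inode, so the filesystem allocates a fresh
    extent tree for it instead of reusing the old one.
    """
    tmp = path + ".copy"      # hypothetical temporary name
    shutil.copy2(path, tmp)   # copy data plus mode and timestamps
    os.replace(tmp, path)     # rename the copy over the original
```

It is probably worth keeping the original around under another name
until the copy has been verified, rather than replacing it outright.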

Has your box ever crashed or lost power, and then replayed the
log?  I'm wondering if it's possible that an extent tree
metadata update got lost in a crash . . .

-Eric

