Message-ID: <411b54cd-c23c-765a-9547-0b43bf422546@fb.com>
Date:   Fri, 21 Oct 2016 16:41:09 -0400
From:   Josef Bacik <jbacik@...com>
To:     Chris Mason <clm@...com>, Dave Jones <davej@...emonkey.org.uk>,
        "Andy Lutomirski" <luto@...capital.net>,
        Andy Lutomirski <luto@...nel.org>,
        "Linus Torvalds" <torvalds@...ux-foundation.org>,
        Jens Axboe <axboe@...com>, Al Viro <viro@...iv.linux.org.uk>,
        David Sterba <dsterba@...e.com>,
        linux-btrfs <linux-btrfs@...r.kernel.org>,
        Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: bio linked list corruption.

On 10/21/2016 04:38 PM, Chris Mason wrote:
>
>
> On 10/21/2016 04:23 PM, Dave Jones wrote:
>> On Fri, Oct 21, 2016 at 04:17:48PM -0400, Chris Mason wrote:
>>
>>  > > BTRFS warning (device sda3): csum failed ino 130654 off 0 csum 2566472073 expected csum 3008371513
>>  > > BTRFS warning (device sda3): csum failed ino 131057 off 4096 csum 3563910319 expected csum 738595262
>>  > > BTRFS warning (device sda3): csum failed ino 131176 off 4096 csum 1344477721 expected csum 441864825
>>  > > BTRFS warning (device sda3): csum failed ino 131241 off 245760 csum 3576232181 expected csum 2566472073
>>  > > BTRFS warning (device sda3): csum failed ino 131429 off 0 csum 1494450239 expected csum 2646577722
>>  > > BTRFS warning (device sda3): csum failed ino 131471 off 0 csum 3949539320 expected csum 3828807800
>>  > > BTRFS warning (device sda3): csum failed ino 131471 off 4096 csum 3475108475 expected csum 2566472073
>>  > > BTRFS warning (device sda3): csum failed ino 131471 off 958464 csum 142982740 expected csum 2566472073
>>  > > BTRFS warning (device sda3): csum failed ino 131471 off 0 csum 3949539320 expected csum 3828807800
>>  > > BTRFS warning (device sda3): csum failed ino 131532 off 270336 csum 3138898528 expected csum 2566472073
>>  > > BTRFS warning (device sda3): csum failed ino 131532 off 1249280 csum 2169165042 expected csum 2566472073
>>  > > BTRFS warning (device sda3): csum failed ino 131649 off 16384 csum 2914965650 expected csum 1425742005
>>  > >
>>  > >
>>  > > A curious thing: the expected csum 2566472073 turns up a number of times for different inodes, and gets
>>  > > differing actual csums each time.  I suppose this could be something like a block of all zeros in multiple files,
>>  > > but it struck me as surprising.
>>  > >
>>  > > btrfs people: is there an easy way to map those inodes to a filename ? I'm betting those are the
>>  > > test files that trinity generates. If so, it might point to a race somewhere.
>>  >
>>  > btrfs inspect inode 130654 mntpoint
>>
>> Interesting, they all return
>>
>> ERROR: ino paths ioctl: No such file or directory
>>
>> So these files got deleted perhaps ?
>>
> Yeah, they must have.
>

So one thing that will cause spurious csum errors is if you do things like 
change the memory while it is in flight during O_DIRECT.  Does trinity do that? 
If so then that would explain it.  If not we should probably dig into it.  Thanks,

Josef
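
To make the O_DIRECT point concrete, here is a rough user-space sketch of the
failure mode. It is only a simulation, not the btrfs code path: it assumes the
checksum is a plain CRC-32C taken over a 4096-byte block at submit time, and
the names (crc32c, simulate_direct_write, BLOCK_SIZE) are made up for
illustration. If the buffer changes between the checksum and the device
reading it, the stored csum no longer matches what lands on disk. The main
block also prints the CRC-32C of an all-zero block, so it can be compared by
hand against the 2566472073 that keeps showing up in the log above.

```python
"""Sketch: a buffer modified during a simulated O_DIRECT write breaks its csum."""

CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial
BLOCK_SIZE = 4096         # assumed block size for this illustration


def _make_table():
    """Build the 256-entry lookup table for the reflected CRC-32C."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Standard CRC-32C: initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def simulate_direct_write(buf: bytearray, mutate_in_flight: bool):
    """Return (stored_csum, csum_of_data_on_disk) for one simulated write.

    Hypothetical model: the filesystem checksums the caller's buffer when the
    write is submitted, and the device copies the same buffer afterwards.
    """
    stored_csum = crc32c(bytes(buf))   # checksum taken at submit time
    if mutate_in_flight:
        buf[0] ^= 0xFF                 # user space scribbles on the buffer
    on_disk = bytes(buf)               # what the device actually picks up
    return stored_csum, crc32c(on_disk)


if __name__ == "__main__":
    for mutate in (False, True):
        stored, actual = simulate_direct_write(bytearray(BLOCK_SIZE), mutate)
        state = "ok" if stored == actual else "csum failed"
        print(f"mutate_in_flight={mutate}: stored {stored} on-disk {actual} ({state})")
    print("CRC-32C of an all-zero block:", crc32c(bytes(BLOCK_SIZE)))
```

A real reproducer would need one thread writing with O_DIRECT and another
modifying the same buffer, and the outcome would be timing dependent. The
sketch only shows why the mismatch is reported at read time rather than at
write time.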
