linux-kernel - Re: bio linked list corruption.

Message-ID: <20161027172332.krdsrc5rlivq4mrv@codemonkey.org.uk>
Date:   Thu, 27 Oct 2016 13:23:32 -0400
From:   Dave Jones <davej@...emonkey.org.uk>
To:     Dave Chinner <david@...morbit.com>
Cc:     Chris Mason <clm@...com>, Andy Lutomirski <luto@...capital.net>,
        Andy Lutomirski <luto@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Jens Axboe <axboe@...com>, Al Viro <viro@...iv.linux.org.uk>,
        Josef Bacik <jbacik@...com>, David Sterba <dsterba@...e.com>,
        linux-btrfs <linux-btrfs@...r.kernel.org>,
        Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: bio linked list corruption.

On Thu, Oct 27, 2016 at 04:41:33PM +1100, Dave Chinner wrote:
 
 > And that's indicative of a delalloc metadata reservation being
 > too small and so we're allocating unreserved blocks.
 > 
 > Different symptoms, same underlying cause, I think.
 > 
 > I see the latter assert from time to time in my testing, but it's
 > not common (maybe once a month) and I've never been able to track it
 > down.  However, it doesn't affect production systems unless they hit
 > ENOSPC hard enough that it causes the critical reserve pool to be
 > exhausted and so the allocation fails. That's extremely rare -
 > usually takes several hundred processes all trying to write as hard
 > as they can concurrently and to all slip through the ENOSPC
 > detection without the correct metadata reservations and all require
 > multiple metadata blocks to be allocated during writeback...
 > 
 > If you've got a way to trigger it quickly and reliably, that would
 > be helpful...

Seems pretty quickly reproducible for me in some shape or form.
Run trinity with --enable-fds=testfile and create enough children
to generate a fair bit of contention (for me -C64 seems a good fit on
spinning rust, but if you're running on shiny NVMe you might have to
pump it up a bit).

	Dave
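
The reproducer described above can be sketched as a small helper that
assembles the trinity command line. This is only an illustration: the
function name and its parameters are hypothetical, and the only flags
taken from the message are --enable-fds=testfile and -C64. It builds the
command but does not run it, since trinity is a syscall fuzzer and is
best launched by hand on a disposable test machine.

```python
import shlex


def trinity_command(children=64, fd_provider="testfile"):
    """Build the argv for the trinity run described in the message.

    Hypothetical helper: only --enable-fds=testfile and -C64 come from
    the original message. Raise `children` on faster storage.
    """
    if children < 1:
        raise ValueError("children must be >= 1")
    return ["trinity", f"--enable-fds={fd_provider}", f"-C{children}"]


if __name__ == "__main__":
    # Value reported as a good fit on spinning disks.
    print(shlex.join(trinity_command(64)))
    # An arbitrary larger value for faster devices; tune as needed.
    print(shlex.join(trinity_command(256)))
```

The child count is left as a parameter because the message gives 64 only
as a starting point for spinning disks; how far it needs to be raised on
NVMe is not stated.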