linux-kernel - Re: [PATCH 07/12] xfs: don't preallocate a transaction for file size updates

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190627222324.GH7777@dread.disaster.area>
Date:   Fri, 28 Jun 2019 08:23:24 +1000
From:   Dave Chinner <david@...morbit.com>
To:     Christoph Hellwig <hch@....de>
Cc:     "Darrick J. Wong" <darrick.wong@...cle.com>,
        Damien Le Moal <Damien.LeMoal@....com>,
        Andreas Gruenbacher <agruenba@...hat.com>,
        linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 07/12] xfs: don't preallocate a transaction for file size
 updates

On Tue, Jun 25, 2019 at 12:25:07PM +0200, Christoph Hellwig wrote:
> On Tue, Jun 25, 2019 at 09:15:23AM +1000, Dave Chinner wrote:
> > > So, uh, how much of a hit do we take for having to allocate a
> > > transaction for a file size extension?  Particularly since we can
> > > combine those things now?
> > 
> > Unless we are out of log space, the transaction allocation and free
> > should be largely uncontended and so it's just a small amount of CPU
> > usage. i.e it's a slab allocation/free and then lockless space
> > reservation/free. If we are out of log space, then we sleep waiting
> > for space - the issue really comes down to where it is better to
> > sleep in that case....
> 
> I see the general point, but we'll still have the same issue with
> unwritten extent conversion and cow completions, and I don't remember
> seeing any issue in that regard.

These are realtively rare for small file workloads - I'm really
talking about the effect of delalloc and how we've optimised
allocation during writeback to merge small, cross-file writeback
into much larger large physical IOs. Unwritten extents nor COW are
used in these (common) cases, and if they are then the allocation
patterns prevent the cross-file IO merging in the block layer and so
we don't get the "hundred ioends for a hundred inodes from a single
a physical IO completion" thundering heard problem....

> And we'd hit exactly that case
> with random writes to preallocated or COW files, i.e. the typical image
> file workload.

I do see a noticable amount of IO completion overhead in the host
when hitting unwritten extents in VM image workloads. I'll see if I
can track the number of kworkers we're stalling in under some of
these workloads, but I think it's still largely bound by the request
queue depth of the IO stack inside the VM because there is no IO
merging in these cases.

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com