linux-kernel - Re: [RFC][PATCH] cifs: Fix cifs_writepages

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <2385089.1677258941@warthog.procyon.org.uk>
Date:   Fri, 24 Feb 2023 17:15:41 +0000
From:   David Howells <dhowells@...hat.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     dhowells@...hat.com,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Steve French <stfrench@...rosoft.com>,
        Vishal Moola <vishal.moola@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Jan Kara <jack@...e.cz>, Paulo Alcantara <pc@....nz>,
        Huang Ying <ying.huang@...el.com>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Xin Hao <xhao@...ux.alibaba.com>, linux-mm@...ck.org,
        mm-commits@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH] cifs: Fix cifs_writepages_region()

Matthew Wilcox <willy@...radead.org> wrote:

> Why are you doing it this way?  What's wrong with using
> write_cache_pages() to push all the contiguous dirty folios into a single
> I/O object, submitting it when the folios turn out not to be contiguous,
> or when we run out of a batch?
> 
> You've written an awful lot of code here and it's a different model from
> every other filesystem.  Why is it better?

Because write_cache_pages():

 (1) Takes no account of fscache.  I can't just build knowledge of PG_fscache
     into it because PG_private_2 may be used differently by other filesystems
     (btrfs, for example).  (I'm also trying to phase out the use of
     PG_private_2 and instead uses PG_writeback to cover both and the
     difference will be recorded elsewhere - but that's not there yet).

 (2) Calls ->writepage() individually for each folio - which is excessive.  In
     AFS's implementation, we locate the first folio, then race through the
     following folios without ever waiting until we hit something that's
     locked or a gap and then stop and submit.

     write_cache_pages(), otoh, calls us with the next folio already undirtied
     and set for writeback when we find out that we don't want it yet.

 (3) Skips over holes, but at some point in the future we're going to need to
     schedule adjacent clean pages (before and after) for writeback too to
     handle transport compression and fscache updates if the granule size for
     either is larger than the folio size.

It might be better to take what's in cifs, generalise it and replace
write_cache_pages() with it, then have a "->submit_write()" aop that takes an
ITER_XARRAY iterator to write from.

David