lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2f933d26-541c-06eb-e5d3-336c05f31f1d@opensource.wdc.com>
Date:   Mon, 27 Sep 2021 10:41:04 +0900
From:   Damien Le Moal <damien.lemoal@...nsource.wdc.com>
To:     Dave Chinner <david@...morbit.com>
Cc:     Matthew Wilcox <willy@...radead.org>,
        David Howells <dhowells@...hat.com>, hch@....de,
        trond.myklebust@...marydata.com, Jens Axboe <axboe@...nel.dk>,
        "Darrick J. Wong" <djwong@...nel.org>, linux-block@...r.kernel.org,
        linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-mm@...ck.org, darrick.wong@...cle.com,
        viro@...iv.linux.org.uk, jlayton@...nel.org,
        torvalds@...ux-foundation.org, linux-nfs@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 9/9] mm: Remove swap BIO paths and only use DIO paths

On 2021/09/27 10:25, Dave Chinner wrote:
> On Mon, Sep 27, 2021 at 08:08:53AM +0900, Damien Le Moal wrote:
>> On 2021/09/26 2:09, Matthew Wilcox wrote:
>>> On Sat, Sep 25, 2021 at 04:36:42PM +0100, David Howells wrote:
>>>> Matthew Wilcox <willy@...radead.org> wrote:
>>>>
>>>>> On Fri, Sep 24, 2021 at 06:19:23PM +0100, David Howells wrote:
>>>>>> Delete the BIO-generating swap read/write paths and always use ->swap_rw().
>>>>>> This puts the mapping layer in the filesystem.
>>>>>
>>>>> Is SWP_FS_OPS now unused after this patch?
>>>>
>>>> Ummm.  Interesting question - it's only used in swap_set_page_dirty():
>>>>
>>>> int swap_set_page_dirty(struct page *page)
>>>> {
>>>> 	struct swap_info_struct *sis = page_swap_info(page);
>>>>
>>>> 	if (data_race(sis->flags & SWP_FS_OPS)) {
>>>> 		struct address_space *mapping = sis->swap_file->f_mapping;
>>>>
>>>> 		VM_BUG_ON_PAGE(!PageSwapCache(page), page);
>>>> 		return mapping->a_ops->set_page_dirty(page);
>>>> 	} else {
>>>> 		return __set_page_dirty_no_writeback(page);
>>>> 	}
>>>> }
>>>
>>> I suspect that's no longer necessary.  NFS was the only filesystem
>>> using SWP_FS_OPS and ...
>>>
>>> fs/nfs/file.c:  .set_page_dirty = __set_page_dirty_nobuffers,
>>>
>>> so it's not like NFS does anything special to reserve memory to write
>>> back swap pages.
>>>
>>>>> Also, do we still need ->swap_activate and ->swap_deactivate?
>>>>
>>>> f2fs does quite a lot of work in its ->swap_activate(), as does btrfs.  I'm
>>>> not sure how necessary it is.  cifs looks like it intends to use it, but it's
>>>> not fully implemented yet.  zonefs and nfs do some checking, including hole
>>>> checking in nfs's case.  nfs also does some setting up for the sunrpc
>>>> transport.
>>>>
>>>> btrfs, cifs, f2fs and nfs all supply ->swap_deactivate() to undo the effects
>>>> of the activation.
>>>
>>> Right ... so my question really is, now that we're doing I/O through
>>> aops->direct_IO (or ->swap_rw), do those magic things need to be done?
>>> After all, open(O_DIRECT) doesn't do these same magic things.  They're
>>> really there to allow the direct-to-BIO path to work, and you're removing
>>> that here.
>>
>> For zonefs, ->swap_activate() checks that the user is not trying to use a
>> sequential write only file for swap. Swap cannot work on these files as there
>> are no guarantees that the writes will be sequential.
> 
> iomap_swapfile_activate() is used by ext4, XFS and zonefs. It checks
> there are no holes in the file, no shared extents, no inline
> extents, the swap info block device matches the block device the
> extent is mapped to (i.e. filesystems can have more than one bdev,
> swapfile only supports files on sb->s_bdev), etc.

OK. But I was referring to the additional check in zonefs_swap_activate() before
iomap_swapfile_activate() is called. We must prevent that function from being
called for a full sequential write only zone file since such file will pass all
checks (no hole, all extents written etc) but cannot be used for swap since it
is not writtable when full (no overwrites allowed in sequential zones).

> 
> Also, I noticed, iomap_swapfile_add_extent() filters out extents
> that are smaller than PAGE_SIZE, and aligns larger extents to
> PAGE_SIZE. This allows ensures that when fs block size != PAGE_SIZE
> that only a single IO per page being swapped is required. i.e. the
> DIO path may change the "one page, one bio, one IO" behaviour that
> the current swapfile mapping guarantees.
> 
> Cheers,
> 
> Dave.
> 


-- 
Damien Le Moal
Western Digital Research

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ