lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YRVAvKPn8SjczqrD@casper.infradead.org>
Date:   Thu, 12 Aug 2021 16:39:40 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     David Howells <dhowells@...hat.com>
Cc:     Christoph Hellwig <hch@....de>, trond.myklebust@...marydata.com,
        darrick.wong@...cle.com, jlayton@...nel.org, sfrench@...ba.org,
        torvalds@...ux-foundation.org, linux-nfs@...r.kernel.org,
        linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] mm: Make swap_readpage() for SWP_FS_OPS use
 ->direct_IO() not ->readpage()

On Thu, Aug 12, 2021 at 01:57:05PM +0100, David Howells wrote:
> Christoph Hellwig <hch@....de> wrote:
> 
> > On Thu, Aug 12, 2021 at 12:57:58PM +0100, David Howells wrote:
> > > Make swap_readpage(), when accessing a swap file (SWP_FS_OPS) use
> > > the ->direct_IO() method on the filesystem rather then ->readpage().
> > 
> > ->direct_IO is just a helper for ->read_iter and ->write_iter, so please
> > don't call it directly.  It actually is slowly on its way out, with at
> > at least all of the iomap implementations not using it, as well as various
> > other file systems.
> 
> [Note that __swap_writepage() uses ->direct_IO().]
> 
> Calling ->write_iter is probably a bad idea here.  Imagine that it goes
> through, say, generic_file_write_iter(), then __generic_file_write_iter() and
> then generic_file_direct_write().  It adds a number of delays into the system,
> including:
> 
> 	- Taking the inode lock
> 	- Removing file privs
> 	- Cranking mtime, ctime, file version
> 	  - Doing mnt_want_write
> 	  - Setting the inode dirty
> 	- Waiting on pages in the range that are being written 
> 	- Walking over the pagecache to invalidate the range
> 	- Redoing the invalidation (can't be skipped since page 0 is pinned)
> 
> that we might want to skip as they'll end up being done for every page swapped
> out.

I agree with David; we want something lower-level for swap to call into.
I'd suggest aops->swap_rw and an implementation might well look
something like:

static ssize_t ext4_swap_rw(struct kiocb *iocb, struct iov_iter *iter)
{
	return iomap_dio_rw(iocb, iter, &ext4_iomap_ops, NULL, 0);
}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ