lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3085432.1628773025@warthog.procyon.org.uk>
Date:   Thu, 12 Aug 2021 13:57:05 +0100
From:   David Howells <dhowells@...hat.com>
To:     Christoph Hellwig <hch@....de>
Cc:     dhowells@...hat.com, willy@...radead.org,
        trond.myklebust@...marydata.com, darrick.wong@...cle.com,
        jlayton@...nel.org, sfrench@...ba.org,
        torvalds@...ux-foundation.org, linux-nfs@...r.kernel.org,
        linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] mm: Make swap_readpage() for SWP_FS_OPS use ->direct_IO() not ->readpage()

Christoph Hellwig <hch@....de> wrote:

> On Thu, Aug 12, 2021 at 12:57:58PM +0100, David Howells wrote:
> > Make swap_readpage(), when accessing a swap file (SWP_FS_OPS) use
> > the ->direct_IO() method on the filesystem rather then ->readpage().
> 
> ->direct_IO is just a helper for ->read_iter and ->write_iter, so please
> don't call it directly.  It actually is slowly on its way out, with at
> at least all of the iomap implementations not using it, as well as various
> other file systems.

[Note that __swap_writepage() uses ->direct_IO().]

Calling ->write_iter is probably a bad idea here.  Imagine that it goes
through, say, generic_file_write_iter(), then __generic_file_write_iter() and
then generic_file_direct_write().  It adds a number of delays into the system,
including:

	- Taking the inode lock
	- Removing file privs
	- Cranking mtime, ctime, file version
	  - Doing mnt_want_write
	  - Setting the inode dirty
	- Waiting on pages in the range that are being written 
	- Walking over the pagecache to invalidate the range
	- Redoing the invalidation (can't be skipped since page 0 is pinned)

that we might want to skip as they'll end up being done for every page swapped
out.

> > +	ki = kzalloc(sizeof(*ki), GFP_KERNEL);
> > +	if (!ki)
> > +		return -ENOMEM;
> 
> for the synchronous case we could avoid this allocation and just use
> arguments on stack.

True.

David

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ