lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20141219062405.GA11486@mew>
Date:	Thu, 18 Dec 2014 22:24:05 -0800
From:	Omar Sandoval <osandov@...ndov.com>
To:	Al Viro <viro@...IV.linux.org.uk>
Cc:	Christoph Hellwig <hch@...radead.org>, Jan Kara <jack@...e.cz>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Trond Myklebust <trond.myklebust@...marydata.com>,
	David Sterba <dsterba@...e.cz>, linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org, linux-nfs@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/8] swap: lock i_mutex for swap_writepage direct_IO

On Wed, Dec 17, 2014 at 10:03:13PM +0000, Al Viro wrote:
> On Wed, Dec 17, 2014 at 10:52:56AM -0800, Christoph Hellwig wrote:
> > On Wed, Dec 17, 2014 at 06:58:32AM -0800, Omar Sandoval wrote:
> > > See my previous message. If we use O_DIRECT on the original open, then
> > > filesystems that implement bmap but not direct_IO will no longer work.
> > > These are the ones that I found in my tree:
> > 
> > In the long run I don't think they are worth keeping.  But to keep you
> > out of that discussion you can just try an open without O_DIRECT if the
> > open with the flag failed.
> 
> Umm...  That's one possibility, of course (and if swapon(2) is on someone's
> hotpath, I really would like to see what the hell they are doing - it has
> to be interesting in a sick way).

If this is the approach you'd prefer, I'll go ahead and do that for v2.
I personally think it looks pretty kludgey, but I'm fine either way:

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 63f55cc..c1b3073 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2379,7 +2379,16 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
                name = NULL;
                goto bad_swap;
        }
-       swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0);
+       swap_file = file_open_name(name, O_RDWR | O_LARGEFILE | O_DIRECT, 0);
+       if (IS_ERR(swap_file) && PTR_ERR(swap_file) == -EINVAL)
+               swap_file = file_open_name(name, O_RDWR | O_LARGEFILE, 0);
        if (IS_ERR(swap_file)) {
                error = PTR_ERR(swap_file);
                swap_file = NULL;

> BTW, speaking of read/write vs. swap - what's the story with e.g. AFS
> write() checking IS_SWAPFILE() and failing with -EBUSY?  Note that
> 	* it's done before acquiring i_mutex, so it isn't race-free
> 	* it's dubious from the POSIX POV - EBUSY isn't in the error
> list for write(2).
> 	* other filesystems generally don't have anything of that sort.
> NFS does, but local ones do not...
> Besides, do we even allow swapfiles on AFS?

AFS doesn't implement ->bmap or ->swap_activate, so that code is dead,
probably cargo-culted from the NFS code. It seems pretty pointless, not
only because it's inconsistent with the local filesystems like you
mentioned, but also because it's trivial to bypass with O_DIRECT on NFS:

ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
{
	struct file *file = iocb->ki_filp;
	struct inode *inode = file_inode(file);
	unsigned long written = 0;
	ssize_t result;
	size_t count = iov_iter_count(from);
	loff_t pos = iocb->ki_pos;

	result = nfs_key_timeout_notify(file, inode);
	if (result)
		return result;

	if (file->f_flags & O_DIRECT)
		return nfs_file_direct_write(iocb, from, pos);

	dprintk("NFS: write(%pD2, %zu@%Ld)\n",
		file, count, (long long) pos);

	result = -EBUSY;
	if (IS_SWAPFILE(inode))
		goto out_swapfile;

I think it's safe to scrap that code. However, this also led me to find that
NFS doesn't prevent truncates on an active swapfile. I'm submitting a patch for
that now.

-- 
Omar
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ