Message-ID: <20121101162254.03dbbd9a@tlielax.poochiereds.net>
Date:	Thu, 1 Nov 2012 16:22:54 -0400
From:	Jeff Layton <jlayton@...ba.org>
To:	Boaz Harrosh <bharrosh@...asas.com>
Cc:	"Darrick J. Wong" <darrick.wong@...cle.com>, <axboe@...nel.dk>,
	<lucho@...kov.net>, <tytso@....edu>, <sage@...tank.com>,
	<ericvh@...il.com>, <mfasheh@...e.com>, <dedekind1@...il.com>,
	<adrian.hunter@...el.com>, <dhowells@...hat.com>,
	<sfrench@...ba.org>, <jlbec@...lplan.org>, <rminnich@...dia.gov>,
	<linux-cifs@...r.kernel.org>, <jack@...e.cz>,
	<martin.petersen@...cle.com>, <neilb@...e.de>,
	<david@...morbit.com>, <linux-kernel@...r.kernel.org>,
	<linux-mm@...ck.org>, <linux-mtd@...ts.infradead.org>,
	<linux-fsdevel@...r.kernel.org>,
	<v9fs-developer@...ts.sourceforge.net>,
	<ceph-devel@...r.kernel.org>, <linux-ext4@...r.kernel.org>,
	<linux-afs@...ts.infradead.org>, <ocfs2-devel@....oracle.com>
Subject: Re: [PATCH 3/3] fs: Fix remaining filesystems to wait for stable
 page writeback

On Thu, 1 Nov 2012 11:43:26 -0700
Boaz Harrosh <bharrosh@...asas.com> wrote:

> On 11/01/2012 12:58 AM, Darrick J. Wong wrote:
> > Fix up the filesystems that provide their own ->page_mkwrite handlers to
> > provide stable page writes if necessary.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@...cle.com>
> > ---
> >  fs/9p/vfs_file.c |    1 +
> >  fs/afs/write.c   |    4 ++--
> >  fs/ceph/addr.c   |    1 +
> >  fs/cifs/file.c   |    1 +
> >  fs/ocfs2/mmap.c  |    1 +
> >  fs/ubifs/file.c  |    4 ++--
> >  6 files changed, 8 insertions(+), 4 deletions(-)
> > 
> > 
> > diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
> > index c2483e9..aa253f0 100644
> > --- a/fs/9p/vfs_file.c
> > +++ b/fs/9p/vfs_file.c
> > @@ -620,6 +620,7 @@ v9fs_vm_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  	lock_page(page);
> >  	if (page->mapping != inode->i_mapping)
> >  		goto out_unlock;
> > +	wait_on_stable_page_write(page);
> >  
> 
> Good god thanks, yes please ;-)
> 
> >  	return VM_FAULT_LOCKED;
> >  out_unlock:
> > diff --git a/fs/afs/write.c b/fs/afs/write.c
> > index 9aa52d9..39eb2a4 100644
> > --- a/fs/afs/write.c
> > +++ b/fs/afs/write.c
> > @@ -758,7 +758,7 @@ int afs_page_mkwrite(struct vm_area_struct *vma, struct page *page)
> 
> Isn't afs a network filesystem? That means it has its own emulated non-block-device
> BDI, registered internally. So if you do need stable pages, someone should call
> bdi_require_stable_pages().
> 
> But again since it is a network filesystem I don't see how it is needed, and/or it might be
> taken care of already.
> 
> >  #ifdef CONFIG_AFS_FSCACHE
> >  	fscache_wait_on_page_write(vnode->cache, page);
> >  #endif
> > -
> > +	wait_on_stable_page_write(page);
> >  	_leave(" = 0");
> > -	return 0;
> > +	return VM_FAULT_LOCKED;
> >  }
> > diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> 
> CEPH for sure has its own "emulated non-block-device BDI". This one is also
> a pure networking filesystem.
> 
> And it already does what it needs to do with wait_on_writeback().
> 
> So I do not think you should touch CEPH.
> 
> > index 6690269..e9734bf 100644
> > --- a/fs/ceph/addr.c
> > +++ b/fs/ceph/addr.c
> > @@ -1208,6 +1208,7 @@ static int ceph_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  		set_page_dirty(page);
> >  		up_read(&mdsc->snap_rwsem);
> >  		ret = VM_FAULT_LOCKED;
> > +		wait_on_stable_page_write(page);
> >  	} else {
> >  		if (ret == -ENOMEM)
> >  			ret = VM_FAULT_OOM;
> > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> 
> Cifs is also a self-BDI network filesystem, but
> 
> > index edb25b4..a8770bf 100644
> > --- a/fs/cifs/file.c
> > +++ b/fs/cifs/file.c
> > @@ -2997,6 +2997,7 @@ cifs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  	struct page *page = vmf->page;
> >  
> >  	lock_page(page);
> 
> It waits by locking the page; that's the naive way cifs waits for writeback.
> 
> > +	wait_on_stable_page_write(page);
> 
> Instead it could do better and not override page_mkwrite at all; all it needs
> to do is call bdi_require_stable_pages() on its own registered BDI.
> 

Hmm...I don't know...

I've never been crazy about using the page lock for this, but in the
absence of a better way to guarantee stable pages, it was what I ended
up with at the time. cifs_writepages will hold the page lock until
kernel_sendmsg returns. At that point the TCP layer will have copied
off the page data so it's safe to release it.

With this change though, we're going to end up blocking until the
writeback flag clears, right? And I think that will happen when the
reply comes in? So, we'll end up blocking for much longer than is
really necessary in page_mkwrite with this change.
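
To make the difference concrete, here is a rough userspace model of the two
waiting schemes. It is only a sketch, not kernel code: the struct, the enum
and mkwrite_block_time() are hypothetical names made up for illustration, the
times are arbitrary units, and it assumes the writeback flag only clears once
the server's reply arrives (which is my guess above, not something verified).

```c
#include <assert.h>

/* Hypothetical timeline of one page write-out, in arbitrary time units. */
struct writeout_timeline {
	long start;      /* writepages locks the page and marks it under writeback */
	long send_done;  /* kernel_sendmsg() returns; TCP has copied the page data */
	long reply_done; /* server reply arrives; assumed point where writeback clears */
};

enum wait_policy {
	WAIT_PAGE_LOCK, /* page_mkwrite only takes the page lock */
	WAIT_WRITEBACK, /* page_mkwrite also waits for the writeback flag */
};

/*
 * How long a write fault arriving at fault_time would block in page_mkwrite
 * under the given policy. Faults outside the write-out window do not block.
 */
long mkwrite_block_time(const struct writeout_timeline *t,
			enum wait_policy policy, long fault_time)
{
	long release;

	/* The model only makes sense if the reply follows the send. */
	assert(t->start <= t->send_done && t->send_done <= t->reply_done);

	release = (policy == WAIT_PAGE_LOCK) ? t->send_done : t->reply_done;

	if (fault_time < t->start || fault_time >= release)
		return 0;
	return release - fault_time;
}
```

With a timeline of { .start = 0, .send_done = 2, .reply_done = 50 }, a fault
at time 1 blocks for 1 unit under WAIT_PAGE_LOCK but for 49 units under
WAIT_WRITEBACK. The gap is the network round trip plus server processing time,
during which the page data has already been copied off by the TCP layer.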

-- 
Jeff Layton <jlayton@...ba.org>