linux-kernel - Re: ftruncate-mmap: pages are lost after writing to mmaped file.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <200903200334.55710.nickpiggin@yahoo.com.au>
Date:	Fri, 20 Mar 2009 03:34:54 +1100
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Ying Han <yinghan@...gle.com>, Jan Kara <jack@...e.cz>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"linux-kernel" <linux-kernel@...r.kernel.org>,
	"linux-mm" <linux-mm@...ck.org>, guichaz@...il.com,
	Alex Khesin <alexk@...gle.com>,
	Mike Waychison <mikew@...gle.com>,
	Rohit Seth <rohitseth@...gle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: ftruncate-mmap: pages are lost after writing to mmaped file.

On Friday 20 March 2009 03:20:57 Linus Torvalds wrote:
> On Fri, 20 Mar 2009, Nick Piggin wrote:
> > But I think we do have a race in __set_page_dirty_buffers():
> >
> > The page may not have buffers between the mapping->private_lock
> > critical section and the __set_page_dirty call there. So between
> > them, another thread might do a create_empty_buffers which can
> > see !PageDirty and thus it will create clean buffers.
>
> Hmm.
>
> Creating clean buffers is locked by the page lock, nothing else.  And not
> all page dirtiers hold the page lock (in fact, most try to avoid it - the
> rule is that you either have to hold the page lock _or_ hold a reference
> to the 'mapping', and the latter is what the mmap code does, I think).
>
> So yeah, the page lock isn't sufficient.

No. FWIW, I thought there might be a race due to page fault code
not holding page lock over set_page_dirty (well there *are* some kinds
of races, but they're another story). So I tried out my patches that
move that lock over set_page_dirty for __do_fault and do_wp_page
(so the lock is held over pte_mkwrite and set_page_dirty), but that
still didn't solve the problem either.


> > Holding mapping->private_lock over the __set_page_dirty should
> > fix it, although I guess you'd want to release it before calling
> > __mark_inode_dirty so as not to put inode_lock under there. I
> > have a patch for this if it sounds reasonable.
>
> That would seem to make sense. Maybe moving the "TestSetPageDirty()" from
> inside __set_page_dirty() to the caller? Something like the appended?
>
> This is TOTALLY untested. Of course.

Yeah, probably no need to hold private_lock while tagging the radix
tree (which is what my version did). So maybe this one is a little
better. I did test mine, it worked, but it didn't solve the problem.

Still, it does appear to solve a real race, which we should close.

>
> 			Linus
>
> ---
>  fs/buffer.c |   23 +++++++++++------------
>  1 files changed, 11 insertions(+), 12 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 9f69741..891e1c7 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -760,15 +760,9 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode);
>   * If warn is true, then emit a warning if the page is not uptodate and
> has * not been truncated.
>   */
> -static int __set_page_dirty(struct page *page,
> +static void __set_page_dirty(struct page *page,
>  		struct address_space *mapping, int warn)
>  {
> -	if (unlikely(!mapping))
> -		return !TestSetPageDirty(page);
> -
> -	if (TestSetPageDirty(page))
> -		return 0;
> -
>  	spin_lock_irq(&mapping->tree_lock);
>  	if (page->mapping) {	/* Race with truncate? */
>  		WARN_ON_ONCE(warn && !PageUptodate(page));
> @@ -785,8 +779,6 @@ static int __set_page_dirty(struct page *page,
>  	}
>  	spin_unlock_irq(&mapping->tree_lock);
>  	__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> -
> -	return 1;
>  }
>
>  /*
> @@ -816,6 +808,7 @@ static int __set_page_dirty(struct page *page,
>   */
>  int __set_page_dirty_buffers(struct page *page)
>  {
> +	int newly_dirty;
>  	struct address_space *mapping = page_mapping(page);
>
>  	if (unlikely(!mapping))
> @@ -831,9 +824,12 @@ int __set_page_dirty_buffers(struct page *page)
>  			bh = bh->b_this_page;
>  		} while (bh != head);
>  	}
> +	newly_dirty = !TestSetPageDirty(page);
>  	spin_unlock(&mapping->private_lock);
>
> -	return __set_page_dirty(page, mapping, 1);
> +	if (newly_dirty)
> +		__set_page_dirty(page, mapping, 1);
> +	return newly_dirty;
>  }
>  EXPORT_SYMBOL(__set_page_dirty_buffers);
>
> @@ -1262,8 +1258,11 @@ void mark_buffer_dirty(struct buffer_head *bh)
>  			return;
>  	}
>
> -	if (!test_set_buffer_dirty(bh))
> -		__set_page_dirty(bh->b_page, page_mapping(bh->b_page), 0);
> +	if (!test_set_buffer_dirty(bh)) {
> +		struct page *page = bh->b_page;
> +		if (!TestSetPageDirty(page))
> +			__set_page_dirty(page, page_mapping(page), 0);
> +	}
>  }
>
>  /*


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/