Message-ID: <20061009211013.GP6485@ca-server1.us.oracle.com>
Date:	Mon, 9 Oct 2006 14:10:13 -0700
From:	Mark Fasheh <mark.fasheh@...cle.com>
To:	Nick Piggin <npiggin@...e.de>
Cc:	Hugh Dickins <hugh@...itas.com>,
	Linux Memory Management <linux-mm@...ck.org>,
	Andrew Morton <akpm@...l.org>, Jes Sorensen <jes@....com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: [patch 2/5] mm: fault vs invalidate/truncate race fix

Hi Nick,

On Mon, Oct 09, 2006 at 06:12:26PM +0200, Nick Piggin wrote:
> Complexity and documentation issues aside, the locking protocol fails
> in the case where we would like to invalidate pagecache inside i_size.
That pretty much describes part of what ocfs2_data_convert_worker() does.
It's called when another node wants to take a lock at an incompatible level
on an inode's data.

This involves up to two steps, depending on the level of the lock requested.

1) It always syncs dirty data.

2) If it's dropping due to writes on another node, then pages will be
   invalidated and mappings torn down.
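
As a rough userspace model of the two steps above (not the ocfs2 code; the enum, struct and function names are made up for illustration), the decision could be sketched as:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock levels, for illustration only; these are not the
 * ocfs2/DLM definitions. */
enum lock_level { LEVEL_NONE, LEVEL_SHARED, LEVEL_EXCLUSIVE };

/* What the local node has to do before it can drop its lock. */
struct convert_actions {
    bool sync_dirty_data;   /* step 1 */
    bool invalidate_pages;  /* step 2: invalidate pages, tear down mappings */
};

/* Assumption: a remote request for an exclusive (write) lock is what
 * "dropping due to writes on another node" corresponds to. */
static struct convert_actions plan_data_convert(enum lock_level remote_request)
{
    struct convert_actions a;

    a.sync_dirty_data = true;  /* always sync dirty data */
    a.invalidate_pages = (remote_request == LEVEL_EXCLUSIVE);
    return a;
}
```

So plan_data_convert(LEVEL_SHARED) asks only for the sync, while plan_data_convert(LEVEL_EXCLUSIVE) asks for the sync plus invalidation.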


There's actually an ocfs2 patch to support shared writeable mappings via
the ->page_mkwrite() callback, but I haven't pushed it upstream due to a bug
I found during some later testing. I believe the bug is a VM issue, and your
description of the race Andrea identified leads me to wonder if you all
might have just found it and fixed it for me :)


In short, I have an MPI test program which rotates through a set of
processes which have mmaped a pre-formatted file. One process writes some
data, the rest verify that they see the new data. When I run multiple
processes on multiple nodes, I will sometimes find that one of the processes
fails because it sees stale data.
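
A single-machine analogue of that check, using fork() in place of MPI, might look like the sketch below. This is not the actual test program: run_round(), REGION_SIZE and the fixed fill pattern are assumptions made for illustration, the parent is the only writer (no rotation), and on a local filesystem it should never report stale data. It only shows the write-then-verify shape.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REGION_SIZE 4096  /* assumed size of the pre-formatted file */

/* Writes `pattern` through a shared mapping, then forks `nverifiers`
 * children which each map the file themselves and check every byte.
 * Returns the number of verifiers that failed, or -1 on setup failure. */
static int run_round(const char *path, int nverifiers, unsigned char pattern)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, REGION_SIZE) != 0) {
        close(fd);
        return -1;
    }

    unsigned char *map = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    memset(map, pattern, REGION_SIZE);  /* the "writer" step */

    int failed = 0;
    for (int i = 0; i < nverifiers; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = -1;
            break;
        }
        if (pid == 0) {
            /* Verifier: set up its own read-only mapping of the file. */
            int cfd = open(path, O_RDONLY);
            if (cfd < 0)
                _exit(2);
            unsigned char *cmap = mmap(NULL, REGION_SIZE, PROT_READ,
                                       MAP_SHARED, cfd, 0);
            if (cmap == MAP_FAILED)
                _exit(2);
            for (size_t off = 0; off < REGION_SIZE; off++)
                if (cmap[off] != pattern)
                    _exit(1);  /* saw stale data */
            _exit(0);
        }

        int status;
        if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0)
            failed++;
    }

    munmap(map, REGION_SIZE);
    return failed;
}
```

Calling run_round() with a path on the filesystem under test, a verifier count and a fill byte gives the number of verifiers that did not see the new data.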


FWIW, the overall approach taken in the patch below seems fine to me, though
I'm no VM expert :)

Not having ocfs2_data_convert_worker() call unmap_mapping_range() directly
is ok as long as the intent of the function is preserved. You seem to be
doing this by having truncate_inode_pages() unmap instead.

Thanks,
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@...cle.com