Date:	Mon, 4 Jan 2010 19:13:35 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"minchan.kim@...il.com" <minchan.kim@...il.com>,
	cl@...ux-foundation.org,
	"hugh.dickins" <hugh.dickins@...cali.co.uk>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: [RFC][PATCH 6/8] mm: handle_speculative_fault()



On Tue, 5 Jan 2010, KAMEZAWA Hiroyuki wrote:
> 
> I'm sorry if I miss something...how does this patch series avoid
> that vma is removed while __do_fault()->vma->vm_ops->fault() is called ?
> ("vma is removed" means all other things as freeing file struct etc..)

I don't think you're missing anything. 

Protecting the vma isn't enough. You need to protect the whole FS stack 
with rcu. Probably by moving _all_ of "free_vma()" into the RCU path 
(which means that the whole file/inode gets de-allocated at that later RCU 
point, rather than synchronously). Not just the actual kfree.

However, it's worth noting that that actually has some very subtle and 
major consequences. If you have a temporary file that was removed, where 
the mmap() was the last user, that kind of delayed freeing would also delay 
the final fput of that file that actually deletes it. 

Or put another way: if the vma was a writable mapping, a user may do

	munmap(mapping, size);

and the backing file is still active and writable AFTER THE MUNMAP! This 
can be a huge problem for something that wants to unmount the volume, for 
example, or depends on the whole writability-vs-executability thing. The 
user may have unmapped it, and expects the file to be immediately 
non-busy, but with the delayed free that isn't the case any more.
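
[Editor's sketch] The difference between synchronous and delayed teardown can be illustrated with a toy userspace model. This is not kernel code: every name below (toy_file, toy_vma, toy_munmap_sync, toy_munmap_deferred, toy_grace_period_end) is hypothetical, and the "grace period" is just a function call standing in for the real RCU machinery. It only shows when the file reference is dropped relative to the unmap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
struct toy_file {
	int refcount;		/* references held by fds and mappings */
};

struct toy_vma {
	struct toy_file *file;	/* backing file, NULL once released */
	bool pending_free;	/* unmapped, but teardown still deferred */
};

/* A file counts as "busy" while anything still holds a reference. */
static bool toy_file_busy(const struct toy_file *f)
{
	return f->refcount > 0;
}

/* Synchronous teardown: the reference is dropped inside the unmap. */
static void toy_munmap_sync(struct toy_vma *vma)
{
	vma->file->refcount--;
	vma->file = NULL;
}

/* Deferred teardown: the unmap only marks the vma for later freeing. */
static void toy_munmap_deferred(struct toy_vma *vma)
{
	vma->pending_free = true;
}

/* Stand-in for the end of a grace period: now the reference is dropped. */
static void toy_grace_period_end(struct toy_vma *vma)
{
	if (vma->pending_free) {
		vma->file->refcount--;
		vma->file = NULL;
		vma->pending_free = false;
	}
}
```

In this model, a file whose only reference is the mapping is non-busy as soon as toy_munmap_sync() returns, but stays busy after toy_munmap_deferred() until toy_grace_period_end() runs. That window is what an unmount or a writability check would trip over.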

In other words, now you may well need to make munmap() wait for the RCU 
grace period, so that the user who did the unmap really is synchronous wrt 
the file accesses. We've had things like that before, and they have been 
_huge_ performance problems (ie it may take just a timer tick or two, but 
then people do tens of thousands of munmaps, and now that takes many 
seconds just due to RCU grace period waiting).
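
[Editor's sketch] The arithmetic behind "many seconds" can be checked with a back-of-the-envelope estimate. The helper name and the example numbers are assumptions chosen for illustration (for instance a 1 ms tick, as with HZ=1000); real grace periods vary and are not a fixed number of ticks.

```c
/*
 * Rough estimate of the time spent waiting if every munmap() blocks for
 * one grace period. Hypothetical helper; all inputs are assumptions.
 */
static double total_wait_seconds(long munmap_calls, double tick_ms,
				 double ticks_per_grace_period)
{
	return (double)munmap_calls * tick_ms * ticks_per_grace_period / 1000.0;
}
```

Under these assumptions, 20000 serialized munmaps that each wait a single 1 ms tick already add up to about 20 seconds, and two 4 ms ticks each (HZ=250) would come to about 160 seconds.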

I would say that this whole series is _very_ far from being mergeable. 
Peter seems to have been thinking about the details, while missing all the 
subtle big picture effects that seem to actually change semantics.

		Linus