Message-ID: <Pine.LNX.4.64.0803071155100.6815@schroedinger.engr.sgi.com>
Date:	Fri, 7 Mar 2008 12:00:26 -0800 (PST)
From:	Christoph Lameter <clameter@....com>
To:	Andrea Arcangeli <andrea@...ranet.com>
cc:	Jack Steiner <steiner@....com>, Nick Piggin <npiggin@...e.de>,
	akpm@...ux-foundation.org, Robin Holt <holt@....com>,
	Avi Kivity <avi@...ranet.com>, kvm-devel@...ts.sourceforge.net,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	general@...ts.openfabrics.org,
	Steve Wise <swise@...ngridcomputing.com>,
	Roland Dreier <rdreier@...co.com>,
	Kanoj Sarcar <kanojsarcar@...oo.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	daniel.blueman@...drics.com
Subject: Re: [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier
 methods to sleep (#v9 was 1/4)

On Fri, 7 Mar 2008, Andrea Arcangeli wrote:

> This combines the non-sleep-capable RCU locking of #v9 with a seqlock
> so the mmu notifier fast path will require zero cacheline
> writes/bouncing while still providing mmu_notifier_unregister and
> allowing the mmu notifier methods to schedule. If we drop
> mmu_notifier_unregister we can just as well drop all the seqlock and
> rcu_read_lock()s. But this locking scheme combination is sexy enough
> and 100% scalable (the mmu_notifier_list cacheline will be preloaded
> anyway, and that will most certainly include the sequence number value
> in L1 for free even in Christoph's NUMA systems), so IMHO it's worth
> keeping mmu_notifier_unregister.

Well, it adds lots of processing. Not sure if it's really worth it. It also 
seems that this scheme cannot work: the existence of the structure passed 
to the callbacks is not guaranteed once the RCU lock is dropped. You need 
some kind of refcount to give that existence guarantee.
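
To make that concrete, something along these lines would do it (rough 
sketch only, not from the posted patch; mmu_notifier_get/put and 
mmu_notifier_free are made-up names, and the real free would be done by 
the owning driver):

#include <linux/kref.h>
#include <linux/list.h>
#include <linux/slab.h>

struct mmu_notifier {
	struct hlist_node hlist;
	const struct mmu_notifier_ops *ops;
	struct kref ref;	/* kref_init() at registration time */
};

static void mmu_notifier_free(struct kref *ref)
{
	/* In reality the owner (the driver) would free its own object. */
	kfree(container_of(ref, struct mmu_notifier, ref));
}

static void mmu_notifier_get(struct mmu_notifier *mn)
{
	kref_get(&mn->ref);
}

static void mmu_notifier_put(struct mmu_notifier *mn)
{
	kref_put(&mn->ref, mmu_notifier_free);
}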

> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -20,7 +20,9 @@ void __mmu_notifier_release(struct mm_st
>  void __mmu_notifier_release(struct mm_struct *mm)
>  {
>  	struct mmu_notifier *mn;
> +	unsigned seq;
>  
> +	seq = read_seqbegin(&mm->mmu_notifier_lock);
>  	while (unlikely(!hlist_empty(&mm->mmu_notifier_list))) {
>  		mn = hlist_entry(mm->mmu_notifier_list.first,
>  				 struct mmu_notifier,
> @@ -28,6 +30,7 @@ void __mmu_notifier_release(struct mm_st
>  		hlist_del(&mn->hlist);
>  		if (mn->ops->release)
>  			mn->ops->release(mn, mm);
> +		BUG_ON(read_seqretry(&mm->mmu_notifier_lock, seq));
>  	}
>  }

So this is only for sanity checking? The BUG_ON detects concurrent 
operations that should not happen? Need a comment here.
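
If it really is just a sanity check, a comment along these lines (my 
wording, just guessing at the intent) would answer the question:

	/*
	 * Sanity check only: once __mmu_notifier_release() runs, nobody
	 * is supposed to register or unregister notifiers anymore, so
	 * the sequence count must not change while ->release() is called.
	 */
	BUG_ON(read_seqretry(&mm->mmu_notifier_lock, seq));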


> @@ -42,11 +45,19 @@ int __mmu_notifier_clear_flush_young(str
>  	struct mmu_notifier *mn;
>  	struct hlist_node *n;
>  	int young = 0;
> +	unsigned seq;
>  
>  	rcu_read_lock();
> +restart:
> +	seq = read_seqbegin(&mm->mmu_notifier_lock);
>  	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
> -		if (mn->ops->clear_flush_young)
> +		if (mn->ops->clear_flush_young) {
> +			rcu_read_unlock();
>  			young |= mn->ops->clear_flush_young(mn, mm, address);
> +			rcu_read_lock();
> +		}
> +		if (read_seqretry(&mm->mmu_notifier_lock, seq))
> +			goto restart;

Great innovative idea, using the seqlock for versioning checks.

>  	}
>  	rcu_read_unlock();
>  

Well, that gets pretty sophisticated. If you drop the RCU lock, then the 
entity pointed to by mn can go away, right? So how can you pass that 
structure to clear_flush_young? What guarantees the existence of the 
structure?
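
With a refcount as suggested above, the window could be closed by pinning 
mn before leaving the RCU read side, roughly like this (sketch only, 
reusing the made-up mmu_notifier_get/put helpers from above, not the 
posted patch):

int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
				     unsigned long address)
{
	struct mmu_notifier *mn;
	struct hlist_node *n;
	int young = 0;
	unsigned seq;

	rcu_read_lock();
restart:
	seq = read_seqbegin(&mm->mmu_notifier_lock);
	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
		if (mn->ops->clear_flush_young) {
			/* Pin mn before dropping the RCU read lock. */
			mmu_notifier_get(mn);
			rcu_read_unlock();
			young |= mn->ops->clear_flush_young(mn, mm, address);
			rcu_read_lock();
			/*
			 * If mn was unregistered meanwhile, the unregister
			 * side must have bumped the seqlock, so the retry
			 * below fires before the list is walked past a
			 * possibly freed mn.
			 */
			mmu_notifier_put(mn);
		}
		if (read_seqretry(&mm->mmu_notifier_lock, seq))
			goto restart;
	}
	rcu_read_unlock();

	return young;
}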


> @@ -58,11 +69,19 @@ void __mmu_notifier_invalidate_page(stru
>  {
>  	struct mmu_notifier *mn;
>  	struct hlist_node *n;
> +	unsigned seq;
>  
>  	rcu_read_lock();
> +restart:
> +	seq = read_seqbegin(&mm->mmu_notifier_lock);
>  	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
> -		if (mn->ops->invalidate_page)
> +		if (mn->ops->invalidate_page) {
> +			rcu_read_unlock();
>  			mn->ops->invalidate_page(mn, mm, address);

Ditto: the structure can vanish here as well, since there is no existence 
guarantee.


