linux-kernel - Re: [PATCH 2/7] KAMEZAWA Hiroyuki

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20070531212606.c5acd10c.kamezawa.hiroyu@jp.fujitsu.com>
Date:	Thu, 31 May 2007 21:26:06 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	Hugh Dickins <hugh@...itas.com>
Cc:	mel@....ul.ie, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	clameter@....com
Subject: Re: [PATCH 2/7] KAMEZAWA Hiroyuki - migration by kernel

On Wed, 30 May 2007 20:57:38 +0100 (BST)
Hugh Dickins <hugh@...itas.com> wrote:

> I've taken a look at last.  It looks like a good fix to a real problem,
> but may I suggest a simpler version?  The anon_vma isn't usually held
> by a refcount, but by having a vma on its linked list: why not just
> put a dummy vma into that linked list?  No need to add a refcount.
> 
> The NUMA shmem_alloc_page already uses a dummy vma on its stack,
Oh, I didn't notice that. If dummy-vma works well now, I'll use it.
thank you.

>
> >  static int unmap_and_move(new_page_t get_new_page, unsigned long private,
> > -			struct page *page, int force)
> > +			struct page *page, int force, int nocontext)
> >  {
> 
> An "int context" would be a lot better than the negative "int nocontext";
> even better would be "int holds_mmap_sem".  Or even skip the additional
> argument completely, use the anon_vma_hold method always without relying
> on whether or not mmap_sem is held.  I don't know how significant it is
> to avoid extra locking here: on the one hand we like to avoid unnecessary
> locking; on the other hand there's probably a thousand commoner places in
> the kernel where we could pass down an arg to say, actually you won't
> need to lock in such and such a case.
Hmm, ok. I'd like to try make things simpler.

> 
> >  	int rc = 0;
> >  	int *result = NULL;
> >  	struct page *newpage = get_new_page(page, private, &result);
> > +	struct anon_vma *anon_vma = NULL;
> >  
> >  	if (!newpage)
> >  		return -ENOMEM;
> > @@ -632,17 +633,23 @@ static int unmap_and_move(new_page_t get
> >  			goto unlock;
> >  		wait_on_page_writeback(page);
> >  	}
> > -
> > +	/* hold this anon_vma until page migration ends */
> > +	if (nocontext && PageAnon(page) && page_mapped(page))
> > +		anon_vma = anon_vma_hold(page);
> >  	/*
> >  	 * Establish migration ptes or remove ptes
> >  	 */
> > -	try_to_unmap(page, 1);
> > +	if (page_mapped(page))
> > +		try_to_unmap(page, 1);
> > +
> 
> All these preliminary tests: yes, I suppose they avoid unnecessary
> locking, so I guess they're good; but it should work without them.
> 
> >  	if (!page_mapped(page))
> >  		rc = move_to_new_page(newpage, page);
> >  
> >  	if (rc)
> >  		remove_migration_ptes(page, page);
> >  
> > +	anon_vma_release(anon_vma);
> > +
> >  unlock:
> >  	unlock_page(page);
> >  
> > @@ -686,8 +693,8 @@ move_newpage:
> >   *
> >   * Return: Number of pages not migrated or error code.
> >   */
> > -int migrate_pages(struct list_head *from,
> > -		new_page_t get_new_page, unsigned long private)
> > +int __migrate_pages(struct list_head *from,
> > +		new_page_t get_new_page, unsigned long private, int nocontext)
> >  {
> 
> Remarks on nocontext as above: mmm, I think keep the patch small
> and don't add that extra argument at all.
> 
> >  	int retry = 1;
> >  	int nr_failed = 0;
> > @@ -707,7 +714,7 @@ int migrate_pages(struct list_head *from
> >  			cond_resched();
> >  
> >  			rc = unmap_and_move(get_new_page, private,
> > -						page, pass > 2);
> > +						page, pass > 2, nocontext);
> >  
> >  			switch(rc) {
> >  			case -ENOMEM:
> > @@ -737,6 +744,22 @@ out:
> >  	return nr_failed + retry;
> >  }
> >  
> > +int migrate_pages(struct list_head *from,
> > +	new_page_t get_new_page, unsigned long private)
> > +{
> > +	return __migrate_pages(from, get_new_page, private, 0);
> > +}
> > +
> > +/*
> > + * When page migration is issued by the kernel itself without page mapper's
> > + * mm->sem, we have to be more careful to do page migration.
> > + */
> > +int migrate_pages_nocontext(struct list_head *from,
> > +	new_page_t get_new_page, unsigned long private)
> > +{
> > +	return __migrate_pages(from, get_new_page, private, 1);
> > +}
> > +
> >  #ifdef CONFIG_NUMA
> >  /*
> >   * Move a list of individual pages
> > Index: linux-2.6.22-rc2-mm1/include/linux/rmap.h
> > ===================================================================
> > --- linux-2.6.22-rc2-mm1.orig/include/linux/rmap.h
> > +++ linux-2.6.22-rc2-mm1/include/linux/rmap.h
> > @@ -26,6 +26,9 @@
> >  struct anon_vma {
> >  	spinlock_t lock;	/* Serialize access to vma list */
> >  	struct list_head head;	/* List of private "related" vmas */
> > +#ifdef CONFIG_MIGRATION
> > +	int	ref;	/* special refcnt for migration */
> > +#endif
> >  };
> >  
> >  #ifdef CONFIG_MMU
> > @@ -42,6 +45,14 @@ static inline void anon_vma_free(struct 
> >  	kmem_cache_free(anon_vma_cachep, anon_vma);
> >  }
> >  
> > +#ifdef  CONFIG_MIGRATION
> > +extern struct anon_vma *anon_vma_hold(struct page *page);
> > +extern void anon_vma_release(struct anon_vma *anon_vma);
> > +#else
> > +#define anon_vma_hold(page)     do{}while(0)
> > +#define anon_vma_release(anon)  do{}while(0)
> 
> Rather than change those to "do {} while (0)", to which others
> will ask for static inlines, just delete them, can't you -
> they're simply not needed in the !CONFIG_MIGRATION case, right?
> 
Ok. they are not necessary if !CONFIG_MIGRATION. I'll delete.

Maybe I was confused at deleting CONFIG_MIGRATON_BY_KERNEL...which needed ifdef.
 
Thank you!.
-Kame

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/