linux-kernel - Re: [RFC][PATCH] lru_add_drain_all() don't use schedule_on_each

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20081027145509.ebffcf0e.akpm@linux-foundation.org>
Date:	Mon, 27 Oct 2008 14:55:09 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc:	heiko.carstens@...ibm.com, kosaki.motohiro@...fujitsu.com,
	npiggin@...e.de, linux-kernel@...r.kernel.org, hugh@...itas.com,
	torvalds@...ux-foundation.org, riel@...hat.com,
	lee.schermerhorn@...com, linux-mm@...ck.org,
	cl@...ux-foundation.org
Subject: Re: [RFC][PATCH] lru_add_drain_all() don't use
 schedule_on_each_cpu()

On Fri, 24 Oct 2008 00:00:17 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com> wrote:

> Hi Heiko,
> 
> > >> I think the following part of your patch:
> > >>
> > >>> diff --git a/mm/swap.c b/mm/swap.c
> > >>> index fee6b97..bc58c13 100644
> > >>> --- a/mm/swap.c
> > >>> +++ b/mm/swap.c
> > >>> @@ -278,7 +278,7 @@ void lru_add_drain(void)
> > >>>       put_cpu();
> > >>>  }
> > >>>
> > >>> -#ifdef CONFIG_NUMA
> > >>> +#if defined(CONFIG_NUMA) || defined(CONFIG_UNEVICTABLE_LRU)
> > >>>  static void lru_add_drain_per_cpu(struct work_struct *dummy)
> > >>>  {
> > >>>       lru_add_drain();
> > >>
> > >> causes this (allyesconfig on s390):
> > >
> > > hm,
> > >
> > > I don't think so.
> > >
> > > Actually, this patch has
> > >   mmap_sem -> lru_add_drain_all() dependency.
> > >
> > > but its dependency already exist in another place.
> > > example,
> > >
> > >  sys_move_pages()
> > >      do_move_pages()  <- down_read(mmap_sem)
> > >          migrate_prep()
> > >               lru_add_drain_all()

Can we fix that instead?

> ...
>
> It because following three circular locking dependency.
> 
> Some VM place has
>       mmap_sem -> kevent_wq via lru_add_drain_all()
> 
> net/core/dev.c::dev_ioctl()  has
>      rtnl_lock  ->  mmap_sem        (*) the ioctl has copy_from_user() and it can do page fault.
> 
> linkwatch_event has
>      kevent_wq -> rtnl_lock
> 
> 
> Actually, schedule_on_each_cpu() is very problematic function.
> it introduce the dependency of all worker on keventd_wq, 
> but we can't know what lock held by worker in kevend_wq because
> keventd_wq is widely used out of kernel drivers too.
> 
> So, the task of any lock held shouldn't wait on keventd_wq.
> Its task should use own special purpose work queue.
> 

Or we change the callers of lru_add_drain_all() to call it without
holding any locks.  I mean, what's the *point* in calling it with
mmap_sem held?  That won't stop threads from adding new pages into the
pagevecs.


>  #endif
> +
> +	vm_wq = create_workqueue("vm_work");
> +	BUG_ON(!vm_wq);
> +
>  }

Because it's pretty sad to add yet another kernel thread on each CPU
(thousands!) just because of some obscure theoretical deadlock in
page-migration and memory-hotplug.  Most people don't even use those.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/