Date:	Fri, 17 Jul 2009 11:39:11 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc:	David Rientjes <rientjes@...gle.com>,
	Lee Schermerhorn <Lee.Schermerhorn@...com>,
	Miao Xie <miaox@...fujitsu.com>, Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Paul Menage <menage@...gle.com>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Yasunori Goto <y-goto@...fujitsu.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	linux-mm <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [BUG] set_mempolicy(MPOL_INTERLEAVE) causes kernel panic

On Fri, 17 Jul 2009 11:07:09 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com> wrote:

> > On Fri, 17 Jul 2009 09:04:46 +0900 (JST)
> > KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com> wrote:
> > 
> > > > On Wed, 15 Jul 2009, Lee Schermerhorn wrote:
> > > > 
> > > > > Interestingly, on ia64, the top cpuset mems_allowed gets set to all
> > > > > possible nodes, while on x86_64, it gets set to on-line nodes [or nodes
> > > > > with memory].  Maybe this is to support hot-plug?
> > > > > 
> > > > 
> > > > numactl --interleave=all simply passes a nodemask with all bits set, so if 
> > > > cpuset_current_mems_allowed includes offline nodes from node_possible_map, 
> > > > then mpol_set_nodemask() doesn't mask them off.
> > > > 
> > > > Seems like we could handle this strictly in mempolicies without worrying 
> > > > about top_cpuset like in the following?
> > > 
> > > This patch seems like a band-aid patch; it will change memory-hotplug behavior.
> > > Please imagine the following scenario:
> > > 
> > > 1. numactl --interleave=all process-A
> > > 2. memory hot-add
> > > 
> > > before 2.6.30:
> > > 		-> process-A can use hot-added memory
> > > 
> > > your proposal patch:
> > > 		-> process-A can't use hot-added memory
> > > 
> > 
> > IMHO, the application itself should be notified to change its mempolicy by a
> > hot-plug script on the host. While an application uses interleave, a newly
> > hot-added node is just noise. I think "how pages are interleaved" should not
> > be changed implicitly. Then, checking at set_mempolicy() seems sane. If
> > notified, the application can do page migration and rebuild its mapping in
> > an ideal way.
> 
> Do you really want ABI change?
> 
No ;_;

Hmm, IIUC, the current handling of a mempolicy's nodemask is as below.
There are 3 masks involved:
  - the system's N_HIGH_MEMORY
  - the mask the user specified via set_mempolicy() (remembered only when
    MPOL_F_RELATIVE_NODES is set)
  - the cpuset's mems_allowed

And pol->v.nodes is just a _cache_ of the logical AND of the above.
Synchronization with cpusets is guaranteed by the cpuset's generation counter.
Synchronization with N_HIGH_MEMORY should be guaranteed by a memory-hotplug
notifier, but this is not implemented yet.
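To illustrate, a minimal sketch of that relationship (made-up code;
recompute_effective_nodes is not an actual mm/mempolicy.c function):

	/*
	 * Illustration: pol->v.nodes as a cache of the logical AND
	 * of the three masks listed above.
	 */
	static void recompute_effective_nodes(struct mempolicy *pol,
					      const nodemask_t *user_mask,
					      const nodemask_t *cpuset_mems)
	{
		nodemask_t tmp;

		nodes_and(tmp, *user_mask, *cpuset_mems);
		nodes_and(pol->v.nodes, tmp, node_states[N_HIGH_MEMORY]);
	}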

Then, what I can tell here is...
 - remember what the user requested. (only when MPOL_F_RELATIVE_NODES ?)
 - add a notifier for memory hot-add. (only when MPOL_F_RELATIVE_NODES ?)
 - add a notifier for memory hot-remove. (both MPOL_F_STATIC/RELATIVE_NODES ?)
A rough sketch of such a notifier is below.
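To be clear, no such notifier exists yet (as noted above); this is only a
sketch of the idea. mempolicy_mem_callback and its body are invented, while
register_memory_notifier() and the MEM_ONLINE/MEM_OFFLINE actions are the
existing memory-hotplug notifier API:

	/* Illustration only -- the callback name and body are made up. */
	static int mempolicy_mem_callback(struct notifier_block *nb,
					  unsigned long action, void *arg)
	{
		switch (action) {
		case MEM_ONLINE:
			/* hot-add: refresh v.nodes from the remembered
			   user mask (MPOL_F_RELATIVE_NODES case) */
			break;
		case MEM_OFFLINE:
			/* hot-remove: drop the node from v.nodes for both
			   MPOL_F_STATIC/RELATIVE_NODES policies */
			break;
		}
		return NOTIFY_OK;
	}

	static struct notifier_block mempolicy_mem_nb = {
		.notifier_call = mempolicy_mem_callback,
	};

	/* registered once at init time: */
	register_memory_notifier(&mempolicy_mem_nb);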

IMHO, for cpusets, not recalculating v.nodes when MPOL_F_STATIC is set is fine.
But for N_HIGH_MEMORY, v.nodes should be recalculated even if MPOL_F_STATIC is set.

Then, I think the mask the user passed should be remembered even if MPOL_F_STATIC
is set, and v.nodes should work as a cache, updated in an appropriate way.
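As a sketch of that flag-dependent refresh (again illustration only;
refresh_cached_nodes is a made-up name, and this assumes the user's original
mask is kept around, e.g. in the existing pol->w.user_nodemask field):

	/*
	 * v.nodes is refreshed from the remembered user mask: always
	 * clipped to nodes that have memory, and clipped to the
	 * cpuset's mask only for non-static policies.
	 */
	static void refresh_cached_nodes(struct mempolicy *pol,
					 const nodemask_t *cpuset_mems)
	{
		nodemask_t allowed = node_states[N_HIGH_MEMORY];

		if (!(pol->flags & MPOL_F_STATIC_NODES))
			nodes_and(allowed, allowed, *cpuset_mems);

		nodes_and(pol->v.nodes, pol->w.user_nodemask, allowed);
	}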

Thanks,
-Kame