lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120705130902.GF7881@cmpxchg.org>
Date:	Thu, 5 Jul 2012 15:09:02 +0200
From:	Johannes Weiner <hannes@...xchg.org>
To:	Rik van Riel <riel@...hat.com>
Cc:	Andrea Arcangeli <aarcange@...hat.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Hillf Danton <dhillf@...il.com>, Dan Smith <danms@...ibm.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>, Paul Turner <pjt@...gle.com>,
	Suresh Siddha <suresh.b.siddha@...el.com>,
	Mike Galbraith <efault@....de>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Bharata B Rao <bharata.rao@...il.com>,
	Lee Schermerhorn <Lee.Schermerhorn@...com>,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Christoph Lameter <cl@...ux.com>,
	Alex Shi <alex.shi@...el.com>,
	Mauricio Faria de Oliveira <mauricfo@...ux.vnet.ibm.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Don Morris <don.morris@...com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Subject: Re: [PATCH 09/40] autonuma: introduce kthread_bind_node()

On Fri, Jun 29, 2012 at 12:58:01PM -0400, Rik van Riel wrote:
> On 06/29/2012 12:38 PM, Andrea Arcangeli wrote:
> >On Fri, Jun 29, 2012 at 11:36:26AM -0400, Rik van Riel wrote:
> >>On 06/28/2012 08:55 AM, Andrea Arcangeli wrote:
> >>
> >>>--- a/include/linux/sched.h
> >>>+++ b/include/linux/sched.h
> >>>@@ -1792,7 +1792,7 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
> >>>   #define PF_SWAPWRITE	0x00800000	/* Allowed to write to swap */
> >>>   #define PF_SPREAD_PAGE	0x01000000	/* Spread page cache over cpuset */
> >>>   #define PF_SPREAD_SLAB	0x02000000	/* Spread some slab caches over cpuset */
> >>>-#define PF_THREAD_BOUND	0x04000000	/* Thread bound to specific cpu */
> >>>+#define PF_THREAD_BOUND	0x04000000	/* Thread bound to specific cpus */
> >>>   #define PF_MCE_EARLY    0x08000000      /* Early kill for mce process policy */
> >>>   #define PF_MEMPOLICY	0x10000000	/* Non-default NUMA mempolicy */
> >>>   #define PF_MUTEX_TESTER	0x20000000	/* Thread belongs to the rt mutex tester */
> >>
> >>Changing the semantics of PF_THREAD_BOUND without so much as
> >>a comment in your changelog or buy-in from the scheduler
> >>maintainers is a big no-no.
> >>
> >>Is there any reason you even need PF_THREAD_BOUND in your
> >>kernel numa threads?
> >>
> >>I do not see much at all in the scheduler code that uses
> >>PF_THREAD_BOUND and it is not clear at all that your
> >>numa threads get any benefit from them...
> >>
> >>Why do you think you need it?
> 
> >This flag is only used to prevent userland to mess with the kernel CPU
> >binds of kernel threads. It is used to avoid the root user to shoot
> >itself in the foot.
> >
> >So far it has been used to prevent changing bindings to a single
> >CPU. I'm setting it also after making a multiple-cpu bind (all CPUs of
> >the node, instead of just 1 CPU).
> 
> Fair enough.  Looking at the scheduler code some more, I
> see that all PF_THREAD_BOUND seems to do is block userspace
> from changing a thread's CPU bindings.
> 
> Peter and Ingo, what is the special magic in PF_THREAD_BOUND
> that should make it only apply to kernel threads that are bound
> to a single CPU?

In the very first review iteration of AutoNUMA, Peter argued that the
scheduler people want to use this flag in other places where they rely
on this thing meaning a single cpu, not a group of them (check out the
cpumask test in debug_smp_processor_id() in lib/smp_processor_id.c).

He also argued that preventing root from rebinding the numa daemons is
not critical to this feature at all.  And I have to agree.

I certainly think this is NOT the change to make a stand about in this
patch set, seriously.  Not about a nice-to-have thing like this that
doesn't really hurt dropping but does create contention.

It can always be a separate effort to bring in such a flag that would
allow it to be used by other daemons, but this really should be a
separate effort and I don't think anything is really lost by dropping
the change from this series.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ