Message-ID: <20130417074835.GB31607@gmail.com>
Date:	Wed, 17 Apr 2013 09:48:35 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Robin Holt <holt@....com>
Cc:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...hat.com>, Russ Anderson <rja@....com>,
	Shawn Guo <shawn.guo@...aro.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"H. Peter Anvin" <hpa@...or.com>, Joe Perches <joe@...ches.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Michel Lespinasse <walken@...gle.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Paul Mackerras <paulus@...ba.org>,
	Peter Zijlstra <peterz@...radead.org>,
	"rusty@...tcorp.com.au" <rusty@...tcorp.com.au>,
	Tejun Heo <tj@...nel.org>,
	the arch/x86 maintainers <x86@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [Patch -v4 1/4] Migrate shutdown/reboot to boot cpu.


* Robin Holt <holt@....com> wrote:

> On Tue, Apr 16, 2013 at 09:18:07PM +0530, Srivatsa S. Bhat wrote:
> > On 04/16/2013 05:36 PM, Robin Holt wrote:
> > > On Tue, Apr 16, 2013 at 01:32:56PM +0200, Ingo Molnar wrote:
> > >>
> > >> * Robin Holt <holt@....com> wrote:
> > >>
> > >>> We recently noticed that reboot of a 1024 cpu machine takes approx 16
> > >>> minutes of just stopping the cpus.  The slowdown was tracked to commit
> > >>> f96972f.
> > >>>
> > >>> The current implementation does all the work of hot removing the cpus
> > >>> before halting the system.  We are switching to just migrating to the
> > >>> boot cpu and then continuing with shutdown/reboot.
> > >>>
> > >>> This also has the effect of not breaking x86's command line parameter for
> > >>> specifying the reboot cpu.  Note, this code was shamelessly copied from
> > >>> arch/x86/kernel/reboot.c with bits removed pertaining to the reboot_cpu
> > >>> command line parameter.
> > >>>
> > >>> Signed-off-by: Robin Holt <holt@....com>
> > >>> Tested-by: Shawn Guo <shawn.guo@...aro.org>
> > >>> To: Ingo Molnar <mingo@...hat.com>
> > >>> To: Russ Anderson <rja@....com>
> > >>> To: Oleg Nesterov <oleg@...hat.com>
> > >>> Cc: Andrew Morton <akpm@...ux-foundation.org>
> > >>> Cc: "H. Peter Anvin" <hpa@...or.com>
> > >>> Cc: Lai Jiangshan <laijs@...fujitsu.com>
> > >>> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> > >>> Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
> > >>> Cc: Michel Lespinasse <walken@...gle.com>
> > >>> Cc: Oleg Nesterov <oleg@...hat.com>
> > >>> Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> > >>> Cc: Paul Mackerras <paulus@...ba.org>
> > >>> Cc: Peter Zijlstra <peterz@...radead.org>
> > >>> Cc: Robin Holt <holt@....com>
> > >>> Cc: "rusty@...tcorp.com.au" <rusty@...tcorp.com.au>
> > >>> Cc: Tejun Heo <tj@...nel.org>
> > >>> Cc: the arch/x86 maintainers <x86@...nel.org>
> > >>> Cc: Thomas Gleixner <tglx@...utronix.de>
> > >>> Cc: <stable@...r.kernel.org>
> > >>>
> > >>> ---
> > >>>
> > >>> Changes since -v1.
> > >>> - Set PF_THREAD_BOUND before migrating to eliminate potential race.
> > >>> - Modified kernel_power_off to also migrate instead of using
> > >>>   disable_nonboot_cpus().
> > >>> ---
> > >>>  kernel/sys.c | 22 +++++++++++++++++++---
> > >>>  1 file changed, 19 insertions(+), 3 deletions(-)
> > >>>
> > >>> diff --git a/kernel/sys.c b/kernel/sys.c
> > >>> index 0da73cf..5ef7aa2 100644
> > >>> --- a/kernel/sys.c
> > >>> +++ b/kernel/sys.c
> > >>> @@ -357,6 +357,22 @@ int unregister_reboot_notifier(struct notifier_block *nb)
> > >>>  }
> > >>>  EXPORT_SYMBOL(unregister_reboot_notifier);
> > >>>  
> > >>> +void migrate_to_reboot_cpu(void)
> > >>
> > >> It appears to be file-scope, so should be static I guess?
> > > 
> > > Done.
> > > 
> > >>> +{
> > >>> +	/* The boot cpu is always logical cpu 0 */
> > >>> +	int reboot_cpu_id = 0;
> > >>> +
> > >>> +	/* Make certain the cpu I'm about to reboot on is online */
> > >>> +	if (!cpu_online(reboot_cpu_id))
> > >>> +		reboot_cpu_id = smp_processor_id();
> > >>
> > >> Shouldn't we pick the first online CPU instead, to make it deterministic?
> > > 
> > > Done.
> > > 
> > > 		reboot_cpu_id = cpumask_first(cpu_online_mask);
> > > 
> > 
> > Let me ask again: if CPU 0 (or whatever the preferred reboot cpu is)
> > is offline, then why should we even bother pinning the task to (another)
> > CPU? Why not just proceed with the reboot?
> 
> No idea.  I copied it from the arch/x86 code.  I can not defend it.

I'd say it's a quality-of-implementation improvement if the choice of CPU is
deterministic, as long as the current configuration of CPUs is deterministic.

I.e. instead of 'reboot on the first CPU, or a random CPU', make the rule 'reboot 
on the first online CPU'. That's a simple rule to think about.

( On most architectures CPU#0 cannot be unplugged, so the rule will effectively be
  'reboot on CPU#0', like the current upstream behavior. )
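
To make the rule concrete, here is a minimal user-space sketch of 'reboot on the first online CPU'. It is not the kernel code: the 64-bit mask stands in for cpu_online_mask, and SKETCH_NR_CPUS, sketch_first_online() and sketch_pick_reboot_cpu() are hypothetical names made up for illustration. Like cpumask_first(), the helper returns a value >= the CPU count when no bit is set.

```c
/*
 * User-space sketch only, NOT kernel code. The mask type and all
 * sketch_* names are assumptions made for illustration; the kernel
 * equivalent discussed in the thread is cpumask_first(cpu_online_mask).
 */
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 64

/* Return the index of the lowest set bit, or SKETCH_NR_CPUS if none. */
int sketch_first_online(uint64_t online_mask)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (online_mask & (UINT64_C(1) << cpu))
			return cpu;
	return SKETCH_NR_CPUS;
}

/*
 * The rule: always reboot on the first online CPU. The result depends
 * only on the online mask, never on which CPU the caller happens to
 * be running on.
 */
int sketch_pick_reboot_cpu(uint64_t online_mask)
{
	return sketch_first_online(online_mask);
}
```

With CPUs 0-3 online (mask 0xF) this picks CPU 0; with CPU 0 offline (mask 0xE) it picks CPU 1 regardless of where the rebooting task currently runs, which is the determinism argued for above.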

Thanks,

	Ingo
