linux-kernel - Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function

Message-ID: <48C54925.8040409@sgi.com>
Date:	Mon, 08 Sep 2008 08:47:49 -0700
From:	Mike Travis <travis@....com>
To:	Nick Piggin <nickpiggin@...oo.com.au>
CC:	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jack Steiner <steiner@....com>, Jes Sorensen <jes@....com>,
	David Miller <davem@...emloft.net>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask

Nick Piggin wrote:
> On Sunday 07 September 2008 04:12, Mike Travis wrote:
>> Ingo Molnar wrote:
>>> * Mike Travis <travis@....com> wrote:
>>>>   * Cleanup cpumask_t usages in smp_call_function_mask function chain
>>>>     to prevent stack overflow problem when NR_CPUS=4096.
>>>>
>>>>   * Reduce the number of passed cpumask_t variables in the following
>>>>     call chain for x86_64:
>>>>
>>>> 	smp_call_function_mask -->
>>>> 	    arch_send_call_function_ipi -->
>>>> 		    smp_ops.send_call_func_ipi -->
>>>> 			    genapic->send_IPI_mask
>>>>
>>>>     Since the smp_call_function_mask() is an EXPORTED function, we
>>>>     cannot change its calling interface for a patch to 2.6.27.
>>>>
>>>>     The smp_ops.send_call_func_ipi interface is internal only and
>>>>     has two arch provided functions:
>>>>
>>>> 	arch/x86/kernel/smp.c:  .send_call_func_ipi = native_send_call_func_ipi
>>>> 	arch/x86/xen/smp.c:     .send_call_func_ipi = xen_smp_send_call_function_ipi
>>>> 	arch/x86/mach-voyager/voyager_smp.c:    (uses native_send_call_func_ipi)
>>>>
>>>>     Therefore modifying the internal interface to use a cpumask_t
>>>>     pointer is straight-forward.
>>>>
>>>>     The changes to genapic are much more extensive and are affected by
>>>>     the recent additions of the x2apic modes, so they will be done for
>>>>     2.6.28 only.
>>>>
>>>> Based on 2.6.27-rc5-git6.
>>>>
>>>> Applies to linux-2.6.tip/master (with FUZZ).
>>> applied to tip/cpus4096, thanks Mike.
>> Thanks Ingo!  Could you send me the git id for the merge?
>>
>>> I'm still wondering whether we should get rid of non-reference based
>>> cpumask_t altogether ...
>> I've got a whole slew of "get-ready-to-remove-cpumask_t's" coming soon.
>> There are two phases, one completely within the x86 arch and the 2nd hits
>> the generic smp_call_function_mask ABI (won't be doable as a back-ported
>> patch to 2.6.27.)
>>
>>> Did you have a chance to look at the ftrace/stacktrace tracer in latest
>>> tip/master, which will show the maximum stack footprint that can occur?
>> Hmm, no.  I'm using a default config right now as I can boot that pretty
>> easily.  I'll turn on the ftrace thing and check it out.
>>
>>> Also, i've applied the patch below as well to restore MAXSMP in a muted
>>> form - with big warning signs added as well.
>> The main thing is to allow the distros to set it manually for their QA
>> testing of 2.6.27.  I'm sure I'll get back bugs because of just that.
>>
>> (Is there a way to have them know to assign bugzilla's to me if NR_CPUS=4k
>> is the root of the problem?  This is an extremely serious issue for SGI
>> and I'd like to avoid any delays in me finding out about problems.)
> 
> Considering that, unless I'm mistaken, you want to run production systems
> with 4096 CPUs at some point, then I would say you should really consider
> increasing NR_CPUS _further_ than that in QA efforts, so that we might be
> a bit more confident of running production kernels with 4096.
> 
> Is that being tried? Setting it to 8192 or even higher during QA seems
> like a good idea to me.


That's a good idea.  I do occasionally set it to 16k (and 64k) for experimental
reasons (and to really highlight where cpumask_t space hogs reside), but I
hadn't thought to do it in the QA environment.
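
For a sense of scale, here is a rough userspace sketch (not code from the
patch set) of why a by-value cpumask_t gets expensive as NR_CPUS grows.
It assumes the mask is a plain bitmap with one bit per possible CPU; every
demo_* / DEMO_* name is made up for illustration and is not a kernel
identifier.  Under that assumption one mask costs 512 bytes at
NR_CPUS=4096, 2 KiB at 16k and 8 KiB at 64k, and each by-value argument
or local copy in a call chain pays that again, while a pointer costs only
the size of an address.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for cpumask_t: a fixed bitmap, one bit per CPU. */
#define DEMO_NR_CPUS        4096
#define DEMO_BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_LONGS(n)       (((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_cpumask {
	unsigned long bits[DEMO_LONGS(DEMO_NR_CPUS)];
};

/* Bytes needed for a bitmap covering nr_cpus CPUs, rounded up to whole longs. */
static size_t demo_mask_bytes(size_t nr_cpus)
{
	assert(nr_cpus > 0);
	return DEMO_LONGS(nr_cpus) * sizeof(unsigned long);
}

/* By value: the whole mask is copied for the callee. */
static size_t demo_arg_bytes_by_value(struct demo_cpumask mask)
{
	return sizeof(mask);
}

/* By pointer: only an address is passed, whatever DEMO_NR_CPUS is. */
static size_t demo_arg_bytes_by_pointer(const struct demo_cpumask *mask)
{
	return sizeof(mask);
}
```

Calling demo_mask_bytes() with 4096, 16384 and 65536 gives the per-mask
sizes quoted above.  The two demo_arg_bytes_* helpers only report the size
of their own parameter, which is enough to show the gap between copying a
mask and passing its address.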

Thanks,
Mike