linux-kernel - Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c

Message-ID: <20080826075355.GA7596@elte.hu>
Date:	Tue, 26 Aug 2008 09:53:55 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	David Miller <davem@...emloft.net>
Cc:	torvalds@...ux-foundation.org, Alan.Brunelle@...com,
	travis@....com, tglx@...utronix.de, rjw@...k.pl,
	linux-kernel@...r.kernel.org, kernel-testers@...r.kernel.org,
	akpm@...ux-foundation.org, arjan@...ux.intel.com,
	rusty@...tcorp.com.au
Subject: Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c -
	bisected

* David Miller <davem@...emloft.net> wrote:

> From: Ingo Molnar <mingo@...e.hu>
> Date: Tue, 26 Aug 2008 09:22:20 +0200
> 
> > And i guess the next generation of 4K-CPU support should just get away 
> > from cpumask_t-on-kernel-stack model altogether, as the current model is 
> > not maintainable. We tried the on-kernel-stack variant, and it really 
> > does not work reliably. We can fix this in v2.6.28.
> 
> I recently did some work on sparc64 to use cpumask pointers as much 
> as possible.
> 
> The only case that didn't work was due to a limitation in arch 
> interfaces for the new generic smp_call_function() code. It passes a 
> cpumask_t instead of a pointer to one via 
> arch_send_call_function_ipi().
> 
> But other than that, the whole sparc64 SMP stuff uses cpumask_t 
> pointers only.

nice!

> What it comes down to is that you have to do the "self cpu" and other 
> tests in the cross-call dispatch routines themselves, instead of at 
> the top-level working on cpumask_t objects.
> 
> Otherwise you have to modify cpumask_t objects and thus pluck them 
> onto the stack where they take up silly amounts of space.
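
To make the quoted point concrete, here is a minimal userspace sketch, 
not the sparc64 code. It assumes 4096 possible CPUs; the names 
cpumask_sketch, mask_set, mask_test, note_ipi and cross_call_dispatch 
are hypothetical and exist only for illustration. The dispatch routine 
takes a const pointer and skips the calling CPU inside its loop, so no 
caller has to copy the 512-byte mask onto its stack just to clear one 
bit.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: 4096 possible CPUs, as in the "4K CPUs" case discussed here. */
#define NR_CPUS        4096u
#define BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS     ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Stand-in for cpumask_t: 4096 bits, i.e. 512 bytes per copy. */
struct cpumask_sketch {
	unsigned long bits[MASK_LONGS];
};

/* Set the bit for one CPU in the mask. */
static void mask_set(struct cpumask_sketch *m, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	m->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

/* Return 1 if the bit for this CPU is set, 0 otherwise. */
static int mask_test(const struct cpumask_sketch *m, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return (int)((m->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL);
}

/* Hypothetical IPI stub: only remembers the last CPU it was asked to signal. */
static unsigned int last_ipi_target;

static void note_ipi(unsigned int cpu)
{
	last_ipi_target = cpu;
}

/*
 * Hypothetical dispatch routine. The mask is read-only and passed by
 * pointer; the "self cpu" test happens here, per CPU, instead of the
 * caller building a modified copy of the mask.
 * Returns the number of CPUs that were signalled.
 */
static unsigned int cross_call_dispatch(const struct cpumask_sketch *mask,
					unsigned int self,
					void (*send_ipi)(unsigned int cpu))
{
	unsigned int cpu, sent = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == self || !mask_test(mask, cpu))
			continue;
		send_ipi(cpu);
		sent++;
	}
	return sent;
}
```

The alternative, copying the mask and clearing the caller's bit before 
dispatch, costs one full 512-byte object per stack frame that does it.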

What we did was this: we added MAXSMP which just revs up all the SMP 
tunables to the maximum, so that we can see any problems early in 
testing.

And we triggered problems, and we fixed a couple of regressions all 
around stack footprint. But we didn't catch all of them - some were gcc 
version dependent and configuration dependent. So i think it's safe to 
say that the whole concept of allowing such a large cpumask_t to be on 
the stack is fragile.

Hence, i think the best way forward is to change the whole cpumask_t 
concept and disallow explicit masks altogether. It's so easy to smack a 
cpumask_t variable on the stack and nothing really warns about it, and 
any function can become part of a nested call sequence.

So i think the dynamics of it has to be changed: we need a get/put API 
and we need to make on-stack cpumask illegal on the build level (in 
generic code at least). This has been Rusty's main argument early on i 
think, and i now concur.
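
As a rough illustration of the get/put idea, here is a minimal 
userspace sketch. It is an assumption about the shape such an API could 
take, not a proposed kernel interface: cpumask_get_sketch, 
cpumask_put_sketch, masks_outstanding and example_user are hypothetical 
names, and calloc()/free() merely stand in for whatever allocator would 
really be used.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Assumption: 4096 possible CPUs. */
#define NR_CPUS        4096u
#define BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS     ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Stand-in for cpumask_t: 512 bytes that should not live on the stack. */
struct cpumask_sketch {
	unsigned long bits[MASK_LONGS];
};

/* Counts masks handed out and not yet returned; handy for leak checks. */
static int masks_outstanding;

/* Hypothetical "get": returns a zeroed mask off-stack, or NULL on failure. */
static struct cpumask_sketch *cpumask_get_sketch(void)
{
	struct cpumask_sketch *m = calloc(1, sizeof(*m));

	if (m)
		masks_outstanding++;
	return m;
}

/* Hypothetical "put": gives the mask back; NULL is accepted and ignored. */
static void cpumask_put_sketch(struct cpumask_sketch *m)
{
	if (!m)
		return;
	masks_outstanding--;
	free(m);
}

/*
 * Example user: needs a scratch mask, but its stack frame only holds
 * one pointer. Returns 1 if the bit it set reads back as set, or -1
 * if no mask could be obtained.
 */
static int example_user(unsigned int cpu)
{
	struct cpumask_sketch *tmp = cpumask_get_sketch();
	int was_set;

	if (!tmp)
		return -1;

	assert(cpu < NR_CPUS);
	tmp->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
	was_set = (int)((tmp->bits[cpu / BITS_PER_LONG] >>
			 (cpu % BITS_PER_LONG)) & 1UL);

	cpumask_put_sketch(tmp);
	return was_set;
}
```

One way to get the build-level enforcement would be to expose only an 
incomplete struct declaration to generic code: C refuses to define a 
variable of an incomplete type, so an on-stack mask would then fail to 
compile while pointers to it keep working.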

	Ingo