linux-kernel - Re: pthread_create() slow for many threads; also time to revisit 64b context switch optimization?

Message-ID: <20080815124350.GA26594@elte.hu>
Date:	Fri, 15 Aug 2008 14:43:50 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Ulrich Drepper <drepper@...hat.com>,
	Arjan van de Ven <arjan@...radead.org>,
	akpm@...ux-foundation.org, hugh@...itas.com, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, briangrant@...gle.com,
	cgd@...gle.com, mbligh@...gle.com,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: pthread_create() slow for many threads; also time to revisit
	64b context switch optimization?

* Andi Kleen <andi@...stfloor.org> wrote:

> Ingo Molnar <mingo@...e.hu> writes:
> >
> > i find it pretty unacceptable these days that we limit any aspect of 
> > pure 64-bit apps in any way to 4GB (or any other 32-bit-ish limit). 
> 
> It's not limited to 2GB, there's a fallback to >4GB of course. Ok 
> admittedly the fallback is slow, but it's there.

Of course - what you are missing is that _10 milliseconds_ of thread
creation overhead is completely unacceptable: it is as bad as if we
didn't support it at all.
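
For readers who want to check the per-thread creation cost on their own machine, here is a minimal sketch, not the benchmark used in this thread. The names `wait_on_gate` and `avg_create_seconds` are hypothetical. It keeps all threads alive while timing each pthread_create() call, since the slowdown under discussion only appears once many thread stacks exist at the same time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Threads block on this mutex so they all stay alive during the timing loop. */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

static void *wait_on_gate(void *arg)
{
    pthread_mutex_lock(&gate);
    pthread_mutex_unlock(&gate);
    return arg;
}

/* Hypothetical helper: average seconds spent in pthread_create() over up to
 * n creations, with all created threads alive at once. Returns -1.0 if no
 * thread could be created. */
static double avg_create_seconds(int n)
{
    struct timespec t0, t1;
    double total = 0.0;
    int created = 0;

    assert(n > 0);
    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    if (tids == NULL)
        return -1.0;

    pthread_mutex_lock(&gate);          /* hold new threads at the gate */
    for (int i = 0; i < n; i++) {
        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (pthread_create(&tids[i], NULL, wait_on_gate, NULL) != 0)
            break;                      /* stop at the first failure */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        total += (double)(t1.tv_sec - t0.tv_sec)
               + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
        created++;
    }
    pthread_mutex_unlock(&gate);        /* release and reap the threads */

    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);
    free(tids);

    return created > 0 ? total / created : -1.0;
}
```

Calling `avg_create_seconds()` with a thread count large enough to exhaust the low address range should, on an affected setup, show the average climbing sharply. The count needed depends on the stack size and is not stated here.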

> I would prefer to not slow down the P4s. There are **lots** of them in 
> field. And they ran 64bit still quite well. [...]

Nonsense, i had such a P4 based 64-bit box and it was painful. Everyone
with half a brain used them as 32-bit machines. Nor is the
context-switch overhead in any way significant. Plus, as Arjan mentioned,
only the earliest P4 64-bit CPUs had this problem.

> [...] Also back then I benchmarked on early K8 and it also made a 
> difference there (but I admit I forgot the numbers)

that's a lot of handwaving with no actual numbers. The numbers in this
discussion show that the context-switch overhead is small and that the
overhead on perfectly good systems that hit this limit is obscenely
high.

I'd love to zap MAP_32BIT this very minute from the kernel, but you
originally shaped the whole thing in such a stupid way that its
elimination is impossible now due to ABI constraints. It would have cost
you _nothing_ to have added MAP_64BIT_STACK back then, but the quick &
sloppy solution was to reuse MAP_32BIT for 64-bit tasks. And you are
stupid about it even now. Bleh.
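
To make the pattern under discussion concrete: a 64-bit task asks for its thread stack in the low address range via MAP_32BIT and falls back to an unrestricted mapping when that fails. The following is only a sketch under that assumption, not glibc's actual allocation code. `alloc_stack_low_first` is a hypothetical name. MAP_32BIT exists only on x86-64 Linux, hence the #ifdef.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not glibc code: allocate a thread stack, preferring
 * the low address range where MAP_32BIT is available. Returns NULL if both
 * attempts fail. */
static void *alloc_stack_low_first(size_t size)
{
    const int prot = PROT_READ | PROT_WRITE;
    const int base = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p = MAP_FAILED;

    assert(size > 0);

#ifdef MAP_32BIT
    /* First attempt: ask the kernel for a mapping in the low 2GB. */
    p = mmap(NULL, size, prot, base | MAP_32BIT, -1, 0);
#endif
    if (p == MAP_FAILED) {
        /* Fallback: let the kernel place the mapping anywhere. */
        p = mmap(NULL, size, prot, base, -1, 0);
    }
    return p == MAP_FAILED ? NULL : p;
}
```

The cost being argued about sits in the first mmap() call: once the low range is full, every new stack pays for a failed search there before the fallback succeeds.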

The correct solution is to eliminate this flag from glibc right now, and
maybe add the MAP_64BIT_STACK flag as well, as i posted - if anyone
with such old boxes still cares (i doubt anyone does). That flag will
then take its usual slow route. Ulrich?

	Ingo