linux-kernel - Re: Kernel Development & Objective-C

Date:	Mon, 3 Dec 2007 12:50:10 +0100
From:	Andi Kleen <andi@...stfloor.org>
To:	Avi Kivity <avi@...o.co.il>
Cc:	Andi Kleen <andi@...stfloor.org>,
	Kyle Moffett <mrmacman_g4@....com>,
	Lennart Sorensen <lsorense@...lub.uwaterloo.ca>,
	Ben Crowhurst <Ben.Crowhurst@...llatravel.co.uk>,
	linux-kernel@...r.kernel.org
Subject: Re: Kernel Development & Objective-C

On Mon, Dec 03, 2007 at 01:46:45PM +0200, Avi Kivity wrote:
> If you have 10M packets/sec no amount of cycle-saving will help you.  
> You need high level optimizations like TSO.  I'm not saying we should 
> sacrifice cycles like there's no tomorrow, but the big wins are elsewhere.

Both high and low level optimizations are needed for good performance.

> >Similar with highend routing or in some latency sensitive network
> >applications (e.g. in HPC). 
> 
> True.  And here, the hardware can cut hundreds of cycles by avoiding the 
> kernel completely for the fast path.

A lot of applications don't bypass the kernel, and the user space networking
schemes tend to have their own drawbacks anyway.

> >Another simple noticeable case is Unix
> >sockets and your X server communication.
> 
> Your reflexes are *much* better than mine if you can measure half a 
> nanosecond on X.

That's not about mouse/keyboard input, but about all X protocol communication
between X clients and the X server. The key here is not large copies
(large data is put into shm anyway) but latency.
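
To make the latency point concrete, here is a rough sketch (not taken from any
X code) that times small-message round trips over a Unix socket pair between
two processes. The function name, the 32-byte payload and the round count are
assumptions picked for illustration; it needs a system with fork(), and the
absolute numbers will be dominated by interpreter overhead, so treat it only as
a way to see that the cost is per message rather than per byte.

```python
import os
import socket
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from a stream socket (recv may return less)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the socket early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def unix_socket_round_trip_ns(rounds=10000, payload=b"x" * 32):
    """Return the mean round-trip time in nanoseconds for small messages."""
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pid = os.fork()
    if pid == 0:
        # Child plays the "server": echo each small request straight back.
        client.close()
        try:
            for _ in range(rounds):
                server.sendall(recv_exact(server, len(payload)))
        finally:
            os._exit(0)

    # Parent plays the "client": send a request, wait for the reply.
    server.close()
    start = time.perf_counter_ns()
    for _ in range(rounds):
        client.sendall(payload)
        recv_exact(client, len(payload))
    elapsed = time.perf_counter_ns() - start
    client.close()
    os.waitpid(pid, 0)
    return elapsed / rounds


if __name__ == "__main__":
    print("mean round trip: %.0f ns" % unix_socket_round_trip_ns())
```

Changing the payload from 32 bytes to a few hundred bytes should move the
result very little, which is the sense in which latency, not copy size, is
what matters here.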

> And again the key is batching, improving cpu affinity, and caching, not 
> looking for a faster instruction sequence.

That's not the whole story, no. Batching etc. are needed, but the
faster instruction sequences are needed too.

> Nanooptimizations are fun (I do them myself, I admit) but that's not 
> where performance as measured by the end user lies.

It depends. Often high level (and then caching) optimizations are better
bang for the buck, but completely disregarding fast path work is a bad
thing too. As an example, see Christoph's recent work on the slub fastpath,
which makes a quite measurable difference on benchmarks.

-Andi
