Date:	Mon, 3 Dec 2007 12:50:10 +0100
From:	Andi Kleen <andi@...stfloor.org>
To:	Avi Kivity <avi@...o.co.il>
Cc:	Andi Kleen <andi@...stfloor.org>,
	Kyle Moffett <mrmacman_g4@....com>,
	Lennart Sorensen <lsorense@...lub.uwaterloo.ca>,
	Ben Crowhurst <Ben.Crowhurst@...llatravel.co.uk>,
	linux-kernel@...r.kernel.org
Subject: Re: Kernel Development & Objective-C

On Mon, Dec 03, 2007 at 01:46:45PM +0200, Avi Kivity wrote:
> If you have 10M packets/sec no amount of cycle-saving will help you.  
> You need high level optimizations like TSO.  I'm not saying we should 
> sacrifice cycles like there's no tomorrow, but the big wins are elsewhere.

Both high and low level optimizations are needed for good performance.

> >Similar with high-end routing or in some latency-sensitive network
> >applications (e.g. in HPC).
> 
> True.  And here, the hardware can cut hundreds of cycles by avoiding the 
> kernel completely for the fast path.

A lot of applications don't do that, and the user space networking schemes
tend to have their own drawbacks anyway.

> >Another simple noticeable case is Unix
> >sockets and your X server communication.
> 
> Your reflexes are *much* better than mine if you can measure half a 
> nanosecond on X.

That's not about mouse/keyboard input, but about all X protocol communication
between X clients and the X server. The key here is not large copies
anyway (large data is put into shm) but latency.

> And again the key is batching, improving cpu affinity, and caching, not 
> looking for a faster instruction sequence.

That's not the whole story, no. Batching etc. is needed, but the
faster instruction sequences are needed too.

> Nanooptimizations are fun (I do them myself, I admit) but that's not 
> where performance as measured by the end user lies.

It depends. Often high level (and then caching) optimizations are better
bang for the buck, but completely disregarding the fast path work is a bad
thing too. As an example, see Christoph's recent work on the slub fastpath,
which makes quite a measurable difference on benchmarks.


-Andi

