linux-kernel - Re: [PATCH] [patch 4a/4] ipc: sem optimise simple operations


Message-ID: <20090816045316.GA11115@wotan.suse.de>
Date:	Sun, 16 Aug 2009 06:53:16 +0200
From:	Nick Piggin <npiggin@...e.de>
To:	Manfred Spraul <manfred@...orfullife.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Nadia Derbey <Nadia.Derbey@...l.net>,
	Pierre Peiffer <peifferp@...il.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] [patch 4a/4] ipc: sem optimise simple operations

On Sat, Aug 15, 2009 at 06:32:14PM +0200, Manfred Spraul wrote:
> On 08/15/2009 04:49 PM, Nick Piggin wrote:
> >
> >I don't see how you've argued that yours is better.
> >
> >   

OK, but I'll add some context too.

> Lower number of new code lines,

Downside is you further complicate the already complex path.

> Lower total code size increase.

Downside is simple ops run in the complex path too, so icache
footprint could be higher.

> Lower number of separate codepaths.

But independently they are all simpler. Having several of them
doesn't add their complexity together.

> Lower runtime memory consumption.

I'll fix this with hlists.

> Two separate patches for the two algorithm improvements.
> 
> The main advantage of your version is that you optimize more cases.

And it is simpler.

> >If you are worried about memory consumption, we can add _rcu variants
> >to hlists and use them.
> There is no need for _rcu, the whole code runs under a spinlock.

Ah, yeah, I think I was thinking of the undo list. Great, then that
will be easy.

> Thus the wait_for_zero queue could be converted to a hlist immediately.

Both queues can be.
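
For reference, a minimal user-space sketch of why an hlist helps here, assuming the extra memory being discussed is the per-semaphore queue heads. The struct layouts mirror the kernel's list_head, hlist_head and hlist_node; head_bytes_saved_per_queue() is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Doubly linked list: the head costs two pointers. */
struct list_head  { struct list_head *next, *prev; };

/* hlist: the head is a single pointer, nodes still cost two. */
struct hlist_head { struct hlist_node *first; };
struct hlist_node { struct hlist_node *next, **pprev; };

/* Insert n at the front of h. */
static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Remove n from whatever hlist it is on; no head pointer needed. */
static inline void hlist_del(struct hlist_node *n)
{
    assert(n->pprev != NULL);   /* must currently be on a list */
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}

/* Hypothetical helper: bytes saved per queue head by switching types. */
size_t head_bytes_saved_per_queue(void)
{
    return sizeof(struct list_head) - sizeof(struct hlist_head);
}
```

The saving is one pointer per queue head. The trade-off is that an hlist head has no tail pointer, so appending at the end is not O(1) the way list_add_tail() is.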

> 
> Hmm: Did you track my proposals for your version?
> 
> - exit_sem() is not a hot path.
> I would propose to treat every exit_sem as update_queue, not an 
> update_queue_simple for every individual UNDO.

I don't think it matters too much, but ok. 

> - create an unlink_queue() helper that contains the updates to q->lists 
> and sma->complex_count.
> Three copies ask for errors.

Yes this is a good idea.
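
A rough user-space sketch of what such a helper could look like. The struct and field names (sem_array_sketch, sem_queue_sketch, pending, simple_pending, simple_list, nsops) are stand-ins assumed for illustration, not the layout from the patch, and assert() stands in for BUG_ON.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static inline void list_init(struct list_head *h) { h->next = h->prev = h; }
static inline int list_empty(const struct list_head *h) { return h->next == h; }

static inline void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static inline void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;
}

/* Hypothetical stand-ins for the semaphore array and a queued operation. */
struct sem_array_sketch {
    struct list_head pending;        /* every queued operation */
    struct list_head simple_pending; /* single-sembuf operations only */
    int complex_count;               /* queued multi-sembuf operations */
};

struct sem_queue_sketch {
    struct list_head list;
    struct list_head simple_list;
    int nsops;
};

void sma_init(struct sem_array_sketch *sma)
{
    list_init(&sma->pending);
    list_init(&sma->simple_pending);
    sma->complex_count = 0;
}

void queue_op(struct sem_array_sketch *sma, struct sem_queue_sketch *q)
{
    list_add_tail(&q->list, &sma->pending);
    if (q->nsops == 1)
        list_add_tail(&q->simple_list, &sma->simple_pending);
    else
        sma->complex_count++;
}

/* The one place that undoes queue_op(): list links and the counter together. */
void unlink_queue(struct sem_array_sketch *sma, struct sem_queue_sketch *q)
{
    list_del(&q->list);
    if (q->nsops == 1) {
        list_del(&q->simple_list);
    } else {
        assert(sma->complex_count > 0);
        sma->complex_count--;
    }
}
```

With a single helper, every removal site keeps the list membership and complex_count in step, instead of three open-coded copies that can drift apart.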

> - now: use a hlist for the zero queue.
> 
> >  And if you are worried about text size, then
> >I would bet my version actually uses less icache in the case of
> >simple ops being used.
> >   
> It depends. After disabling inlining, including all helper functions 
> that differ:
> 
> My proposal: 301 bytes for update_queue.
> 
> "simple", only negv: 226 bytes
> "simple", negv+zero: 354 bytes
> simple+complex: 526 bytes.
> 
> Thus with only +-1 simple ops, your version uses less icache. If both 
> +-1 and 0 ops are used, your version uses more icache.

I'll get rid of some of the BUG_ONs too, they're mostly there just
to verify correctness when I was developing it.

> Could you please send me your benchmark app?

Yeah, I'll dig it out. It literally was just lock, spin, unlock with
lots of processes.
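
Until the real one turns up, here is a minimal sketch of a test with that shape, not the original app. It assumes one SysV semaphore used as a mutex via simple +-1 ops, with a shared counter added only so the result can be checked; run_bench(), sem_change() and the spin length of 100 are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Linux requires the caller to define the semctl() argument union. */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* One simple (single-sembuf) operation: -1 locks, +1 unlocks. */
static inline int sem_change(int semid, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };
    return semop(semid, &op, 1);
}

/* Fork nproc children that each do lock, spin, unlock iters times.
 * Returns the shared counter (nproc * iters on success) or -1 on error. */
long run_bench(int nproc, int iters)
{
    long result = -1;
    int started = 0, failed = 0;
    union semun_arg arg = { .val = 1 };

    assert(nproc > 0 && iters > 0);

    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;

    long *counter = mmap(NULL, sizeof(*counter), PROT_READ | PROT_WRITE,
                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    *counter = 0;

    if (semctl(semid, 0, SETVAL, arg) < 0)
        failed = 1;

    for (int i = 0; !failed && i < nproc; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            for (int j = 0; j < iters; j++) {
                if (sem_change(semid, -1) < 0)          /* lock */
                    _exit(1);
                (*counter)++;
                for (volatile int spin = 0; spin < 100; spin++)
                    ;                                   /* spin */
                if (sem_change(semid, 1) < 0)           /* unlock */
                    _exit(1);
            }
            _exit(0);
        }
        started++;
    }

    /* Reap every child that was started and note any failure. */
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }

    if (!failed)
        result = *counter;
    munmap(counter, sizeof(*counter));
    semctl(semid, 0, IPC_RMID);
    return result;
}
```

Timing a call such as run_bench(64, 100000) with clock_gettime() before and after gives a throughput figure; the counter check only confirms that the lock actually excluded the other processes.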
