linux-kernel - Re: [PATCH] [patch 4a/4] ipc: sem optimise simple operations

Message-ID: <4A86E30E.8030208@colorfullife.com>
Date:	Sat, 15 Aug 2009 18:32:14 +0200
From:	Manfred Spraul <manfred@...orfullife.com>
To:	Nick Piggin <npiggin@...e.de>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	Nadia Derbey <Nadia.Derbey@...l.net>,
	Pierre Peiffer <peifferp@...il.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] [patch 4a/4] ipc: sem optimise simple operations

On 08/15/2009 04:49 PM, Nick Piggin wrote:
>
> I don't see how you've argued that yours is better.
>
>    
Lower number of new code lines.
Lower total code size increase.
Lower number of separate code paths.
Lower runtime memory consumption.
Two separate patches for the two algorithm improvements.

The main advantage of your version is that you optimize more cases.
> If you are worried about memory consumption, we can add _rcu variants
> to hlists and use them.
There is no need for _rcu, the whole code runs under a spinlock.
Thus the wait_for_zero queue could be converted to a hlist immediately.
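
A minimal user-space sketch of why a hlist saves memory, assuming simplified
stand-ins for the kernel list types (all names with a _sketch suffix are
hypothetical, not the kernel definitions): a doubly linked list head holds two
pointers, a hlist head holds one. No locking is shown, since the real code
would run under the existing spinlock.

```c
#include <stddef.h>

/* Hypothetical stand-in for a doubly linked list head: two pointers. */
struct list_head_sketch {
    struct list_head_sketch *next, *prev;
};

/* Hypothetical stand-in for a hlist: the head is a single pointer. */
struct hlist_node_sketch {
    struct hlist_node_sketch *next;
    struct hlist_node_sketch **pprev;   /* address of the pointer to us */
};

struct hlist_head_sketch {
    struct hlist_node_sketch *first;
};

/* Insert n at the front of the hlist h. */
static void hlist_add_head_sketch(struct hlist_node_sketch *n,
                                  struct hlist_head_sketch *h)
{
    struct hlist_node_sketch *first = h->first;

    n->next = first;
    if (first)
        first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Remove n from whatever hlist it is on; no head pointer is needed. */
static void hlist_del_sketch(struct hlist_node_sketch *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}
```

With one such head per semaphore, the saving is one pointer per semaphore
compared with a doubly linked list head.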

Hmm: Did you track my proposals for your version?

- exit_sem() is not a hot path.
I would propose to treat every exit_sem() as an update_queue(), not as an
update_queue_simple() for every individual UNDO.

- create an unlink_queue() helper that contains the updates to q->lists 
and sma->complex_count.
Three copies of that code invite errors.

- now: use a hlist for the zero queue.
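
A rough user-space sketch of what such an unlink_queue() helper could look
like. The structures are simplified, hypothetical stand-ins (suffix _sketch),
and the rule that an operation counts as complex when nsops > 1 is an
assumption made for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct list_node_sketch {
    struct list_node_sketch *prev, *next;
};

struct sem_array_sketch {
    struct list_node_sketch pending;  /* head of the pending list */
    int complex_count;                /* queued complex operations */
};

struct sem_queue_sketch {
    struct list_node_sketch list;     /* linkage into sma->pending */
    int nsops;                        /* operations in this request */
};

static void list_init_sketch(struct list_node_sketch *head)
{
    head->prev = head->next = head;
}

static void list_add_tail_sketch(struct list_node_sketch *node,
                                 struct list_node_sketch *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Queue q on sma and account for it (assumption: nsops > 1 is complex). */
static void queue_op_sketch(struct sem_array_sketch *sma,
                            struct sem_queue_sketch *q)
{
    list_add_tail_sketch(&q->list, &sma->pending);
    if (q->nsops > 1)
        sma->complex_count++;
}

/*
 * The single place that undoes queue_op_sketch(): list removal and the
 * complex_count update stay together, so callers cannot forget one.
 */
static void unlink_queue(struct sem_array_sketch *sma,
                         struct sem_queue_sketch *q)
{
    assert(sma != NULL && q != NULL);
    q->list.prev->next = q->list.next;
    q->list.next->prev = q->list.prev;
    q->list.prev = q->list.next = &q->list;
    if (q->nsops > 1)
        sma->complex_count--;
}
```

Every path that removes a queue entry would then call unlink_queue() instead
of open-coding the list removal and the counter update.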

>   And if you are worried about text size, then
> I would bet my version actually uses less icache in the case of
> simple ops being used.
>    
It depends. After disabling inlining, including all helper functions 
that differ:

My proposal: 301 bytes for update_queue.

"simple", only negv: 226 bytes
"simple", negv+zero: 354 bytes
simple+complex: 526 bytes

Thus with only +-1 simple ops, your version uses less icache. If both 
+-1 and 0 ops are used, your version uses more icache.
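
The comparison can be checked with a few lines of C. The byte counts are the
ones quoted above; the table and function names are hypothetical and only
serve the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* update_queue size of the single-path proposal, in bytes (from above). */
enum { PROPOSAL_BYTES = 301 };

struct variant_sketch {
    const char *name;
    int bytes;
};

/* Sizes of the code paths measured for the other version (from above). */
static const struct variant_sketch variants[] = {
    { "simple, only negv", 226 },
    { "simple, negv+zero", 354 },
    { "simple+complex",    526 },
};

#define NVARIANTS (sizeof(variants) / sizeof(variants[0]))

/* Negative: the variant touches less icache than the proposal. */
static int icache_delta(size_t i)
{
    assert(i < NVARIANTS);
    return variants[i].bytes - PROPOSAL_BYTES;
}
```

Here icache_delta(0) is -75 bytes, icache_delta(1) is +53 bytes and
icache_delta(2) is +225 bytes relative to the 301-byte update_queue.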

Could you please send me your benchmark app?

--
     Manfred