Date:	Wed, 08 Feb 2012 05:30:35 +0400
From:	Konstantin Khlebnikov <khlebnikov@...nvz.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Hugh Dickins <hughd@...gle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/4] radix-tree: iterating general cleanup

Linus Torvalds wrote:
> On Mon, Feb 6, 2012 at 11:54 PM, Konstantin Khlebnikov
> <khlebnikov@...nvz.org>  wrote:
>> This patchset implements common radix-tree iteration routine and
>> reworks page-cache lookup functions with using it.
>
> So what's the advantage? Both the line counts and the bloat-o-meter
> seems to imply this is all just bad.

If you don't count comments, the line-count change is actually negative.
And if we drop the (almost) unused radix_tree_gang_lookup_tag_slot() and
radix_tree_gang_lookup_slot(), the total bloat-o-meter score becomes negative too.
There is also some shrinkable stuff in the shmem code.
So if this is really a problem, it is fixable; I just didn't want to bloat the patchset.

>
> I assume there is some upside to it, but you really don't make that
> obvious, so why should anybody ever waste even a second of time
> looking at this?

Hmm, from my point of view, this unified iterator code looks cleaner than
the current duplicated radix-tree code. That's why I titled it "cleanup".
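To illustrate the idea (in user space, with invented names like toy_tree and toy_iter — this is not the kernel API, just a sketch of the shape): one shared stepping primitive replaces the separate walking loops that the plain and tagged gang lookups each duplicated before.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 1024
#define BITS_PER_LONG (8 * (int)sizeof(unsigned long))

/* Toy "tree": a flat slot array plus one tag bitmap.  The real
 * radix tree is multi-level; this only models the iteration. */
struct toy_tree {
	void *slots[SLOTS];
	unsigned long tags[SLOTS / BITS_PER_LONG];
};

struct toy_iter {
	unsigned long index;
};

/* One shared stepping primitive: advance to the next populated slot
 * at or after iter->index, optionally requiring the tag to be set. */
static void **toy_next(struct toy_tree *t, struct toy_iter *iter, int tagged)
{
	unsigned long i;

	for (i = iter->index; i < SLOTS; i++) {
		int tag = (t->tags[i / BITS_PER_LONG] >>
			   (i % BITS_PER_LONG)) & 1;
		if (t->slots[i] && (!tagged || tag)) {
			iter->index = i + 1;
			return &t->slots[i];
		}
	}
	return NULL;
}

/* Both gang lookups become thin wrappers over the same iterator. */
static unsigned toy_gang_lookup(struct toy_tree *t, int tagged)
{
	struct toy_iter iter = { 0 };
	unsigned n = 0;

	while (toy_next(t, &iter, tagged))
		n++;
	return n;
}
```

The point is only that the tag check is a parameter of one iterator rather than a second copy of the whole loop.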

There are also some simple bit hacks: find-next-bit instead of dumb loops in the tagged lookup.
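As a user-space illustration of that bit hack (using the GCC/Clang builtin __builtin_ctzl; this is not the kernel's find_next_bit itself): instead of testing every slot's tag bit in turn, mask off the bits below the start position and count trailing zeros to jump straight to the next set bit.

```c
#include <assert.h>

#define BITS_PER_LONG (8 * (int)sizeof(unsigned long))

/* Dumb loop: test each bit in turn. */
static unsigned long next_tag_dumb(const unsigned long *map,
				   unsigned long start, unsigned long size)
{
	unsigned long i;

	for (i = start; i < size; i++)
		if (map[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
			return i;
	return size;
}

/* find-next-bit style: mask off bits below start, then count
 * trailing zeros to land directly on the next set bit. */
static unsigned long next_tag_fast(const unsigned long *map,
				   unsigned long start, unsigned long size)
{
	unsigned long nwords = (size + BITS_PER_LONG - 1) / BITS_PER_LONG;
	unsigned long word = start / BITS_PER_LONG;
	unsigned long cur = map[word] & (~0UL << (start % BITS_PER_LONG));
	unsigned long bit;

	while (!cur) {
		if (++word >= nwords)
			return size;
		cur = map[word];
	}
	bit = word * BITS_PER_LONG + __builtin_ctzl(cur);
	return bit < size ? bit : size;
}
```

For a sparse bitmap the fast version skips whole empty words at a time, which is why the win below grows with the step.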

Here are some benchmark results: a radix tree with 1024 slots, where I fill and tag every <step>-th slot,
then repeatedly look up all slots with radix_tree_gang_lookup() and radix_tree_gang_lookup_tag() in a loop.
The old/new rows give nsec per iteration over the whole tree.

tagged-lookup
step	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
old	7035	5248	4742	4308	4217	4133	4030	3920	4038	3933	3914	3796	3851	3755	3819	3582
new	3578	2617	1899	1426	1220	1058	936	822	845	749	695	679	648	575	591	509

So the new tagged lookup is always faster, especially on sparse trees.

normal-lookup
step	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
old	4156	2641	2068	1837	1630	1531	1467	1415	1373	1398	1333	1323	1308	1280	1236	1249
new	3048	2520	2429	2280	2266	2296	2215	2219	2188	2170	2218	2332	2166	2096	2100	2058

The new normal lookup is faster on dense trees but slower on sparse ones.
It looks like this is caused by aliasing of struct radix_tree_iter:
gcc cannot effectively use its field as the loop counter in the nested loop.
But how important is this? Anyway, I'll think about how to fix this misfortune.


>
>                    Linus

