Date:	Wed, 14 Aug 2013 20:22:51 -0400
From:	Tejun Heo <tj@...nel.org>
To:	Kent Overstreet <kmo@...erainc.com>
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	Stephen Rothwell <sfr@...b.auug.org.au>,
	Fengguang Wu <fengguang.wu@...el.com>
Subject: Re: [PATCH] idr: Document ida tree sections

Hello, Kent.

On Wed, Aug 14, 2013 at 05:04:27PM -0700, Kent Overstreet wrote:
> I was just telling you how I felt :) Regardless of that, IMO what I've
> got now is superior to any radix tree based approach for what ida/idr
> are supposed to do. I could of course be wrong, but I'm not convinced...

It's just very difficult to tell either way.  You say it's better but
the benefit-to-weirdity ratio doesn't seem too apparent.  The only
thing the proposed solution saves is a few pointer dereferences in
extreme corner cases, at the cost of making a low level library use
high-order or vmalloc allocations.
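
To put some rough numbers on that trade-off (the structs below are
purely illustrative, not what either implementation actually looks
like): a flat bitmap covering 1M IDs needs 1M / 8 = 128KB of
contiguous memory, i.e. an order-5 allocation or a vmalloc, while
splitting it into page-sized leaves keeps every allocation at order 0,
at the cost of one extra pointer dereference on the allocation path.

/* Illustrative only -- not the actual idr/ida data structures. */
#define LEAF_BITS	(PAGE_SIZE * 8)		/* 32768 IDs per 4k leaf */

struct id_leaf {
	unsigned long bits[PAGE_SIZE / sizeof(unsigned long)];
};

struct id_root {
	/* 32 leaves * LEAF_BITS = 1M IDs, all from order-0 pages. */
	struct id_leaf *leaves[32];
};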

Weirdity aside, the unusualness even makes evaluating the overhead
muddier.  e.g. vmalloc space is expensive not only in terms of address
space real estate but also in terms of runtime performance, because
each vmalloc'd page adds TLB pressure in most configurations, where
the kernel linear address space is mapped with gigantic mappings.
The net effect could be small and won't easily show up in
microbenchmarks, as they usually don't push TLB pressure, but then
again the performance benefit of the proposed implementation is
likely to be extremely minor too.

For a piece of code to be unusual, it should bring clear accompanying
benefits, which doesn't seem to be the case here.  It's different, and
maybe better in some extreme benchmarks specifically designed for it,
but that seems to be about it.

> Re: caching the last allocation with a radix tree based implementation.
> I thought about that more last night, I don't think that would be viable
> for using ida underneath the percpu ida allocator.
>
> Reason being percpu ida has to heavily optimize for the case where
> almost all of the id space is allocated, and after awhile the id space
> is going to be fragmented - so caching the last allocated id is going to
> be useless.

A 4k page has 32k bits.  It can serve up quite a few IDs even with
internal indexing.  Most cases will be fine with a single page, and a
single layer would cover most of what's left.  How is that gonna be
very different from the proposed implementation?  If you're worried
about a huge ID space being shared by a lot of CPUs, you can use
per-cpu hints, which will be faster than the proposed solution anyway.
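
To make the per-cpu hint idea a bit more concrete, something along
these lines would do (a rough, untested sketch on top of the existing
ida_simple_get() interface -- the per-cpu variable and the wrapper are
made up for the example; needs <linux/idr.h> and <linux/percpu.h>):

static DEFINE_PER_CPU(unsigned int, id_hint);

static int hinted_id_get(struct ida *ida, gfp_t gfp)
{
	unsigned int start = this_cpu_read(id_hint);
	int id;

	/* Search upward from this CPU's hint first, wrap to 0 on failure. */
	id = ida_simple_get(ida, start, 0, gfp);
	if (id < 0 && start)
		id = ida_simple_get(ida, 0, start, gfp);
	if (id >= 0)
		this_cpu_write(id_hint, id);	/* remember where we ended up */
	return id;
}

Each CPU then keeps walking its own region of the ID space, so most
allocations don't fight over the same leading bits of the bitmap.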

> This is also why I implemented the ganged allocation bit, to amortize
> the bitmap tree traversal. So we'd lose out on that going back to a
> radix tree, or have to reimplement it (and it'll be slower due to
> pointer chasing).
> 
> Which is all not the end of the world, but it means that if we punt on
> the ida/idr rewrites for now or change our minds about them - we _do_
> have quite a bit of stuff waiting on the percpu ida allocator, so for
> that to go in separate I'll have to change it back to using a stack of
> integers for the global freelist - which does use significantly more
> memory.

Yes, I'd like to see a better, percpu-aware ida too, and there are
things which can benefit from it, but we still need to get ida right,
and I don't think that's a very difficult thing to do at this point.

Thanks.

-- 
tejun