linux-kernel - Re: [PATCH] idr: Use this_cpu_ptr() for percpu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Wed, 21 Aug 2013 07:59:41 -0400
From:	Tejun Heo <tj@...nel.org>
To:	Kent Overstreet <kmo@...erainc.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>,
	Christoph Lameter <cl@...two.org>,
	linux-kernel@...r.kernel.org, Oleg Nesterov <oleg@...hat.com>,
	Ingo Molnar <mingo@...hat.com>,
	Andi Kleen <andi@...stfloor.org>, Jens Axboe <axboe@...nel.dk>
Subject: Re: [PATCH] idr: Use this_cpu_ptr() for percpu_ida

Hello, Kent.

On Tue, Aug 20, 2013 at 07:31:51PM -0700, Kent Overstreet wrote:
> All this for a performance improvement of 10x to 50x (or more), for the
> ida sizes I measured.

That's misleading, isn't it?  We should see large performance
improvements even without the large pages.  What matters more is the
leaf node performance for vast majority of cases and an extra radix
tree layer on top would cover most of whatever is left.  Whether to
use high order pages or not only affects the extreme niche use cases
and I don't think going for high order pages to micro optimize those
extreme use cases is the right trade off.

> So I could see your point if we were allocating gobs of vmalloc memory,
> or high order allocations big enough to realistically be problematic (I
> honestly don't think these will be) - but to me, this seems like a
> pretty reasonable tradeoff for those performance gains.

The trade off is made up as the bulk of the performance benefit can be
gained without resorting to high order allocations.

> (And the performance gains do substantially come from using more
> contiguous memory and treating the whole data structure as an array, and
> doing less pointer chasing/looping)

I really have hard time buying that.  Let's say you go with single
page leaf node and an extra single page layer on top.  How many IDs
are we talking about?  For the cases which are most performance
sensitive, this doesn't even matter a bit as percpu caching layer
would be on top anyway.  I really don't think the micro optimization
is called for at the cost of high order allocations from low level
tool library.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/