lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Wed, 21 Aug 2013 14:09:01 -0700 From: Kent Overstreet <kmo@...erainc.com> To: Tejun Heo <tj@...nel.org> Cc: Andrew Morton <akpm@...ux-foundation.org>, "Nicholas A. Bellinger" <nab@...ux-iscsi.org>, Christoph Lameter <cl@...two.org>, linux-kernel@...r.kernel.org, Oleg Nesterov <oleg@...hat.com>, Ingo Molnar <mingo@...hat.com>, Andi Kleen <andi@...stfloor.org>, Jens Axboe <axboe@...nel.dk> Subject: Re: [PATCH] idr: Use this_cpu_ptr() for percpu_ida On Wed, Aug 21, 2013 at 07:59:41AM -0400, Tejun Heo wrote: > Hello, Kent. > > On Tue, Aug 20, 2013 at 07:31:51PM -0700, Kent Overstreet wrote: > > All this for a performance improvement of 10x to 50x (or more), for the > > ida sizes I measured. > > That's misleading, isn't it? It's comparing it to the existing version that actually exists, instead of comparing it to a hypothetical approach that doesn't exist yet. I don't see how that's misleading. > We should see large performance > improvements even without the large pages. What matters more is the > leaf node performance for vast majority of cases and an extra radix > tree layer on top would cover most of whatever is left. Whether to > use high order pages or not only affects the extreme niche use cases > and I don't think going for high order pages to micro optimize those > extreme use cases is the right trade off. > > > So I could see your point if we were allocating gobs of vmalloc memory, > > or high order allocations big enough to realistically be problematic (I > > honestly don't think these will be) - but to me, this seems like a > > pretty reasonable tradeoff for those performance gains. > > The trade off is made up as the bulk of the performance benefit can be > gained without resorting to high order allocations. I'm more and more skeptical that that's true, and it's a given that it wouldn't be as fast. > > (And the performance gains do substantially come from using more > > contiguous memory and treating the whole data structure as an array, and > > doing less pointer chasing/looping) > > I really have hard time buying that. Let's say you go with single > page leaf node and an extra single page layer on top. How many IDs > are we talking about? For the cases which are most performance > sensitive, this doesn't even matter a bit as percpu caching layer > would be on top anyway. I really don't think the micro optimization > is called for at the cost of high order allocations from low level > tool library. These "micro optimizations" mean either less pointer chasing or less branching in the _common_ case; you'd trade common case performance for avoiding ever doing higher order allocations (and 2 with COMPACTION=n and 4 with COMPACTION=y is not particularly high order!). I don't buy that that's a good tradeoff. If you're convinced radix trees are the way to go and it can be done without much performance cost, why not code it up and show us? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists