linux-kernel - Re: [PATCH] idr: Use this_cpu_ptr() for percpu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Tue, 20 Aug 2013 19:31:51 -0700
From:	Kent Overstreet <kmo@...erainc.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>,
	Christoph Lameter <cl@...two.org>,
	linux-kernel@...r.kernel.org, Oleg Nesterov <oleg@...hat.com>,
	Ingo Molnar <mingo@...hat.com>,
	Andi Kleen <andi@...stfloor.org>, Jens Axboe <axboe@...nel.dk>
Subject: Re: [PATCH] idr: Use this_cpu_ptr() for percpu_ida

On Tue, Aug 20, 2013 at 10:07:42PM -0400, Tejun Heo wrote:
> Hello, Kent.
> 
> On Tue, Aug 20, 2013 at 07:01:32PM -0700, Kent Overstreet wrote:
> > I think Tejun and I might be at a bit of an impasse with the ida rewrite
> > itself, but I don't think there were any outstanding objections to the
> > percpu ida code itself - and this is a standalone version.
> 
> The percpu ida code can be applied separately from the ida rewrite?

Yes - at the cost of using significantly more memory for the global
freelist

> > I was meaning to ask you Andrew, if you could take a look at the ida
> > discussion and lend your opinion - I don't think there's any _specific_
> > technical objections left to my ida code, and it's now on a more
> > philisophical "complexity vs. ..." level.
> 
> Hmmm... the objection was pretty specific - don't depend on high-order
> or vmalloc allocations when it can be easily avoided by using proper
> radix tree.

We only do vmalloc allocations for CONFIG_COMPACTION=n, and then only
when we need to allocate more than almost 1 _billion_ ids from a single
ida (twice than on 32 bit, so never because that gets us just about to
INT_MAX) - and then it's only 32k of vmalloc memory, for the entire ida.

This is with max allocation order of 4 for COMPACTION=y, 2 for
COMPACTION=n. 

All this for a performance improvement of 10x to 50x (or more), for the
ida sizes I measured.

So I could see your point if we were allocating gobs of vmalloc memory,
or high order allocations big enough to realistically be problematic (I
honestly don't think these will be) - but to me, this seems like a
pretty reasonable tradeoff for those performance gains.

(And the performance gains do substantially come from using more
contiguous memory and treating the whole data structure as an array, and
doing less pointer chasing/looping)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/