linux-kernel - Re: [patch] mm: reduce pagetable-freeing latencies

Date:	Wed, 25 Jul 2007 08:44:10 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc:	Andi Kleen <andi@...stfloor.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
	Hugh Dickins <hugh@...itas.com>
Subject: Re: [patch] mm: reduce pagetable-freeing latencies

On Wed, 2007-07-25 at 07:29 +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2007-07-24 at 14:13 +0200, Andi Kleen wrote:
> > Benjamin Herrenschmidt <benh@...nel.crashing.org> writes:
> > 
> > > > What a truly putrid patch.  I am suspecting that this was a quick
> > > > get-you-out-of-trouble thing, which then got forgotten about.
> > > > 
> > > > We have two months to do the "right fix".  Please?
> > > 
> > > Working on it... 
> > 
> > Ideally the patch would DTRT even on non preemptible kernels,
> > aka do cond_resched()s when needed.
> 
> First is to rework the batch structure to make it more manageable. That
> is, patch #1 will keep the page list in per-cpu (and thus non-preempt),
> but the batch "head" will be on the stack.
> 
> Now, there are two approaches regarding getting rid of the
> get_cpu/put_cpu:
> 
>  - One is to have a small number of entries for the page list in the
> batch structure on the stack, and attempt to gfp' a page for more. If
> that fails, we can still free, though with less batching, using only the
> few entries in the batch struct itself. That's Hugh's initial approach
> iirc.
> 
>  - Another is to hook up with those folks who've been asking for a
> notifier that we are being preempted/scheduled out. In this case, I can
> happily access the per-cpu list, and just trigger a batch flush if we
> happen to be scheduled out.
> 
> I tend to prefer the former solution though, gfp should be fast, and
> there is no need to force a flush if we get scheduled out. It would be
> rare to hit the worst case scenario of falling back to the few page
> heads in the batch itself. On the other hand, that solution has the
> problem of bloating the stack a bit (with the few page pointers) even in
> the case where I plan to use the extended batch outside of zap_*, such
> as fork, mprotect, ....
> 
> So I'll first do patch #1, which will not fix the problem, but will make
> the fix easier to fit in. In the meantime, please provide feedback on
> which of the 2 solutions above you prefer for avoiding the get/put_cpu,
> unless you find a good 3rd one.

I too would prefer the former solution. I think preemption notifiers are
a particularly iffy hack.
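
To make the former solution concrete, here is a rough userspace sketch of
an on-stack batch with a few inline entries and an optional extension
array. It is not kernel code: malloc() stands in for the gfp'd page,
free() stands in for releasing a page after the flush, and every name
(struct batch, batch_add, INLINE_SLOTS, EXT_SLOTS, run_batch, the
try_extend flag) is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define INLINE_SLOTS 8    /* hypothetical: entries embedded in the on-stack batch */
#define EXT_SLOTS    512  /* hypothetical: entries that fit in one extra page */

struct batch {
    void  **slots;        /* points at inline_slots or at the extension */
    size_t  cap;          /* capacity of the current slot array */
    size_t  nr;           /* entries queued so far */
    size_t  flushes;      /* flush count, kept only for illustration */
    void   *inline_slots[INLINE_SLOTS];
};

static void batch_init(struct batch *b)
{
    b->slots = b->inline_slots;
    b->cap = INLINE_SLOTS;
    b->nr = 0;
    b->flushes = 0;
}

/* Release everything queued; real code would flush the TLB first. */
static void batch_flush(struct batch *b)
{
    for (size_t i = 0; i < b->nr; i++)
        free(b->slots[i]);
    b->nr = 0;
    b->flushes++;
}

/* Queue one page. try_extend == 0 simulates the allocation failing. */
static void batch_add(struct batch *b, void *page, int try_extend)
{
    if (b->nr == b->cap && b->slots == b->inline_slots && try_extend) {
        void **ext = malloc(EXT_SLOTS * sizeof(*ext));
        if (ext) {
            /* Carry the inline entries over to the bigger array. */
            for (size_t i = 0; i < b->nr; i++)
                ext[i] = b->inline_slots[i];
            b->slots = ext;
            b->cap = EXT_SLOTS;
        }
    }
    if (b->nr == b->cap)
        batch_flush(b);   /* fallback: less batching, but still correct */
    assert(b->nr < b->cap);
    b->slots[b->nr++] = page;
}

static void batch_finish(struct batch *b)
{
    batch_flush(b);
    if (b->slots != b->inline_slots) {
        free(b->slots);
        b->slots = b->inline_slots;
        b->cap = INLINE_SLOTS;
    }
}

/* Push npages dummy pages through a batch and report how often it flushed. */
static size_t run_batch(size_t npages, int allow_extend)
{
    struct batch b;

    batch_init(&b);
    for (size_t i = 0; i < npages; i++)
        batch_add(&b, malloc(1), allow_extend);
    batch_finish(&b);
    return b.flushes;
}
```

With 20 pages, the extended batch flushes once at the end, while the
inline-only fallback flushes every 8 entries plus once at the end. That
is the "less batching" worst case, and it costs INLINE_SLOTS pointers of
stack in every caller.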

You could perhaps use C99 variable length arrays to avoid the stack
waste when it is not needed, although Andi once told me that they
generate rather dubious code.
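
For reference, the VLA idea would look roughly like the sketch below,
where the caller decides how many extra page pointers the batch gets on
the stack. This is plain userspace C99 with made-up names (MIN_SLOTS,
batch_stack_bytes); it only shows that the stack cost scales with the
requested size, and says nothing about the quality of the generated code.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_SLOTS 4   /* hypothetical minimum number of page pointers */

/*
 * Reserve MIN_SLOTS + extra_slots pointers on the stack and report how
 * many bytes that took. sizeof on a VLA is evaluated at run time.
 */
static size_t batch_stack_bytes(size_t extra_slots)
{
    size_t n = MIN_SLOTS + extra_slots;

    assert(n > 0);          /* a zero-length VLA is undefined behaviour */

    void *slots[n];         /* C99 variable length array */

    for (size_t i = 0; i < n; i++)
        slots[i] = NULL;    /* VLAs cannot have an initializer */

    return sizeof(slots);
}
```

A caller such as zap_* that wants the fallback entries would pass a
non-zero extra_slots, while fork or mprotect could pass 0 and pay only
for the minimum.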
