linux-kernel - Re: Question about x86/mm/gup.c's use of disabled interrupts

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <49C21473.2000702@redhat.com>
Date:	Thu, 19 Mar 2009 11:46:27 +0200
From:	Avi Kivity <avi@...hat.com>
To:	Jeremy Fitzhardinge <jeremy@...p.org>
CC:	Nick Piggin <nickpiggin@...oo.com.au>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linux Memory Management List <linux-mm@...ck.org>,
	Xen-devel <xen-devel@...ts.xensource.com>,
	Jan Beulich <jbeulich@...ell.com>, Ingo Molnar <mingo@...e.hu>
Subject: Re: Question about x86/mm/gup.c's use of disabled interrupts

Jeremy Fitzhardinge wrote:
>>>
>>> Well, no, not deferring.  Making xen_flush_tlb_others() spin waiting 
>>> for "doing_gup" to clear on the target cpu.  Or add an explicit 
>>> notion of a "pte update barrier" rather than implicitly relying on 
>>> the tlb IPI (which is extremely convenient when available...).
>>
>> Pick up a percpu flag from all cpus and spin on each?  Nasty.
>
> Yeah, not great.  Each of those flag fetches is likely to be cold, so 
> a bunch of cache misses.  The only mitigating factor is that cross-cpu 
> tlb flushes are expected to be expensive, but some workloads are 
> apparently very sensitive to extra latency in that path.  

Right, and they'll do a bunch more cache misses, so in comparison it 
isn't too bad.

> And the hypercall could result in no Xen-level IPIs at all, so it 
> could be very quick by comparison to an IPI-based Linux 
> implementation, in which case the flag polling would be particularly 
> harsh.

Maybe we could bring these optimizations into Linux as well.  The only 
thing Xen knows that Linux doesn't is if a vcpu is not scheduled; all 
other information is shared.

>
> Also, the straightforward implementation of "poll until all target 
> cpu's flags are clear" may never make progress, so you'd have to "scan 
> flags, remove busy cpus from set, repeat until all cpus done".
>
> All annoying because this race is pretty unlikely, and it seems a 
> shame to slow down all tlb flushes to deal with it.  Some kind of 
> global "doing gup_fast" counter would get flush_tlb_others bypass the 
> check, at the cost of putting a couple of atomic ops around the 
> outside of gup_fast.

The nice thing about local_irq_disable() is that it scales so well.

>
>> You could use the irq enabled flag; it's available and what native 
>> spins on (but also means I'll need to add one if I implement this).
>
> Yes, but then we'd end up spuriously polling on cpus which happened to 
> disable interrupts for any reason.  And if the vcpu is not running 
> then we could end up polling for a long time.  (Same applies for 
> things in gup_fast, but I'm assuming that's a lot less common than 
> disabling interrupts in general).

Right.

-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/