lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANRm+CxCMNht4JMC4e5qhQVOjj3z2P8ZZbc1fTzX=v_a-HDD4w@mail.gmail.com>
Date:   Tue, 14 Nov 2017 14:10:16 +0800
From:   Wanpeng Li <kernellwp@...il.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        kvm <kvm@...r.kernel.org>, Paolo Bonzini <pbonzini@...hat.com>,
        "Radim Kr??m????" <rkrcmar@...hat.com>,
        Wanpeng Li <wanpeng.li@...mail.com>
Subject: Re: [PATCH v4 2/4] KVM: X86: Add paravirt remote TLB flush

2017-11-13 21:02 GMT+08:00 Peter Zijlstra <peterz@...radead.org>:
> On Mon, Nov 13, 2017 at 11:46:34AM +0100, Peter Zijlstra wrote:
>> On Mon, Nov 13, 2017 at 04:26:57PM +0800, Wanpeng Li wrote:
>> > 2017-11-13 16:04 GMT+08:00 Peter Zijlstra <peterz@...radead.org>:
>>
>> > > So if at this point a vCPU gets preempted we'll still spin-wait for it,
>> > > which is sub-optimal.
>> > >
>> > > I think we can come up with something to get around that 'problem' if
>> > > indeed it is a problem. But we can easily do that as follow up patches.
>> > > Just let me know if you think its worth spending more time on.
>> >
>> > You can post your idea, it is always smart. :) Then we can evaluate
>> > the complexity and gains.
>>
>> I'm not sure I have a fully baked idea just yet, but the general idea
>> would be something like:
>>
>>  - switch (back) to a dedicated TLB invalidate IPI
>
> Just for PV that is; the !PV code can continue doing what it does today.
>
>>  - introduce KVM_VCPU_IPI_PENDING
>>
>>  - change flush_tlb_others() into something like:
>>
>>    for_each_cpu(cpu, flushmask) {
>>        src = &per_cpu(steal_time, cpu);
>>        state = READ_ONCE(src->preempted);
>>        do {
>>                if (state & KVM_VCPU_PREEMPTED) {
>>                        if (try_cmpxchg(&src->preempted, &state,
>>                                                state | KVM_VCPU_SHOULD_FLUSH)) {
>>                                __cpumask_clear_cpu(cpu, flushmask);
>>                                break;
>>                        }
>>                }
>>        } while (!try_cmpxchg(&src->preempted, &state,
>>                                state | KVM_VCPU_IPI_PENDING));
>
> That can be written like:
>
>         do {
>                 if (state & KVM_VCPU_PREEMPTED)
>                         new_state = state | KVM_VCPU_SHOULD_FLUSH;
>                 else
>                         new_state = state | KVM_VCPU_IPI_PENDING;
>         } while (!try_cmpxchg(&src->preempted, state, new_state);
>
>         if (new_state & KVM_VCPU_IPI_PENDING)

Should be new_state & KVM_VCPU_SHOULD_FLUSH I think.

Regards,
Wanpeng Li

>                 __cpumask_clear_cpu(cpu, flushmask);
>
>>    }
>>
>>    apic->send_IPI_mask(flushmask, CALL_TLB_VECTOR);
>>
>>    for_each_cpu(cpu, flushmask) {
>>        src = &per_cpu(steal_time, cpu);
>
>         /*
>          * The ACQUIRE pairs with the cmpxchg clearing IPI_PENDING,
>          * which is either the TLB IPI handler, or the VMEXIT path.
>          * It ensure that the invalidate happens-before.
>          */
>>        smp_cond_load_acquire(&src->preempted, !(VAL & KVM_VCPU_IPI_PENDING);
>>    }
>
> And here we wait for completion of the invalidate; but because of the
> VMEXIT change below, this will never stall on a !running vCPU.
>
> Note that PLE will not help (much) here, without this extra IPI_PENDING
> state and the VMEXIT transferring it to SHOULD_FLUSH this vCPU's progress
> will be held up until all vCPU's you've IPI'd will have ran the IPI
> handler, which in the worst case is still a very long time.
>
>>  - have the TLB invalidate handler do something like:
>>
>>    state = READ_ONCE(src->preempted);
>>    if (!(state & KVM_VCPU_IPI_PENDING))
>>          return;
>>
>>    local_flush_tlb();
>>
>>    do {
>>    } while (!try_cmpxchg(&src->preempted, &state,
>>                        state & ~KVM_VCPU_IPI_PENDING));
>
> That needs to be:
>
>         /*
>          * Clear KVM_VCPU_IPI_PENDING to 'complete' flush_tlb_others().
>          */
>         do {
>                 /*
>                  * VMEXIT could have cleared this for us, in which case
>                  * we're done.
>                  */
>                 if (!(state & KVM_VCPU_IPI_PENDING))
>                         return;
>
>         } while (!try_cmpxchg(&src->preempted, state,
>                                 state & ~KVM_VCPU_IPI_PENDING));
>
>>  - then at VMEXIT time do something like:
>>
>         /*
>          * If we have IPI_PENDING set at VMEXIT time, transfer it to
>          * SHOULD_FLUSH. Clearing IPI_PENDING here allows the
>          * flush_others() vCPU to continue while the SHOULD_FLUSH
>          * guarantees this vCPU will flush TLBs before it continues
>          * execution.
>          */
>
>>    state = READ_ONCE(src->preempted);
>>    do {
>>       if (!(state & KVM_VCPU_IPI_PENDING))
>>               break;
>>    } while (!try_cmpxchg(&src->preempted, state,
>>                        (state & ~KVM_VCPU_IPI_PENDING) |
>>                        KVM_VCPU_SHOULD_FLUSH));
>>
>>    and clear any possible pending TLB_VECTOR in the guest state to avoid
>>    raising that IPI spuriously on enter again.
>>
>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ