linux-kernel - Re: [Xen-devel] [PATCH v2 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

Message-ID: <6038042c-917f-d361-5d79-f0205152fe00@citrix.com>
Date:   Wed, 3 Jul 2019 18:43:52 +0100
From:   Andrew Cooper <andrew.cooper3@...rix.com>
To:     Nadav Amit <namit@...are.com>, Juergen Gross <jgross@...e.com>
CC:     Sasha Levin <sashal@...nel.org>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        kvm list <kvm@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "virtualization@...ts.linux-foundation.org" 
        <virtualization@...ts.linux-foundation.org>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Andy Lutomirski <luto@...nel.org>,
        "Paolo Bonzini" <pbonzini@...hat.com>,
        xen-devel <xen-devel@...ts.xenproject.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "K. Y. Srinivasan" <kys@...rosoft.com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>
Subject: Re: [Xen-devel] [PATCH v2 4/9] x86/mm/tlb: Flush remote and local
 TLBs concurrently

On 03/07/2019 18:02, Nadav Amit wrote:
>> On Jul 3, 2019, at 7:04 AM, Juergen Gross <jgross@...e.com> wrote:
>>
>> On 03.07.19 01:51, Nadav Amit wrote:
>>> To improve TLB shootdown performance, flush the remote and local TLBs
>>> concurrently. Introduce flush_tlb_multi() that does so. Introduce
>>> paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v (Xen
>>> and hyper-v are only compile-tested).
>>> While the updated smp infrastructure is capable of running a function on
>>> a single local core, it is not optimized for this case. The multiple
>>> function calls and the indirect branch introduce some overhead, and
>>> might make local TLB flushes slower than they were before the recent
>>> changes.
>>> Before calling the SMP infrastructure, check if only a local TLB flush
>>> is needed to restore the lost performance in this common case. This
>>> requires checking mm_cpumask() one more time, but unless this mask is
>>> updated very frequently, this should not impact performance negatively.
>>> Cc: "K. Y. Srinivasan" <kys@...rosoft.com>
>>> Cc: Haiyang Zhang <haiyangz@...rosoft.com>
>>> Cc: Stephen Hemminger <sthemmin@...rosoft.com>
>>> Cc: Sasha Levin <sashal@...nel.org>
>>> Cc: Thomas Gleixner <tglx@...utronix.de>
>>> Cc: Ingo Molnar <mingo@...hat.com>
>>> Cc: Borislav Petkov <bp@...en8.de>
>>> Cc: x86@...nel.org
>>> Cc: Juergen Gross <jgross@...e.com>
>>> Cc: Paolo Bonzini <pbonzini@...hat.com>
>>> Cc: Dave Hansen <dave.hansen@...ux.intel.com>
>>> Cc: Andy Lutomirski <luto@...nel.org>
>>> Cc: Peter Zijlstra <peterz@...radead.org>
>>> Cc: Boris Ostrovsky <boris.ostrovsky@...cle.com>
>>> Cc: linux-hyperv@...r.kernel.org
>>> Cc: linux-kernel@...r.kernel.org
>>> Cc: virtualization@...ts.linux-foundation.org
>>> Cc: kvm@...r.kernel.org
>>> Cc: xen-devel@...ts.xenproject.org
>>> Signed-off-by: Nadav Amit <namit@...are.com>
>>> ---
>>>  arch/x86/hyperv/mmu.c                 | 13 +++---
>>>  arch/x86/include/asm/paravirt.h       |  6 +--
>>>  arch/x86/include/asm/paravirt_types.h |  4 +-
>>>  arch/x86/include/asm/tlbflush.h       |  9 ++--
>>>  arch/x86/include/asm/trace/hyperv.h   |  2 +-
>>>  arch/x86/kernel/kvm.c                 | 11 +++--
>>>  arch/x86/kernel/paravirt.c            |  2 +-
>>>  arch/x86/mm/tlb.c                     | 65 ++++++++++++++++++++-------
>>>  arch/x86/xen/mmu_pv.c                 | 20 ++++++---
>>>  include/trace/events/xen.h            |  2 +-
>>>  10 files changed, 91 insertions(+), 43 deletions(-)
>> ...
>>
>>> diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
>>> index beb44e22afdf..19e481e6e904 100644
>>> --- a/arch/x86/xen/mmu_pv.c
>>> +++ b/arch/x86/xen/mmu_pv.c
>>> @@ -1355,8 +1355,8 @@ static void xen_flush_tlb_one_user(unsigned long addr)
>>>  	preempt_enable();
>>>  }
>>>  
>>> -static void xen_flush_tlb_others(const struct cpumask *cpus,
>>> -				 const struct flush_tlb_info *info)
>>> +static void xen_flush_tlb_multi(const struct cpumask *cpus,
>>> +				const struct flush_tlb_info *info)
>>>  {
>>>  	struct {
>>>  		struct mmuext_op op;
>>> @@ -1366,7 +1366,7 @@ static void xen_flush_tlb_others(const struct cpumask *cpus,
>>>  	const size_t mc_entry_size = sizeof(args->op) +
>>>  		sizeof(args->mask[0]) * BITS_TO_LONGS(num_possible_cpus());
>>>  
>>> -	trace_xen_mmu_flush_tlb_others(cpus, info->mm, info->start, info->end);
>>> +	trace_xen_mmu_flush_tlb_multi(cpus, info->mm, info->start, info->end);
>>>  
>>>  	if (cpumask_empty(cpus))
>>>  		return;		/* nothing to do */
>>> @@ -1375,9 +1375,17 @@ static void xen_flush_tlb_others(const struct cpumask *cpus,
>>>  	args = mcs.args;
>>>  	args->op.arg2.vcpumask = to_cpumask(args->mask);
>>>  
>>> -	/* Remove us, and any offline CPUS. */
>>> +	/* Flush locally if needed and remove us */
>>> +	if (cpumask_test_cpu(smp_processor_id(), to_cpumask(args->mask))) {
>>> +		local_irq_disable();
>>> +		flush_tlb_func_local(info);
>> I think this isn't the correct function for PV guests.
>>
>> In fact it should be much easier: just don't clear the current cpu from
>> the mask; that's all that's needed. The hypervisor is just fine having the
>> current cpu in the mask and it will do the right thing.
> Thanks. I will do so in v3. I don’t think Hyper-V people would want to do
> the same, unfortunately, since it would induce VM-exit on TLB flushes.

Why do you believe the vmexit matters?  You're taking one anyway for
the IPI.

Intel only have virtualised self-IPI, and while AMD do have working
non-self IPIs, you still take a vmexit anyway if any destination vcpu
isn't currently running in non-root mode (IIRC).

At that point, you might as well have the hypervisor do all the hard
work via a multi-cpu shootdown/flush hypercall, rather than trying to
arrange it locally.

~Andrew
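
For illustration only, here is a minimal userspace sketch of the two mask
handling strategies discussed in this thread. It is not the kernel or Xen
code: vcpu_mask_t, struct flush_stats, build_mask_split() and
build_mask_unified() are hypothetical names, a 64-bit integer stands in for
struct cpumask, and the counters merely record what a guest would have done.
It assumes at most 64 vCPUs.

```c
#include <stdint.h>

/* Hypothetical stand-in for a vCPU mask (real code uses struct cpumask). */
typedef uint64_t vcpu_mask_t;

/* Hypothetical counters recording the work the guest would perform. */
struct flush_stats {
	unsigned local_flushes;	/* flushes the guest does on its own CPU */
	unsigned hypercalls;	/* multi-CPU flush hypercalls issued */
};

/*
 * Split approach: flush the local CPU in the guest, clear it from the
 * mask, then ask the hypervisor to flush whatever remains.
 * Returns the mask that would be handed to the hypervisor.
 */
static vcpu_mask_t build_mask_split(vcpu_mask_t targets, unsigned self,
				    struct flush_stats *st)
{
	vcpu_mask_t self_bit = (vcpu_mask_t)1 << self;

	if (targets & self_bit) {
		st->local_flushes++;
		targets &= ~self_bit;
	}
	if (targets)
		st->hypercalls++;
	return targets;
}

/*
 * Unified approach suggested in the thread: leave the current CPU in the
 * mask and let a single hypercall cover local and remote CPUs alike.
 */
static vcpu_mask_t build_mask_unified(vcpu_mask_t targets, unsigned self,
				      struct flush_stats *st)
{
	(void)self;	/* the local CPU needs no special handling */

	if (targets)
		st->hypercalls++;
	return targets;
}
```

With targets covering vCPUs 0, 2 and 3 and the caller running on vCPU 2, the
split variant does one local flush and passes vCPUs 0 and 3 to the
hypervisor, while the unified variant passes all three in a single request
and does no guest-side flush.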