Message-ID: <20120621122633.GA3842@amt.cnet>
Date: Thu, 21 Jun 2012 09:26:33 -0300
From: Marcelo Tosatti <mtosatti@...hat.com>
To: Nikunj A Dadhania <nikunj@...ux.vnet.ibm.com>
Cc: peterz@...radead.org, mingo@...e.hu, avi@...hat.com,
raghukt@...ux.vnet.ibm.com, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, x86@...nel.org, jeremy@...p.org,
vatsa@...ux.vnet.ibm.com, hpa@...or.com
Subject: Re: [PATCH v2 3/7] KVM: Add paravirt kvm_flush_tlb_others
On Tue, Jun 19, 2012 at 11:41:54AM +0530, Nikunj A Dadhania wrote:
> Hi Marcelo,
>
> Thanks for the review.
>
> On Tue, 12 Jun 2012 20:02:18 -0300, Marcelo Tosatti <mtosatti@...hat.com> wrote:
> > On Mon, Jun 04, 2012 at 10:37:24AM +0530, Nikunj A. Dadhania wrote:
> > > flush_tlb_others_ipi depends on a lot of statics in tlb.c. Replicate
> > > flush_tlb_others_ipi as kvm_flush_tlb_others to further adapt it to
> > > paravirtualization.
> > >
> > > Use the vcpu state information inside kvm_flush_tlb_others to
> > > avoid sending IPIs to preempted vcpus.
> > >
> > > * Do not send IPIs to offline vcpus; instead, set their flush_on_enter flag
> > > * For online vcpus: wait for them to clear the flag
> > >
> > > The approach was discussed here: https://lkml.org/lkml/2012/2/20/157
> > >
> > > Suggested-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> > > Signed-off-by: Nikunj A. Dadhania <nikunj@...ux.vnet.ibm.com>
> >
> > Why not reintroduce the hypercall to flush TLBs?
> Sure, I will prepare a version with this.
>
> > No waiting, no entry/exit trickery.
> >
> Are you also suggesting we get rid of the vcpu-running information?
Yes.
> We would at least need this to raise a flush-TLB hypercall on the sleeping
> vcpu.
No, you always raise a flush-TLB hypercall in flush_tlb_others, like Xen does.
Instead of an MMIO exit to the APIC to send an IPI, it's a hypercall exit.
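
To make this concrete, here is a minimal sketch of the guest side under
that scheme. KVM_HC_FLUSH_TLB and the __pa(cpumask) calling convention
are hypothetical, invented for illustration; kvm_hypercall1() is the
existing guest-side helper from kvm_para.h:

	/* Sketch only: KVM_HC_FLUSH_TLB is a hypothetical hypercall
	 * number, not part of the current kvm_para.h ABI. */
	#include <linux/mm.h>
	#include <linux/cpumask.h>
	#include <asm/kvm_para.h>

	#define KVM_HC_FLUSH_TLB	9	/* assumed-unused hypercall nr */

	static void kvm_flush_tlb_others_hcall(const struct cpumask *cpumask,
					       struct mm_struct *mm,
					       unsigned long va)
	{
		/* One hypercall exit instead of an APIC MMIO exit per
		 * IPI; the host flushes the targets via vcpu->requests. */
		kvm_hypercall1(KVM_HC_FLUSH_TLB, __pa(cpumask));
	}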
> > This is half-way-there paravirt with all the downsides.
> Some more details on what the downsides are would help us reach a
> better solution.
The downside I refer to is the overhead of writing, testing and maintaining
separate code for running as a guest. If you are adding hypervisor
awareness anyway, a hypercall is the simplest way:
- It is simple to request a remote TLB flush via vcpu->requests in the
  hypervisor. Information about which vcpus are in/out of guest mode is
  already available.
- Maintaining in/out-of-guest-mode information in guest memory adds more
  overhead to the entry/exit paths.
- There is no need to handle the interprocessor synchronization in the
  pseudo algo below (it is already handled by the vcpu->requests
  mechanism).
- The guest TLB IPI flow is (target vcpu):
- exit-due-to-external-interrupt.
- inject-ipi.
- enter guest mode.
- execute IPI handler, flush TLB.
- ACK IPI.
- source vcpu continues.
- The hypervisor TLB flush flow is:
- exit-due-to-external-interrupt.
- execute IPI handler (in host, from make_all_cpus_request)
- ACK IPI.
- source vcpu continues.
Unless there is an advantage in using an APIC IPI exit vs. a hypercall
exit?
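
For comparison, the host side could be serviced with the existing
vcpu->requests machinery along these lines (a sketch; fetching the
guest's cpumask from the hypercall argument is elided, and
kvm_pv_flush_tlb is an invented name, while kvm_for_each_vcpu(),
kvm_make_request(), kvm_vcpu_kick() and KVM_REQ_TLB_FLUSH exist today):

	#include <linux/kvm_host.h>

	/* Sketch: mirrors what make_all_cpus_request() does for
	 * kvm_flush_remote_tlbs(), restricted to a guest-supplied mask. */
	static void kvm_pv_flush_tlb(struct kvm *kvm, const struct cpumask *dest)
	{
		struct kvm_vcpu *vcpu;
		int i;

		kvm_for_each_vcpu(i, vcpu, kvm) {
			if (!cpumask_test_cpu(i, dest))
				continue;
			/* Mark the flush pending; a vcpu outside guest
			 * mode consumes it on the next entry... */
			kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
			/* ...and a vcpu in guest mode is kicked out so
			 * the request is handled promptly. */
			kvm_vcpu_kick(vcpu);
		}
	}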
> > Even though the guest running information might be useful in other
> > cases.
> >
> Yes, that was one of the things on my mind.
>
> > > Pseudo Algo:
> > >
> > > Write()
> > > ======
> > >
> > >     guest_exit()
> > >         flush_on_enter[i] = 0;
> > >         running[i] = 0;
> > >
> > >     guest_enter()
> > >         running[i] = 1;
> > >         smp_mb();
> > >         if (flush_on_enter[i]) {
> > >             tlb_flush();
> > >             flush_on_enter[i] = 0;
> > >         }
> > >
> > >
> > > Read()
> > > ======
> > >
> > >     GUEST                              KVM-HV
> > >
> > >     f->flushcpumask = cpumask - me;
> > >
> > > again:
> > >     for_each_cpu(i, f->flushmask) {
> > >
> > >         if (!running[i]) {
> > >                                        case 1: running[n] = 1
> > >
> > >                                        (cpuN does not see
> > >                                        flush_on_enter set; the guest
> > >                                        later finds it running and
> > >                                        sends an IPI. We are fine
> > >                                        here; we only need to clear
> > >                                        the flag on guest_exit.)
> > >
> > >             flush_on_enter[i] = 1;
> > >                                        case 2: running[n] = 1
> > >
> > >                                        (cpuN will see flush_on_enter
> > >                                        and an IPI as well; addressed
> > >                                        in patch 4.)
> > >
> > >             if (!running[i])
> > >                 cpu_clear(i, f->flushmask);
> > >                                        (All is well; vm_enter will
> > >                                        do the fixup.)
> > >         }
> > >                                        case 3: running[n] = 0
> > >
> > >                                        (cpuN went to sleep after we
> > >                                        saw it awake and sent the
> > >                                        IPI; the wait below breaks
> > >                                        without zero_mask and the
> > >                                        goto again takes care of it.)
> > >     }
> > >
> > >     send_ipi(f->flushmask);
> > >
> > >     wait_a_while_for_zero_mask();
> > >
> > >     if (!zero_mask)
> > >         goto again;
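> > >
> > > For reference, the per-vcpu shared state the pseudo algo above
> > > manipulates could look like the sketch below. The field names
> > > follow the pseudo code; the exact layout and alignment are
> > > illustrative assumptions, not this patch's ABI:
> > >
> > >     struct kvm_vcpu_state {
> > >         __u32 running;        /* set on guest_enter, cleared on guest_exit */
> > >         __u32 flush_on_enter; /* flush the TLB before the next guest entry */
> > >     } __attribute__((aligned(64))); /* one cache line per vcpu, to
> > >                                        avoid false sharing */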
> > > ---
> > > arch/x86/include/asm/kvm_para.h | 3 +-
> > > arch/x86/include/asm/tlbflush.h | 9 ++++++
> > > arch/x86/kernel/kvm.c | 1 +
> > > arch/x86/kvm/x86.c | 14 ++++++++-
> > > arch/x86/mm/tlb.c | 61 +++++++++++++++++++++++++++++++++++++++
> > > 5 files changed, 86 insertions(+), 2 deletions(-)
> > >