Message-ID: <506C7057.6000102@redhat.com>
Date: Wed, 03 Oct 2012 19:05:27 +0200
From: Avi Kivity <avi@...hat.com>
To: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
CC: Rik van Riel <riel@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
Srikar <srikar@...ux.vnet.ibm.com>,
"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
KVM <kvm@...r.kernel.org>, Jiannan Ouyang <ouyang@...pitt.edu>,
chegu vinod <chegu_vinod@...com>,
"Andrew M. Theurer" <habanero@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>,
Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>,
Gleb Natapov <gleb@...hat.com>
Subject: Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

On 10/03/2012 02:22 PM, Raghavendra K T wrote:
>> So I think it's worth trying again with ple_window of 20000-40000.
>>
>
> Hi Avi,
>
> I ran different benchmarks with increased ple_window, and the results do
> not seem encouraging for increasing ple_window.
Thanks for testing! Comments below.
> Results:
> 16 core PLE machine with 16 vcpu guest.
>
> base kernel = 3.6-rc5 + ple handler optimization patch
> base_pleopt_8k = base kernel + ple window = 8k
> base_pleopt_16k = base kernel + ple window = 16k
> base_pleopt_32k = base kernel + ple window = 32k
>
>
> Percentage improvements of benchmarks w.r.t base_pleopt with ple_window = 4096
>
>                 base_pleopt_8k   base_pleopt_16k   base_pleopt_32k
> -----------------------------------------------------------------
> kernbench_1x       -5.54915        -15.94529         -44.31562
> kernbench_2x       -7.89399        -17.75039         -37.73498
So, 44% degradation even with no overcommit? That's surprising.
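
For reference, here is a minimal sketch of how a figure like -44% could be derived. It assumes the benchmark reports elapsed time (lower is better) and that the baseline is the ple_window = 4096 run. The function name and the sample timings are hypothetical and are not taken from the runs above.

```python
def percent_improvement(baseline_seconds, candidate_seconds):
    """Percentage improvement of a candidate run over the baseline.

    Assumes a lower-is-better metric such as elapsed time, so a
    negative result means the candidate run was slower (a degradation).
    """
    return (baseline_seconds - candidate_seconds) / baseline_seconds * 100.0


# Hypothetical timings: a baseline of 100 s against a 144.3 s candidate
# comes out at roughly -44%, i.e. the candidate took about 1.44x as long.
slowdown = percent_improvement(100.0, 144.31562)
```

Under that assumption, the 32k kernbench_1x number would mean the run took about 1.44 times as long as the 4k run.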
> I also got perf top output to analyse the difference. The difference
> comes from flushtlb (and also spinlock).
That's in the guest, yes?
>
> Ebizzy run for 4k ple_window
> -  87.20%  [kernel]  [k] arch_local_irq_restore
>    - arch_local_irq_restore
>       - 100.00% _raw_spin_unlock_irqrestore
>          + 52.89% release_pages
>          + 47.10% pagevec_lru_move_fn
> -   5.71%  [kernel]  [k] arch_local_irq_restore
>    - arch_local_irq_restore
>       + 86.03% default_send_IPI_mask_allbutself_phys
>       + 13.96% default_send_IPI_mask_sequence_phys
> -   3.10%  [kernel]  [k] smp_call_function_many
>      smp_call_function_many
>
>
> Ebizzy run for 32k ple_window
>
> -  91.40%  [kernel]  [k] arch_local_irq_restore
>    - arch_local_irq_restore
>       - 100.00% _raw_spin_unlock_irqrestore
>          + 53.13% release_pages
>          + 46.86% pagevec_lru_move_fn
> -   4.38%  [kernel]  [k] smp_call_function_many
>      smp_call_function_many
> -   2.51%  [kernel]  [k] arch_local_irq_restore
>    - arch_local_irq_restore
>       + 90.76% default_send_IPI_mask_allbutself_phys
>       + 9.24% default_send_IPI_mask_sequence_phys
>
Both the 4k and the 32k results are crazy. Why is
arch_local_irq_restore() so prominent? Do you have a very high
interrupt rate in the guest?
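
One rough way to check the interrupt rate is to take two snapshots of /proc/interrupts inside the guest and divide the difference by the interval. The sketch below assumes the usual layout (a header row of CPU names, then one row per interrupt source with one counter per CPU). The function names are made up for illustration.

```python
def total_interrupts(proc_interrupts_text):
    """Sum every per-CPU counter in one /proc/interrupts snapshot."""
    lines = proc_interrupts_text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    total = 0
    for line in lines[1:]:
        # Everything after the first ':' holds the counters, then a description.
        _, _, rest = line.partition(":")
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the description text
            total += int(field)
    return total


def interrupt_rate(before_text, after_text, seconds):
    """Interrupts per second across all CPUs between two snapshots."""
    delta = total_interrupts(after_text) - total_interrupts(before_text)
    return delta / seconds
```

To use it, read /proc/interrupts, sleep for a few seconds, read it again, and pass both strings plus the interval to interrupt_rate().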
--
error compiling committee.c: too many arguments to function