Message-ID: <506D69AB.7020400@linux.vnet.ibm.com>
Date: Thu, 04 Oct 2012 16:19:15 +0530
From: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
To: Avi Kivity <avi@...hat.com>
CC: Rik van Riel <riel@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
Srikar <srikar@...ux.vnet.ibm.com>,
"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
KVM <kvm@...r.kernel.org>, Jiannan Ouyang <ouyang@...pitt.edu>,
chegu vinod <chegu_vinod@...com>,
"Andrew M. Theurer" <habanero@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>,
Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>,
Gleb Natapov <gleb@...hat.com>
Subject: Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

On 10/03/2012 10:35 PM, Avi Kivity wrote:
> On 10/03/2012 02:22 PM, Raghavendra K T wrote:
>>> So I think it's worth trying again with ple_window of 20000-40000.
>>>
>>
>> Hi Avi,
>>
>> I ran different benchmarks while increasing ple_window, and the results do
>> not seem encouraging for increasing ple_window.
>
> Thanks for testing! Comments below.
>
>> Results:
>> 16 core PLE machine with 16 vcpu guest.
>>
>> base kernel = 3.6-rc5 + ple handler optimization patch
>> base_pleopt_8k = base kernel + ple window = 8k
>> base_pleopt_16k = base kernel + ple window = 16k
>> base_pleopt_32k = base kernel + ple window = 32k
>>
>>
>> Percentage improvements of benchmarks w.r.t base_pleopt with ple_window = 4096
>>
>>                 base_pleopt_8k   base_pleopt_16k   base_pleopt_32k
>> -----------------------------------------------------------------
>> kernbench_1x       -5.54915        -15.94529         -44.31562
>> kernbench_2x       -7.89399        -17.75039         -37.73498
>
> So, 44% degradation even with no overcommit? That's surprising.

Yes. Kernbench was run with #threads = #vcpu * 2 as usual. Is spending
8 times the original ple_window cycles, across 16 vcpus, significant?

>
>> I also got perf top output to analyse the difference. The difference comes
>> from flushtlb (and also spinlock).
>
> That's in the guest, yes?

Yes. Perf was run in the guest.

>
>>
>> Ebizzy run for 4k ple_window
>>
>> -  87.20%  [kernel]  [k] arch_local_irq_restore
>>    - arch_local_irq_restore
>>       - 100.00% _raw_spin_unlock_irqrestore
>>          + 52.89% release_pages
>>          + 47.10% pagevec_lru_move_fn
>> -   5.71%  [kernel]  [k] arch_local_irq_restore
>>    - arch_local_irq_restore
>>       + 86.03% default_send_IPI_mask_allbutself_phys
>>       + 13.96% default_send_IPI_mask_sequence_phys
>> -   3.10%  [kernel]  [k] smp_call_function_many
>>      smp_call_function_many
>>
>>
>> Ebizzy run for 32k ple_window
>>
>> -  91.40%  [kernel]  [k] arch_local_irq_restore
>>    - arch_local_irq_restore
>>       - 100.00% _raw_spin_unlock_irqrestore
>>          + 53.13% release_pages
>>          + 46.86% pagevec_lru_move_fn
>> -   4.38%  [kernel]  [k] smp_call_function_many
>>      smp_call_function_many
>> -   2.51%  [kernel]  [k] arch_local_irq_restore
>>    - arch_local_irq_restore
>>       + 90.76% default_send_IPI_mask_allbutself_phys
>>       + 9.24% default_send_IPI_mask_sequence_phys
>>
>
> Both the 4k and the 32k results are crazy. Why is
> arch_local_irq_restore() so prominent? Do you have a very high
> interrupt rate in the guest?

How do I measure whether I have a high interrupt rate in the guest?
From the /proc/interrupts numbers I am not able to judge :(
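
One way to turn the raw /proc/interrupts counters into a rate is to take
two snapshots a known interval apart and divide the difference by the
interval. The following is only a minimal sketch of that idea; the helper
names (parse_interrupts, interrupt_rates, sample) are made up for
illustration, and it assumes the usual x86 layout of one header row of
CPU columns followed by one row per interrupt source.

```python
import time


def parse_interrupts(text):
    """Return {label: count summed over all CPUs} for one snapshot."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the description columns
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def interrupt_rates(before, after, seconds):
    """Per-second rate for every source present in both snapshots."""
    return {k: (after[k] - before[k]) / seconds for k in after if k in before}


def sample(interval=1.0, path="/proc/interrupts"):
    """Read the file twice, `interval` seconds apart, and return rates."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return interrupt_rates(before, after, interval)
```

Calling sample() inside the guest while ebizzy is running and sorting the
result by rate should show which sources dominate. On x86 the CAL
(function call), TLB (TLB shootdown) and RES (rescheduling) rows are
probably the ones to look at here, given the IPI and flush_tlb entries in
the perf output.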

I went back and got the results on a 32 core machine with a 32 vcpu guest.
Strangely, I got results supporting the claim that increasing ple_window
helps in the non-overcommitted scenario.

32 core, 32 vcpu guest, 1x scenarios:

ple_gap = 0
  kernbench: Elapsed Time 38.61
  ebizzy: 7463 records/s

ple_window = 4k
  kernbench: Elapsed Time 43.5067
  ebizzy: 2528 records/s

ple_window = 32k
  kernbench: Elapsed Time 39.4133
  ebizzy: 7196 records/s
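
To put the 32 core numbers on the same footing as the percentage table
quoted earlier, here is a rough sketch that computes the improvement of
each configuration relative to ple_window = 4k. The sign convention
(positive means better, lower elapsed time and higher records/s both count
as better) is an assumption, as is the helper name pct_improvement; only
the six measurements are taken from the runs above.

```python
def pct_improvement(baseline, value, higher_is_better):
    """Percentage improvement of `value` over `baseline` (positive = better)."""
    if higher_is_better:
        return (value - baseline) / baseline * 100.0
    return (baseline - value) / baseline * 100.0


# Numbers copied from the 32 core, 32 vcpu guest, 1x runs.
KERNBENCH_SECS = {"ple_gap=0": 38.61, "ple_window=4k": 43.5067,
                  "ple_window=32k": 39.4133}
EBIZZY_RECS = {"ple_gap=0": 7463, "ple_window=4k": 2528,
               "ple_window=32k": 7196}


def summarize(baseline="ple_window=4k"):
    """Return {config: (kernbench %, ebizzy %)} relative to `baseline`."""
    rows = {}
    for cfg in KERNBENCH_SECS:
        if cfg == baseline:
            continue
        rows[cfg] = (
            # kernbench reports elapsed time, so lower is better
            pct_improvement(KERNBENCH_SECS[baseline], KERNBENCH_SECS[cfg], False),
            # ebizzy reports records/s, so higher is better
            pct_improvement(EBIZZY_RECS[baseline], EBIZZY_RECS[cfg], True),
        )
    return rows


for cfg, (kb, eb) in summarize().items():
    print(f"{cfg:>15}  kernbench {kb:+8.2f}%  ebizzy {eb:+8.2f}%")
```

With this convention both ple_gap = 0 and ple_window = 32k come out ahead
of 4k on both benchmarks, with the ebizzy gap far larger than the
kernbench one.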

perf top output for ebizzy for the above runs:

ple_gap = 0

-  84.74%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
      - 100.00% _raw_spin_unlock_irqrestore
         + 50.96% release_pages
         + 49.02% pagevec_lru_move_fn
-   6.57%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
      + 92.54% default_send_IPI_mask_allbutself_phys
      + 7.46% default_send_IPI_mask_sequence_phys
-   1.54%  [kernel]  [k] smp_call_function_many
     smp_call_function_many

ple_window = 32k

-  84.47%  [kernel]  [k] arch_local_irq_restore
   + arch_local_irq_restore
-   6.46%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
      + 93.51% default_send_IPI_mask_allbutself_phys
      + 6.49% default_send_IPI_mask_sequence_phys
-   1.80%  [kernel]  [k] smp_call_function_many
   - smp_call_function_many
      + 99.98% native_flush_tlb_others

ple_window = 4k

-  91.35%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
      - 100.00% _raw_spin_unlock_irqrestore
         + 53.19% release_pages
         + 46.81% pagevec_lru_move_fn
-   3.90%  [kernel]  [k] smp_call_function_many
     smp_call_function_many
-   2.94%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
      + 93.12% default_send_IPI_mask_allbutself_phys
      + 6.88% default_send_IPI_mask_sequence_phys

Let me know if I can try something here..
/me confused :(