Message-ID: <5061713D.5060406@redhat.com>
Date: Tue, 25 Sep 2012 10:54:21 +0200
From: Avi Kivity <avi@...hat.com>
To: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>
CC: Peter Zijlstra <peterz@...radead.org>,
Rik van Riel <riel@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
Srikar <srikar@...ux.vnet.ibm.com>,
"Nikunj A. Dadhania" <nikunj@...ux.vnet.ibm.com>,
KVM <kvm@...r.kernel.org>, Jiannan Ouyang <ouyang@...pitt.edu>,
chegu vinod <chegu_vinod@...com>,
"Andrew M. Theurer" <habanero@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>,
Srivatsa Vaddagiri <srivatsa.vaddagiri@...il.com>,
Gleb Natapov <gleb@...hat.com>
Subject: Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

On 09/25/2012 10:09 AM, Raghavendra K T wrote:
> On 09/24/2012 09:36 PM, Avi Kivity wrote:
>> On 09/24/2012 05:41 PM, Avi Kivity wrote:
>>>
>>>>
>>>> case 2)
>>>> rq1 : vcpu1->wait(lockA) (spinning)
>>>> rq2 : vcpu3 (running) , vcpu2->holding(lockA) [scheduled out]
>>>>
>>>> I agree that checking rq1 length is not proper in this case, and as
>>>> you
>>>> rightly pointed out, we are in trouble here.
>>>> nr_running()/num_online_cpus() would give more accurate picture here,
>>>> but it seemed costly. May be load balancer save us a bit here in not
>>>> running to such sort of cases. ( I agree load balancer is far too
>>>> complex).
>>>
>>> In theory preempt notifier can tell us whether a vcpu is preempted or
>>> not (except for exits to userspace), so we can keep track of whether
>>> it's we're overcommitted in kvm itself. It also avoids false positives
>>> from other guests and/or processes being overcommitted while our vm
>>> is fine.
>>
>> It also allows us to cheaply skip running vcpus.
>
> Hi Avi,
>
> Could you please elaborate on how preempt notifiers can be used
> here to keep track of overcommit or skip running vcpus?
>
> Are we planning set some flag in sched_out() handler etc?
>

Keep a bitmap kvm->preempted_vcpus.

In sched_out, test whether we're TASK_RUNNING, and if so, set a vcpu
flag and our bit in kvm->preempted_vcpus. On sched_in, if the flag is
set, clear our bit in kvm->preempted_vcpus. We can also keep a counter
of preempted vcpus.

We can use the bitmap and the counter to quickly see if spinning is
worthwhile (if the counter is zero, better to spin). If the counter is
not zero, we can use the bitmap to select target vcpus quickly.

The only problem is that in order to keep this accurate we need to keep
the preempt notifiers active during exits to userspace. But we can
prototype this without this change, and add it later if it works.
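
To make the bookkeeping concrete, here is a rough userspace model of the
scheme, not kernel code. Everything in it is an assumption for
illustration: the struct and function names (vm_model, vcpu_model,
model_sched_out and so on) are made up, a single 64-bit word stands in
for the bitmap, and plain loads and stores stand in for the atomic
bitops a real implementation would need. It also clears the per-vcpu
flag on sched_in so the counter stays in sync with the bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VCPUS 64 /* assumption: one 64-bit word is enough for the sketch */

/* Hypothetical per-VM state; names follow the mail, not kernel source. */
struct vm_model {
	uint64_t preempted_vcpus; /* bit i set: vcpu i was preempted while runnable */
	int nr_preempted;         /* number of bits currently set */
};

/* Hypothetical per-vcpu state. */
struct vcpu_model {
	struct vm_model *vm;
	unsigned int id;
	bool preempted;           /* the per-vcpu flag */
};

/*
 * sched_out hook. task_running models "the task state is TASK_RUNNING",
 * i.e. the vcpu was involuntarily preempted rather than blocking.
 */
static void model_sched_out(struct vcpu_model *v, bool task_running)
{
	assert(v->id < MAX_VCPUS);
	if (!task_running || v->preempted)
		return;
	v->preempted = true;
	v->vm->preempted_vcpus |= UINT64_C(1) << v->id;
	v->vm->nr_preempted++;
}

/* sched_in hook: undo the bookkeeping only if sched_out recorded a preemption. */
static void model_sched_in(struct vcpu_model *v)
{
	if (!v->preempted)
		return;
	v->preempted = false;
	v->vm->preempted_vcpus &= ~(UINT64_C(1) << v->id);
	v->vm->nr_preempted--;
}

/* No vcpu of this VM is preempted, so the lock holder is running: keep spinning. */
static bool model_should_spin(const struct vm_model *vm)
{
	return vm->nr_preempted == 0;
}

/*
 * Pick a preempted vcpu other than 'self' as a yield target, scanning
 * round-robin starting at self + 1. Returns -1 if there is none.
 */
static int model_pick_target(const struct vm_model *vm, unsigned int self)
{
	for (unsigned int i = 1; i < MAX_VCPUS; i++) {
		unsigned int id = (self + i) % MAX_VCPUS;

		if (vm->preempted_vcpus & (UINT64_C(1) << id))
			return (int)id;
	}
	return -1;
}
```

In this model a PLE exit would first call model_should_spin(); only if
that returns false would it call model_pick_target() and yield to the
returned vcpu. Vcpus that are still running never have their bit set,
so they are skipped without looking at their runqueue.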
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.