linux-kernel - Re: [PATCH v2 0/4] implement vcpu preempted check


Message-ID: <CANRm+CxKVqrFB-cRPEvmUamZ5Y8cOGe3pyc0LMJy0t9_jo+i0Q@mail.gmail.com>
Date:	Thu, 7 Jul 2016 18:27:26 +0800
From:	Wanpeng Li <kernellwp@...il.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Paolo Bonzini <pbonzini@...hat.com>,
	Pan Xinhui <xinhui.pan@...ux.vnet.ibm.com>,
	linux-s390 <linux-s390@...r.kernel.org>,
	Davidlohr Bueso <dave@...olabs.net>, mpe@...erman.id.au,
	boqun.feng@...il.com, will.deacon@....com,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Waiman Long <waiman.long@....com>,
	virtualization@...ts.linux-foundation.org,
	Ingo Molnar <mingo@...hat.com>,
	Paul Mackerras <paulus@...ba.org>, benh@...nel.crashing.org,
	schwidefsky@...ibm.com, Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	linuxppc-dev@...ts.ozlabs.org, kvm <kvm@...r.kernel.org>
Subject: Re: [PATCH v2 0/4] implement vcpu preempted check

2016-07-07 18:12 GMT+08:00 Wanpeng Li <kernellwp@...il.com>:
> 2016-07-07 17:42 GMT+08:00 Peter Zijlstra <peterz@...radead.org>:
>> On Thu, Jul 07, 2016 at 04:48:05PM +0800, Wanpeng Li wrote:
>>> 2016-07-06 20:28 GMT+08:00 Paolo Bonzini <pbonzini@...hat.com>:
>>> > Hmm, you're right.  We can use bit 0 of struct kvm_steal_time's flags to
>>> > indicate that pad[0] is a "VCPU preempted" field; if pad[0] is 1, the
>>> > VCPU has been scheduled out since the last time the guest reset the bit.
>>> >  The guest can use an xchg to test-and-clear it.  The bit can be
>>> > accessed at any time, independent of the version field.
>>>
>>> If one vCPU is preempted and the guest checks it several times before
>>> this vCPU is scheduled in, then the first check reports "vCPU is
>>> preempted"; however, since the field is then cleared, the second check
>>> reports "vCPU is running".
>>>
>>> Do you mean we should call record_steal_time() in both kvm_sched_in()
>>> and kvm_sched_out() to record this field? Btw, should we keep both
>>> vcpu->preempted and kvm_steal_time's "vCPU preempted" field present
>>> simultaneously?
>>
>> I suspect you want something like so; except this has holes in.
>>
>> We clear KVM_ST_PAD_PREEMPT before disabling preemption and we set it
>> after enabling it, this means that if we get preempted in between, the
>> vcpu is reported as running even though it very much is not.
>
> Paolo also pointed this out to me offline yesterday: "Please change
> pad[12] to "__u32 preempted; __u32 pad[11];" too, and remember to
> update Documentation/virtual/kvm/msr.txt!". Btw, doing this in the
> preemption notifier means that the vCPU really is preempted on the
> host; depending on vmexit, however, has different semantics, I think.

In addition, I see that Xen's vcpu_runstate_info::state is updated during
scheduling, so I think I can do this similarly through the KVM preemption
notifier. IIUC, the Xen hypervisor implements the VCPUOP_get_runstate_info
hypercall, so the desired interface can be implemented there if a
hypercall call site is added in domU. I can add a similar hypercall to KVM.
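
As a rough illustration of the runstate idea, the sketch below keeps a
per-vCPU state that is updated on every scheduling transition, loosely
following the shape of Xen's vcpu_runstate_info (state, state entry
time, time accumulated per state) but trimmed to two states. All
sketch_* names are hypothetical; this is neither the Xen code nor a
proposed KVM ABI, and the timestamps are plain integers supplied by the
caller.

```c
#include <assert.h>
#include <stdint.h>

enum {
	SKETCH_RUNNING = 0,	/* vCPU currently on a physical CPU */
	SKETCH_PREEMPTED = 1,	/* vCPU scheduled out by the host */
	SKETCH_NR_STATES = 2
};

struct sketch_runstate {
	int state;				/* current state */
	uint64_t state_entry_time;		/* when "state" was entered */
	uint64_t time[SKETCH_NR_STATES];	/* time spent in each state */
};

/* Assumed to be called from the sched-in/sched-out hooks. */
static void sketch_runstate_change(struct sketch_runstate *rs,
				   int new_state, uint64_t now)
{
	assert(new_state >= 0 && new_state < SKETCH_NR_STATES);
	/* Charge the elapsed interval to the state being left. */
	rs->time[rs->state] += now - rs->state_entry_time;
	rs->state = new_state;
	rs->state_entry_time = now;
}

/* What a guest-visible "is this vCPU preempted?" query would report. */
static int sketch_vcpu_is_preempted(const struct sketch_runstate *rs)
{
	return rs->state != SKETCH_RUNNING;
}
```

Because the state is only changed by the scheduler hooks, repeated
queries give the same answer until the next transition.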

Paolo, thoughts?

Regards,
Wanpeng Li