lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54D3BD79.6090602@redhat.com>
Date:	Thu, 05 Feb 2015 13:59:05 -0500
From:	Rik van Riel <riel@...hat.com>
To:	paulmck@...ux.vnet.ibm.com
CC:	Christian Borntraeger <borntraeger@...ibm.com>,
	kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
	mtosatti@...hat.com, mingo@...nel.org, ak@...ux.intel.com,
	oleg@...hat.com, masami.hiramatsu.pt@...achi.com,
	fweisbec@...il.com, lcapitulino@...hat.com, pbonzini@...hat.com
Subject: Re: [PATCH 4/4] kvm,rcu: use RCU extended quiescent state when running
 KVM guest

On 02/05/2015 01:56 PM, Paul E. McKenney wrote:
> On Thu, Feb 05, 2015 at 01:09:19PM -0500, Rik van Riel wrote:
>> On 02/05/2015 12:50 PM, Paul E. McKenney wrote:
>>> On Thu, Feb 05, 2015 at 11:52:37AM -0500, Rik van Riel wrote:
>>>> On 02/05/2015 11:44 AM, Christian Borntraeger wrote:
>>>>> Am 05.02.2015 um 17:35 schrieb riel@...hat.com:
>>>>>> From: Rik van Riel <riel@...hat.com>
>>>>>>
>>>>>> The host kernel is not doing anything while the CPU is executing
>>>>>> a KVM guest VCPU, so it can be marked as being in an extended
>>>>>> quiescent state, identical to that used when running user space
>>>>>> code.
>>>>>>
>>>>>> The only exception to that rule is when the host handles an
>>>>>> interrupt, which is already handled by the irq code, which
>>>>>> calls rcu_irq_enter and rcu_irq_exit.
>>>>>>
>>>>>> The guest_enter and guest_exit functions already switch vtime
>>>>>> accounting independent of context tracking, so leave those calls
>>>>>> where they are, instead of moving them into the context tracking
>>>>>> code.
>>>>>>
>>>>>> Signed-off-by: Rik van Riel <riel@...hat.com>
>>>>>> ---
>>>>>>  include/linux/context_tracking.h       | 8 +++++++-
>>>>>>  include/linux/context_tracking_state.h | 1 +
>>>>>>  2 files changed, 8 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
>>>>>> index bd9f000fc98d..a5d3bb44b897 100644
>>>>>> --- a/include/linux/context_tracking.h
>>>>>> +++ b/include/linux/context_tracking.h
>>>>>> @@ -43,7 +43,7 @@ static inline enum ctx_state exception_enter(void)
>>>>>>  static inline void exception_exit(enum ctx_state prev_ctx)
>>>>>>  {
>>>>>>  	if (context_tracking_is_enabled()) {
>>>>>> -		if (prev_ctx == IN_USER)
>>>>>> +		if (prev_ctx == IN_USER || prev_ctx == IN_GUEST)
>>>>>>  			context_tracking_user_enter(prev_ctx);
>>>>>>  	}
>>>>>>  }
>>>>>> @@ -78,6 +78,9 @@ static inline void guest_enter(void)
>>>>>>  		vtime_guest_enter(current);
>>>>>>  	else
>>>>>>  		current->flags |= PF_VCPU;
>>>>>> +
>>>>>> +	if (context_tracking_is_enabled())
>>>>>> +		context_tracking_user_enter(IN_GUEST);
>>>>>>  }
>>>>>
>>>>>
>>>>> Couldnt we make
>>>>>     rcu_virt_note_context_switch(smp_processor_id());
>>>>> conditional in include/linux/kvm_host.h (kvm_guest_enter)
>>>>>
>>>>> e.g. something like
>>>>>     if (!context_tracking_is_enabled())
>>>>> 	    rcu_virt_note_context_switch(smp_processor_id());
>>>>
>>>> Possibly. I considered the same, but I do not know whether
>>>> or not just rcu_user_enter / rcu_user_exit is enough.
>>>>
>>>> I could certainly try it out and see whether anything
>>>> explodes, but I am not convinced that is careful enough
>>>> when it comes to handling RCU code...
>>>>
>>>> Paul? :)
>>>
>>> That can fail for some odd combinations of Kconfig and boot parameters.
>>> As I understand it, you instead want the following:
>>>
>>> 	if (!tick_nohz_full_cpu(smp_processor_id()))
>>> 		rcu_virt_note_context_switch(smp_processor_id());
>>>
>>> Frederic might know of a better approach.
>>
>> Interesting, in what cases would that happen?
> 
> My concern, perhaps misplaced, is that context_tracking_is_enabled(),
> but that the current CPU is not a nohz_full= CPU.  In that case, the
> context-tracking code would be within its rights to not tell RCU about
> the transition to the guest, and thus RCU would be unaware that the
> CPU was in a quiescent state.
> 
> In most workloads, you would expect the CPU to get interrupted or
> preempted or something at some point, which would eventually inform
> RCU, but too much delay would result in the infamous RCU CPU stall
> warning message.  So let's code a bit more defensively.
> 
>> If context_tracking_is_enabled() we end up eventually
>> calling into rcu_user_enter & rcu_user_exit.
>>
>> If it is not enabled, we call rcu_virt_note_context_switch.
>>
>> In what cases would we need to call both?
>>
>> I don't see exit-to-userspace do the things that
>> rcu_virt_note_context_switch does, and do not understand
>> why userspace is fine with that...
> 
> The real danger is doing neither.
> 
> On tick_nohz_full_cpu() CPUs, the exit-to-userspace code should invoke
> rcu_user_enter(), which sets some per-CPU state telling RCU to ignore
> that CPU, since it cannot possibly do host RCU read-side critical sections
> while running a guest.

Ahhh, I understand now.  Thank you for your
explanation, Paul.

I will make sure your suggestion is in version 2 of
this series.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ