Message-ID: <ad016e8f-acb5-6685-b7b6-de801c002dc6@suse.com>
Date:   Mon, 1 Apr 2019 08:38:14 +0200
From:   Juergen Gross <jgross@...e.com>
To:     Waiman Long <longman@...hat.com>,
        Alok Kataria <akataria@...are.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will.deacon@....com>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH] x86/paravirt: Guard against invalid cpu # in
 pv_vcpu_is_preempted()

On 25/03/2019 19:03, Waiman Long wrote:
> On 03/25/2019 12:40 PM, Juergen Gross wrote:
>> On 25/03/2019 16:57, Waiman Long wrote:
>>> It was found that passing an invalid cpu number to pv_vcpu_is_preempted()
>>> might panic the kernel in a VM guest. For example,
>>>
>>> [    2.531077] Oops: 0000 [#1] SMP PTI
>>>   :
>>> [    2.532545] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>> [    2.533321] RIP: 0010:__raw_callee_save___kvm_vcpu_is_preempted+0x0/0x20
>>>
>>> To guard against this kind of kernel panic, a check is added to
>>> pv_vcpu_is_preempted() to make sure that no invalid cpu number will
>>> be used.
>>>
>>> Signed-off-by: Waiman Long <longman@...hat.com>
>>> ---
>>>  arch/x86/include/asm/paravirt.h | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
>>> index c25c38a05c1c..4cfb465dcde4 100644
>>> --- a/arch/x86/include/asm/paravirt.h
>>> +++ b/arch/x86/include/asm/paravirt.h
>>> @@ -671,6 +671,12 @@ static __always_inline void pv_kick(int cpu)
>>>  
>>>  static __always_inline bool pv_vcpu_is_preempted(long cpu)
>>>  {
>>> +	/*
>>> +	 * Guard against invalid cpu number or the kernel might panic.
>>> +	 */
>>> +	if (WARN_ON_ONCE((unsigned long)cpu >= nr_cpu_ids))
>>> +		return false;
>>> +
>>>  	return PVOP_CALLEE1(bool, lock.vcpu_is_preempted, cpu);
>>>  }
>> Can this really happen without being a programming error?
> 
> This shouldn't happen without a programming error, I think. In my case,
> it was caused by a race condition leading to use-after-free of the cpu
> number. However, my point is that an error like that shouldn't cause the
> kernel to panic.
> 
>> Basically you'd need to guard all percpu area accesses to foreign cpus
>> this way. Why is this one special?
> 
> It depends. If out-of-bounds accesses can only happen with an obvious
> programming error, I don't think we need to guard against them. In this
> case, I am not totally sure whether the race condition that I found may
> happen with existing code or not. To be prudent, I decided to send this
> patch out.
> 
> The race condition that I am looking at is as follows:
> 
>   CPU 0                         CPU 1
>   -----                         -----
> up_write:
>   owner = NULL;
>   <release-barrier>
>   count = 0;
> 
> <rcu-free task structure>
>  
>                           rwsem_can_spin_on_owner:
>                             rcu_read_lock();
>                             read owner;
>                               :
>                             vcpu_is_preempted(owner->cpu);
>                               :
>                             rcu_read_unlock();
> 
> When I tried to merge the owner into the count (clearing the owner after
> the barrier), I could reproduce the crash 100% of the time when booting
> up the kernel in a VM guest. However, I am not sure whether the
> configuration above is safe, or whether the crash is just very hard to
> reproduce.
> 
> Alternatively, I can also do the cpu check before calling
> vcpu_is_preempted().

I think I'd prefer that.
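
For illustration only, a caller-side check along those lines might look roughly like the userspace sketch below. It is not the actual patch: owner_vcpu_is_preempted() and the preempted_flag[] array are hypothetical names, and nr_cpu_ids and vcpu_is_preempted() are simple stand-ins for the kernel symbols of the same name.

```c
#include <stdbool.h>

/* Stand-in for the kernel's nr_cpu_ids (assumed value for this sketch). */
static unsigned int nr_cpu_ids = 4;

/* Hypothetical per-CPU "preempted" flags, standing in for hypervisor data. */
static bool preempted_flag[4] = { false, true, false, false };

/*
 * Stand-in for vcpu_is_preempted(): indexes per-CPU data without any
 * bounds check, so an invalid cpu number would read out of bounds.
 */
static bool vcpu_is_preempted(long cpu)
{
	return preempted_flag[cpu];
}

/*
 * Hypothetical caller-side helper: validate the cpu number read from a
 * possibly stale owner before handing it to vcpu_is_preempted().
 * The cast to unsigned long also rejects negative values.
 */
static bool owner_vcpu_is_preempted(long cpu)
{
	if ((unsigned long)cpu >= nr_cpu_ids)
		return false;

	return vcpu_is_preempted(cpu);
}
```

Keeping the check in the one caller that may see a stale cpu number leaves pv_vcpu_is_preempted() itself unchanged, so other users of it carry no extra cost.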


Juergen