linux-kernel - Re: [PATCH] arm/arm64: KVM: send SIGBUS error to qemu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <58D3E469.8090408@arm.com>
Date:   Thu, 23 Mar 2017 15:06:17 +0000
From:   James Morse <james.morse@....com>
To:     Dongjiu Geng <gengdongjiu@...wei.com>
CC:     rkrcmar@...hat.com, christoffer.dall@...aro.org,
        marc.zyngier@....com, linux@...linux.org.uk, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        kvmarm@...ts.cs.columbia.edu, xiexiuqi@...wei.com,
        wangxiongfeng2@...wei.com, wuquanming@...wei.com,
        huangshaoyu@...wei.com
Subject: Re: [PATCH] arm/arm64: KVM: send SIGBUS error to qemu

Hi Dongjiu Geng,

On 23/03/17 13:01, Dongjiu Geng wrote:
> when the pfn is KVM_PFN_ERR_HWPOISON, it indicates to send
> SIGBUS signal from KVM's fault-handling code to qemu, qemu
> can handle this signal according to the fault address.

I'm afraid I beat you to it on this one:
https://www.spinics.net/lists/arm-kernel/msg568919.html

(Are you the same gengdj who ask me to post that patch?:
 https://lkml.org/lkml/2017/3/5/187 )

We don't need upstream KVM to do this until either arm or arm64 has
ARCH_SUPPORTS_MEMORY_FAILURE. Punit and Tyler have discovered problems with the
way arm64's hugepage and hwpoison interact:
https://www.spinics.net/lists/arm-kernel/msg568995.html


Some comments on the differences:

> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 962616fd4ddd..1307ec400de3 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -1237,6 +1237,20 @@ static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn,
>  	__coherent_cache_guest_page(vcpu, pfn, size);
>  }
>  
> +static void kvm_send_hwpoison_signal(unsigned long address,
> +					struct task_struct *tsk)
> +{
> +	siginfo_t info;
> +
> +	info.si_signo	= SIGBUS;
> +	info.si_errno	= 0;
> +	info.si_code	= BUS_MCEERR_AR;
> +	info.si_addr	= (void __user *)address;
> +	info.si_addr_lsb = PAGE_SHIFT;

Any version of this patch should handle hugepage for the sizes KVM uses in its
stage2 mappings. By just passing PAGE_SHIFT you let the guest fault for each
page that makes up the hugepage.


> +
> +	send_sig_info(SIGBUS, &info, tsk);
> +}
> +
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  			  struct kvm_memory_slot *memslot, unsigned long hva,
>  			  unsigned long fault_status)
> @@ -1309,6 +1323,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	if (is_error_noslot_pfn(pfn))
>  		return -EFAULT;
>  
> +	if (is_error_hwpoison_pfn(pfn)) {
> +		kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn),
> +								current);
> +		return -EFAULT;

This will return -EFAULT from the KVM_RUN ioctl(). Is Qemu expected to know it
should try again? This is indistinguishable from the is_error_noslot_pfn() error
above.

x86 returns 0 from this path, kvm_handle_bad_page() in arch/x86/kvm/mmu.c as the
SIGBUS should arrive first. If the SIGBUS is handled the error has been resolved
and Qemu can call KVM_RUN again. Returning an error and sending SIGBUS suggests
there are two problems.


> +	}
> +
>  	if (kvm_is_device_pfn(pfn)) {
>  		mem_type = PAGE_S2_DEVICE;
>  		flags |= KVM_S2PTE_FLAG_IS_IOMAP;



Thanks,

James