linux-kernel - Re: [RFC RESEND PATCH] kvm: arm64: export memory error recovery capability to user space

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [day] [month] [year] [list]

Message-ID: <0184EA26B2509940AA629AE1405DD7F2020453D5@dggema523-mbx.china.huawei.com>
Date:   Thu, 10 Jan 2019 15:30:08 +0000
From:   gengdongjiu <gengdongjiu@...wei.com>
To:     Peter Maydell <peter.maydell@...aro.org>
CC:     James Morse <james.morse@....com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        Christoffer Dall <christoffer.dall@....com>,
        Marc Zyngier <marc.zyngier@....com>,
        "Catalin Marinas" <catalin.marinas@....com>,
        Will Deacon <will.deacon@....com>,
        kvm-devel <kvm@...r.kernel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        lkml - Kernel Mailing List <linux-kernel@...r.kernel.org>,
        arm-mail-list <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [RFC RESEND PATCH] kvm: arm64: export memory error recovery
 capability to user space

> 
> On Thu, 10 Jan 2019 at 12:09, gengdongjiu <gengdongjiu@...wei.com> wrote:
> > Peter, I summarize James's main idea, James think QEMU does not needs
> > to check *something* if Qemu support firmware-first.
> > What do we do for your comments?
> 
> Unless I'm missing something, the code in your most recent patchset attempts to update an ACPI table when it gets the SIGBUS from the
> host kernel without doing anything to check whether it has ever created the ACPI table (and set up the QEMU global variable that tells the
> code where it is in the guest memory) in the first place.

when QEMU version is greater than some version, it will default create the APEI table. But only when the guest is booted by UEFI, it will support to record the CPER to guest memory. 
In my test, I boot guest using UEFI, so it is no problem, I will check whether this booting uses UEFI before update the ACPI table.

> I don't see how that can work.	
> 
> > >> I think one question here which it would be good to answer is:
> > >> if we are modelling a guest and we haven't specifically provided it
> > >> an ACPI table to tell it about memory errors, what do we do when we
> > >> get a sigbus from the host? We have basically two choices:
> > >>  (1) send the guest an SError (aka asynchronous external abort)
> > >>      anyway (with no further info about what the memory error is)
> > >
> > > For an AR signal an external abort is valid. Its up to the
> > > implementation whether these are synchronous or asynchronous. Qemu
> > > can only take a signal for something that was synchronous, so you can choose between the two.
> > > Synchronous external abort is marginally better as an unaware OS
> > > knows its affects this thread, and may be able to kill it.
> > > SError with an imp-def ESR is indistinguishable from 'part of the
> > > soc fell out', and should always result in a panic().
> > >
> > >
> > >>  (2) just stop QEMU (as we would for a memory error in QEMU's
> > >>      own memory)
> > >
> > > This is also valid. A machine may take external-abort to EL3 and
> > > then reboot/crash/burn.
> 
> We should decide which of these we want to do, and have a comment explaining what we're doing. If I'm reading your current patchset
> correctly, it does neither -- if it can't record the fault in the ACPI table it just ignores it without either stopping QEMU or delivering an SError.

James may not know my detailed implementation in the QEMU. In my patch, I only handle the BUS_MCEERR_AR SIGBUS signal(synchronous signal). when the SIGBUS is BUS_MCEERR_AR, it will deliver a synchronous exception abort.
James said it needs to deliver an SError when the BUS_MCEERR_OR SIGBUS signal(synchronous signal), but I do not handle the this case because QEMU main thread will mask this asynchronous signal.

If the memory error is belong to QEMU itself, I just print an error log[2]. If you think, it should stop QEMU for this case, I will change it.

void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr) {
	..................
   	if (code == BUS_MCEERR_AR) {
    	kvm_cpu_synchronize_state(c);
       	if (ghes_record_errors(ACPI_HEST_NOTIFY_SEA, paddr)) {
     		kvm_inject_arm_sea(c);
     	} else {
       		fprintf(stderr, "failed to record the error\n");
       	}
	}
[2]	fprintf(stderr, "Hardware memory error for memory used by "
                "QEMU itself instead of guest system!\n");
}
> 
> I think I favour option (2) here.
> 
> thanks
> -- PMM