lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMj-D2D_fsOS1La3HGV2=bys5KjeQ0=fskrAL1V9FEkUTHFo1w@mail.gmail.com>
Date:   Sat, 15 Dec 2018 06:31:11 +0800
From:   gengdongjiu <gengdj.1984@...il.com>
To:     James Morse <james.morse@....com>
Cc:     Dongjiu Geng <gengdongjiu@...wei.com>, peter.maydell@...aro.org,
        rkrcmar@...hat.com, corbet@....net, christoffer.dall@....com,
        marc.zyngier@....com, catalin.marinas@....com, will.deacon@....com,
        kvm@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: [RFC RESEND PATCH] kvm: arm64: export memory error recovery
 capability to user space

HI James,

      Thanks for the mail and comments, I will reply to you in the next mail.

2018-12-14 21:55 GMT+08:00, James Morse <james.morse@....com>:
> Hi Dongjiu Geng,
>
> On 14/12/2018 10:15, Dongjiu Geng wrote:
>> When user space do memory recovery, it will check whether KVM and
>> guest support the error recovery, only when both of them support,
>> user space will do the error recovery. This patch exports this
>> capability of KVM to user space.
>
> I can understand user-space only wanting to do the work if host and guest
> support the feature. But 'error recovery' isn't a KVM feature, its a Linux
> kernel feature.
>
> KVM will send it's user-space a SIGBUS with MCEERR code whenever its trying
> to
> map a page at stage2 that the kernel-mm code refuses this because its
> poisoned.
> (e.g. check_user_page_hwpoison(), get_user_pages() returns -EHWPOISON)
>
> This is exactly the same as happens to a normal user-space process.
>
> I think you really want to know if the host kernel was built with
> CONFIG_MEMORY_FAILURE. The not-at-all-portable way to tell this from
> user-space
> is the presence of /proc/sys/vm/memory_failure_* files.
> (It looks like the prctl():PR_MCE_KILL/PR_MCE_KILL_GET options silently
> update
> an ignored policy if the kernel isn't built with CONFIG_MEMORY_FAILURE, so
> they
> aren't helpful)
>
>
>> diff --git a/Documentation/virtual/kvm/api.txt
>> b/Documentation/virtual/kvm/api.txt
>> index cd209f7..241e2e2 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -4895,3 +4895,12 @@ Architectures: x86
>>  This capability indicates that KVM supports paravirtualized Hyper-V IPI
>> send
>>  hypercalls:
>>  HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
>> +
>> +8.21 KVM_CAP_ARM_MEMORY_ERROR_RECOVERY
>> +
>> +Architectures: arm, arm64
>> +
>> +This capability indicates that guest memory error can be detected by the
>> KVM which
>> +supports the error recovery.
>
> KVM doesn't detect these errors.
> The hardware detects them and notifies the OS via one of a number of
> mechanisms.
> This gets plumbed into memory_failure(), which sets a flag that the mm code
> uses
> to prevent the page being used again.
>
> KVM is only involved when it tries to map a page at stage2 and the mm code
> rejects it with -EHWPOISON. This is the same as the architectures
> do_page_fault() checking for (fault & VM_FAULT_HWPOISON) out of
> handle_mm_fault(). We don't have a KVM cap for this, nor do we need one.
>
>
>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>> index b72a3dd..90d1d9a 100644
>> --- a/arch/arm64/kvm/reset.c
>> +++ b/arch/arm64/kvm/reset.c
>> @@ -82,6 +82,7 @@ int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm,
>> long ext)
>>  		r = kvm_arm_support_pmu_v3();
>>  		break;
>>  	case KVM_CAP_ARM_INJECT_SERROR_ESR:
>> +	case KVM_CAP_ARM_MEMORY_ERROR_RECOVERY:
>>  		r = cpus_have_const_cap(ARM64_HAS_RAS_EXTN);
>>  		break;
>
> The CPU RAS Extensions are not at all relevant here. It is perfectly
> possible to
> support memory-failure without them, AMD-Seattle and APM-X-Gene do this.
> These
> systems would report not-supported here, but the kernel does support this
> stuff.
> Just because the CPU supports this, doesn't mean the kernel was built with
> CONFIG_MEMORY_FAILURE. The CPU reports may be ignored, or upgraded to
> SIGKILL.
>
>
>
> Thanks,
>
> James
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ