lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKhg4tJjp3yymCTDFpCQJiekos3265AcuBMuCw5TkZUvjCvg1g@mail.gmail.com>
Date:   Tue, 24 Jul 2018 15:53:17 +0800
From:   Liang C <liangchen.linux@...il.com>
To:     pbonzini@...hat.com, rkrcmar@...hat.com
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: VM boot failure on nodes not having DMA32 zone

Hi,

We have a situation where our qemu processes need to be launched under
cgroup cpuset.mems control. This introduces an similar issue that was
discussed a few years ago. The difference here is that for our case,
not being able to allocate from DMA32 zone is a result a cgroup
restriction not mempolicy enforcement. Here is the steps to reproduce
the failure,

mkdir /sys/fs/cgroup/cpuset/nodeX (where X is a node not having DMA32 zone)
echo X > /sys/fs/cgroup/cpuset/nodeX/cpuset.mems
echo X > /sys/fs/cgroup/cpuset/nodeX/cpuset.cpus
echo 1 > /sys/fs/cgroup/cpuset/node0/cpuset.mem_hardwall
echo $$ > /sys/fs/cgroup/cpuset/nodeX/tasks

#launch a virtual machine
kvm_init_vcpu failed: Cannot allocate memory

There are workarounds, like always putting qemu processes onto the
node with DMA32 zone or not restricting qemu processes memory
allocation until that DMA32 alloc finishes (difficult to be precise).
But we would like to find a way to address the root cause.

Considering the fact that the pae_root shadow should not be needed
when ept is in use, which is indeed our case - ept is always available
for us (guessing this is the same case for most of other users), we
made a patch roughly like this,

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index d594690..1d1b61e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5052,7 +5052,7 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
        vcpu->arch.mmu.translate_gpa = translate_gpa;
        vcpu->arch.nested_mmu.translate_gpa = translate_nested_gpa;

-       return alloc_mmu_pages(vcpu);
+       return tdp_enabled ? 0 : alloc_mmu_pages(vcpu);
 }

 void kvm_mmu_setup(struct kvm_vcpu *vcpu)


It works through our test cases. But we would really like to have your
insight on this patch before applying it in production environment and
contributing it back to the community. Thanks in advance for any help
you may provide!

Thanks,
Liang Chen

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ