Message-Id: <20220311112927.8400-1-liam.merwick@oracle.com>
Date:   Fri, 11 Mar 2022 11:29:23 +0000
From:   Liam Merwick <liam.merwick@...cle.com>
To:     kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        stable@...r.kernel.org, x86@...nel.org
Cc:     pbonzini@...hat.com, bp@...en8.de, thomas.lendacky@....com,
        brijesh.singh@....com, krish.sadhukhan@...cle.com,
        liam.merwick@...cle.com
Subject: [PATCH 5.4 0/4] Backport fixes to avoid SEV guest with 380GB+ memory causing host cpu softhang

[ patch series targeting linux-5.4.y stable branch. ]

Creating a SEV-enabled guest with 380GB or more of memory causes a
CPU soft lockup on a host running 5.4, with the following stack trace:

kernel: watchdog: BUG: soft lockup - CPU#214 stuck for 22s! [qemu-kvm:6424]
...
kernel: CPU: 214 PID: 6424 Comm: qemu-kvm Not tainted 5.4.183.stable #1
kernel: Hardware name: Oracle Corporation ORACLE SERVER E4-2c/Asm,MB Tray,2U,E4-2c, BIOS 78014000 01/05/2022
kernel: RIP: 0010:clflush_cache_range+0x35/0x40
kernel: Code: f0 0f b7 15 63 53 99 01 89 f6 48 89 d0 48 f7 d8 48 21 f8 48 01 f7 48 39 f8 73 0c 66 0f ae 38 48 01 d0 48 39 c7 77 f4 0f ae f0 <5d> c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 0f ae
kernel: RSP: 0018:ffffacba5e98fc30 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
kernel: RAX: ffffa1193cfbc000 RBX: ffffacbba4701000 RCX: 0000000000000000
kernel: RDX: 0000000000000040 RSI: 0000000000001000 RDI: ffffa1193cfbc000
kernel: RBP: ffffacba5e98fc30 R08: ffffacba5f44aca0 R09: ffffacbac3701000
kernel: R10: 0000000000000080 R11: ffff9f8500000af0 R12: ffffa18074a22f80
kernel: R13: ffffacbaf6889dd8 R14: ffffacba5f41d960 R15: ffffacba5f44aca0
kernel: FS:  00007fbe04321f00(0000) GS:ffffa1814ed80000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007dfbebd7d000 CR3: 000801fb93d68002 CR4: 0000000000760ee0
kernel: PKRU: 55555554
kernel: Call Trace:
kernel: sev_clflush_pages.part.56+0x50/0x70 [kvm_amd]
kernel: svm_register_enc_region+0xe2/0x120 [kvm_amd]
kernel: kvm_arch_vm_ioctl+0x524/0xbd0 [kvm]
kernel: ? release_pages+0x212/0x430
kernel: ? __pagevec_lru_add_fn+0x192/0x2f0
kernel: kvm_vm_ioctl+0x9c/0x9d0 [kvm]
kernel: ? __lru_cache_add+0x59/0x70
kernel: ? lru_cache_add_active_or_unevictable+0x39/0xb0
kernel: ? __handle_mm_fault+0xa74/0xfd0
kernel: ? __switch_to_asm+0x34/0x70
kernel: do_vfs_ioctl+0xa9/0x640
kernel: ? __audit_syscall_entry+0xdd/0x130
kernel: ksys_ioctl+0x67/0x90
kernel: __x64_sys_ioctl+0x1a/0x20
kernel: do_syscall_64+0x60/0x1d0
kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
kernel: RIP: 0033:0x7fbe0086563b
kernel: Code: 0f 1e fa 48 8b 05 4d b8 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1d b8 2c 00 f7 d8 64 89 01 48
kernel: RSP: 002b:00007ffedf577418 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
kernel: RAX: ffffffffffffffda RBX: 00007dfbebe00000 RCX: 00007fbe0086563b
kernel: RDX: 00007ffedf577490 RSI: ffffffff8010aebb RDI: 000000000000000d
kernel: RBP: 000001c200000000 R08: ffffffffffffffff R09: ffffffffffffffff
kernel: R10: ffffffffffffffff R11: 0000000000000246 R12: 000001c200000000
kernel: R13: 00007dfbebe00000 R14: 0000000000000000 R15: 0000000000000000
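
For a rough sense of scale, the register values in the trace above appear
consistent with a 64-byte flush granularity (RDX = 0x40) applied to one 4KB
page at a time (RSI = 0x1000). The sketch below is only back-of-the-envelope
arithmetic under those assumptions, reading 380GB as 380 GiB; the helper name
is hypothetical and this is not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of cache-line flushes needed to cover
 * `bytes` of memory when one flush covers `line_size` bytes.
 * Rounds up so that a partial trailing line still counts as one flush.
 */
static uint64_t clflush_count(uint64_t bytes, uint64_t line_size)
{
    assert(line_size != 0);
    return (bytes + line_size - 1) / line_size;
}

/*
 * Example inputs (assumptions taken from the trace, not measured):
 *   guest size : 380ULL << 30   (380 GiB)
 *   line size  : 0x40           (RDX)
 *   page size  : 0x1000         (RSI)
 */
```

Under these assumptions, clflush_count(380ULL << 30, 0x40) comes to
6,375,342,080 flushes, spread over 99,614,720 pages of 64 flushes each.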

The soft lockup is caused by the time spent flushing the caches when pinning
memory for SEV. The flush turns out to be unnecessary on hardware that
enforces cache coherency across encryption domains, so the problem is
resolved by backporting the following commits from Linux 5.10:

e1ebb2b49048 KVM: SVM: Don't flush cache if hardware enforces cache coherency
across encryption domains
(conflict due to the function it fixed being moved in a refactoring in 5.7).

along with 3 other commits needed as dependencies.
(fbd5969d1ff2 avoids a conflict in 5866e9205b47)

fbd5969d1ff2 x86/cpufeatures: Mark two free bits in word 3
5866e9205b47 x86/cpu: Add hardware-enforced cache coherency as a CPUID feature
75d1cc0e05af x86/mm/pat: Don't flush cache if hardware enforces cache coherency across encryption domnains
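
The idea behind these commits can be sketched in a few lines of standalone C.
This is an illustration, not the code from the patches: it assumes the
coherency feature is reported in bit 10 of EAX for CPUID leaf 0x8000001F, and
both function names below are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 10 of CPUID 0x8000001F EAX reports hardware-enforced
 * cache coherency across encryption domains. */
#define SME_COHERENT_BIT 10u

/* Hypothetical helper: decode the feature bit from a raw EAX value. */
static bool hw_enforces_coherency(uint32_t cpuid_8000001f_eax)
{
    return ((cpuid_8000001f_eax >> SME_COHERENT_BIT) & 1u) != 0;
}

/*
 * Hypothetical stand-in for the flush path: returns how many pages would
 * still need a cache flush. When the hardware keeps encrypted and
 * unencrypted mappings coherent, the answer is zero and the expensive
 * per-cache-line loop is skipped entirely.
 */
static size_t pages_to_flush(size_t npages, bool coherent)
{
    if (npages == 0 || coherent)
        return 0;
    return npages;
}
```

With the feature bit set, pages_to_flush() returns 0 regardless of guest
size, which is why the time spent pinning memory no longer grows with the
flush cost.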

Tested by creating guests of various sizes up to 1.8TB, with and without SEV
enabled, running a few benchmarks, and passing kvm-unit-tests.


Borislav Petkov (1):
  x86/cpufeatures: Mark two free bits in word 3

Krish Sadhukhan (3):
  x86/cpu: Add hardware-enforced cache coherency as a CPUID feature
  x86/mm/pat: Don't flush cache if hardware enforces cache coherency
    across encryption domnains
  KVM: SVM: Don't flush cache if hardware enforces cache coherency
    across encryption domains

 arch/x86/include/asm/cpufeatures.h | 2 ++
 arch/x86/kernel/cpu/scattered.c    | 1 +
 arch/x86/kvm/svm.c                 | 3 ++-
 arch/x86/mm/pageattr.c             | 2 +-
 4 files changed, 6 insertions(+), 2 deletions(-)

-- 
2.27.0