Date:   Wed, 30 Mar 2022 09:52:37 -0700
From:   Mingwei Zhang <mizhang@...gle.com>
To:     Peter Gonda <pgonda@...gle.com>
Cc:     kvm <kvm@...r.kernel.org>, Sean Christopherson <seanjc@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] KVM: SEV: Add cond_resched() to loop in sev_clflush_pages()

On Wed, Mar 30, 2022 at 9:43 AM Peter Gonda <pgonda@...gle.com> wrote:
>
> Add cond_resched() to avoid a need_resched warning from sev_clflush_pages()
> when it is called with a large number of pages.
>
> Signed-off-by: Peter Gonda <pgonda@...gle.com>
> Cc: Sean Christopherson <seanjc@...gle.com>
> Cc: kvm@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org
>
> ---
> Here is a warning similar to what I've seen many times running large SEV
> VMs:
> [  357.714051] CPU 15: need_resched set for > 52000222 ns (52 ticks) without schedule
> [  357.721623] WARNING: CPU: 15 PID: 35848 at kernel/sched/core.c:3733 scheduler_tick+0x2f9/0x3f0
> [  357.730222] Modules linked in: kvm_amd uhaul vfat fat hdi2_standard_ftl hdi2_megablocks hdi2_pmc hdi2_pmc_eeprom hdi2 stg elephant_dev_num ccp i2c_mux_ltc4306 i2c_mux i2c_via_ipmi i2c_piix4 google_bmc_usb google_bmc_gpioi2c_mb_common google_bmc_mailbox cdc_acm xhci_pci xhci_hcd sha3_generic gq nv_p2p_glue accel_class
> [  357.758261] CPU: 15 PID: 35848 Comm: switchto-defaul Not tainted 4.15.0-smp-DEV #11
> [  357.765912] Hardware name: Google, Inc.                                                       Arcadia_IT_80/Arcadia_IT_80, BIOS 30.20.2-gce 11/05/2021
> [  357.779372] RIP: 0010:scheduler_tick+0x2f9/0x3f0
> [  357.783988] RSP: 0018:ffff98558d1c3dd8 EFLAGS: 00010046
> [  357.789207] RAX: 741f23206aa8dc00 RBX: 0000005349236a42 RCX: 0000000000000007
> [  357.796339] RDX: 0000000000000006 RSI: 0000000000000002 RDI: ffff98558d1d5a98
> [  357.803463] RBP: ffff98558d1c3ea0 R08: 0000000000100ceb R09: 0000000000000000
> [  357.810597] R10: ffff98558c958c00 R11: ffffffff94850740 R12: 00000000031975de
> [  357.817729] R13: 0000000000000000 R14: ffff98558d1e2640 R15: ffff98525739ea40
> [  357.824862] FS:  00007f87503eb700(0000) GS:ffff98558d1c0000(0000) knlGS:0000000000000000
> [  357.832948] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  357.838695] CR2: 00005572fe74b080 CR3: 0000007bea706006 CR4: 0000000000360ef0
> [  357.845828] Call Trace:
> [  357.848277]  <IRQ>
> [  357.850294]  [<ffffffff94411420>] ? tick_setup_sched_timer+0x130/0x130
> [  357.856818]  [<ffffffff943ed60d>] ? rcu_sched_clock_irq+0x6ed/0x850
> [  357.863084]  [<ffffffff943fdf02>] ? __run_timers+0x42/0x260
> [  357.868654]  [<ffffffff94411420>] ? tick_setup_sched_timer+0x130/0x130
> [  357.875182]  [<ffffffff943fd35b>] update_process_times+0x7b/0x90
> [  357.881188]  [<ffffffff944114a2>] tick_sched_timer+0x82/0xd0
> [  357.886845]  [<ffffffff94400671>] __run_hrtimer+0x81/0x200
> [  357.892331]  [<ffffffff943ff222>] hrtimer_interrupt+0x192/0x450
> [  357.898252]  [<ffffffff950002fa>] ? __do_softirq+0x2fa/0x33e
> [  357.903911]  [<ffffffff94e02edc>] smp_apic_timer_interrupt+0xac/0x1d0
> [  357.910349]  [<ffffffff94e01ef6>] apic_timer_interrupt+0x86/0x90
> [  357.916347]  </IRQ>
> [  357.918452] RIP: 0010:clflush_cache_range+0x3f/0x50
> [  357.923324] RSP: 0018:ffff98529af89cc0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12
> [  357.930889] RAX: 0000000000000040 RBX: 0000000000038135 RCX: ffff985233d36000
> [  357.938013] RDX: ffff985233d36000 RSI: 0000000000001000 RDI: ffff985233d35000
> [  357.945145] RBP: ffff98529af89cc0 R08: 0000000000000001 R09: ffffb5753fb23000
> [  357.952271] R10: 000000000003fe00 R11: 0000000000000008 R12: 0000000000040000
> [  357.959401] R13: ffff98525739ea40 R14: ffffb5753fb22000 R15: ffff98532a58dd80
> [  357.966536]  [<ffffffffc07afd41>] svm_register_enc_region+0xd1/0x170 [kvm_amd]
> [  357.973758]  [<ffffffff94246e8c>] kvm_arch_vm_ioctl+0x84c/0xb00
> [  357.979677]  [<ffffffff9455980f>] ? handle_mm_fault+0x6ff/0x1370
> [  357.985683]  [<ffffffff9423412b>] kvm_vm_ioctl+0x69b/0x720
> [  357.991167]  [<ffffffff945dfd9d>] do_vfs_ioctl+0x47d/0x680
> [  357.996654]  [<ffffffff945e0188>] SyS_ioctl+0x68/0x90
> [  358.001706]  [<ffffffff942066f1>] do_syscall_64+0x71/0x110
> [  358.007192]  [<ffffffff94e00081>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
>
> Tested by running a large 256 GiB SEV VM several times; no warnings were seen.
> Without the change, the warnings appear.
>
> ---
>  arch/x86/kvm/svm/sev.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 75fa6dd268f0..c2fe89ecdb2d 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -465,6 +465,7 @@ static void sev_clflush_pages(struct page *pages[], unsigned long npages)
>                 page_virtual = kmap_atomic(pages[i]);
>                 clflush_cache_range(page_virtual, PAGE_SIZE);
>                 kunmap_atomic(page_virtual);
> +               cond_resched();

If you add cond_resched() here, it gets called once per 4K page, which
might be too frequent. You may want to call it once per X pages instead,
where X could be something like 1G/4K?
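
To make the suggestion concrete, here is a rough userspace sketch of the
batching idea. It is not a tested kernel patch. The stub functions and the
RESCHED_BATCH name are made up for illustration; in sev_clflush_pages() the
stubs would correspond to the existing kmap_atomic()/clflush_cache_range()/
kunmap_atomic() sequence and to cond_resched(). The 1G/4K batch size is only
the example value from above, not a measured one.

```c
#include <stdint.h>

#define PAGE_SIZE      UINT64_C(4096)
/* Hypothetical batch size: 1 GiB worth of 4 KiB pages (262144 pages). */
#define RESCHED_BATCH  ((UINT64_C(1) << 30) / PAGE_SIZE)

static uint64_t resched_calls;

/* Userspace stand-in for the kernel's cond_resched(); it only counts calls. */
static void cond_resched_stub(void)
{
	resched_calls++;
}

/* Stand-in for mapping, flushing and unmapping a single page. */
static void flush_one_page_stub(uint64_t idx)
{
	(void)idx;
}

/*
 * Flush npages pages, offering to reschedule once per RESCHED_BATCH pages
 * instead of once per page. Returns how many times the reschedule point
 * was reached.
 */
static uint64_t flush_pages_batched(uint64_t npages)
{
	uint64_t i;

	resched_calls = 0;
	for (i = 0; i < npages; i++) {
		flush_one_page_stub(i);
		/* Reschedule point after every full batch of pages. */
		if ((i + 1) % RESCHED_BATCH == 0)
			cond_resched_stub();
	}
	return resched_calls;
}
```

With this batch size, a 256 GiB guest (67108864 pages) would reach the
reschedule point 256 times rather than once per page. Whether 1 GiB between
reschedule points is short enough to avoid the warning would need to be
checked on real hardware.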

>         }
>  }
>
> --
> 2.35.1.1094.g7c7d902a7c-goog
>
