lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3cf8b953c449320cc4c085924ef0e2eed5eadcf7.camel@intel.com>
Date:   Wed, 6 Dec 2023 01:28:08 +0000
From:   "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To:     "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "x86@...nel.org" <x86@...nel.org>, "bp@...en8.de" <bp@...en8.de>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>
CC:     "kexec@...ts.infradead.org" <kexec@...ts.infradead.org>,
        "Reshetova, Elena" <elena.reshetova@...el.com>,
        "Nakajima, Jun" <jun.nakajima@...el.com>,
        "rafael@...nel.org" <rafael@...nel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "Huang, Kai" <kai.huang@...el.com>,
        "sathyanarayanan.kuppuswamy@...ux.intel.com" 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        "Hunter, Adrian" <adrian.hunter@...el.com>,
        "thomas.lendacky@....com" <thomas.lendacky@....com>,
        "ashish.kalra@....com" <ashish.kalra@....com>,
        "linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
        "seanjc@...gle.com" <seanjc@...gle.com>,
        "bhe@...hat.com" <bhe@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCHv4 10/14] x86/tdx: Convert shared memory back to private on
 kexec

On Tue, 2023-12-05 at 03:45 +0300, Kirill A. Shutemov wrote: 
> +static void tdx_kexec_unshare_mem(bool crash)
> +{
> +       unsigned long addr, end;
> +       long found = 0, shared;
> +
> +       /* Stop new private<->shared conversions */
> +       conversion_allowed = false;

I wonder if this might need a compiler barrier here to be totally safe.
I'm not sure.

> +
> +       /*
> +        * Crash kernel reaches here with interrupts disabled: can't
> wait for
> +        * conversions to finish.
> +        *
> +        * If race happened, just report and proceed.
> +        */
> +       if (!crash) {
> +               unsigned long timeout;
> +
> +               /*
> +                * Wait for in-flight conversions to complete.
> +                *
> +                * Do not wait more than 30 seconds.
> +                */
> +               timeout = 30 * USEC_PER_SEC;
> +               while (atomic_read(&conversions_in_progress) &&
> timeout--)
> +                       udelay(1);
> +       }
> +
> +       if (atomic_read(&conversions_in_progress))
> +               pr_warn("Failed to finish shared<->private
> conversions\n");

I can't think of any non-ridiculous way to handle this case. Maybe we
need VMM help.

> 
> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> index 830425e6d38e..c81afffaa954 100644
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -12,6 +12,7 @@
>  #include <linux/delay.h>
>  #include <linux/objtool.h>
>  #include <linux/pgtable.h>
> +#include <linux/kexec.h>
>  #include <acpi/reboot.h>
>  #include <asm/io.h>
>  #include <asm/apic.h>
> @@ -31,6 +32,7 @@
>  #include <asm/realmode.h>
>  #include <asm/x86_init.h>
>  #include <asm/efi.h>
> +#include <asm/tdx.h>
>  
>  /*
>   * Power off function, if any
> @@ -716,6 +718,14 @@ static void
> native_machine_emergency_restart(void)
>  
>  void native_machine_shutdown(void)
>  {
> +       /*
> +        * Call enc_kexec_unshare_mem() while all CPUs are still
> active and
> +        * interrupts are enabled. This will allow all in-flight
> memory
> +        * conversions to finish cleanly before unsharing all memory.
> +        */
> +       if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) &&
> kexec_in_progress)
> +               x86_platform.guest.enc_kexec_unshare_mem(false);

These questions are coming from an incomplete understanding of the
kexec/reboot operation. Please disregard if it is not helpful.

By doing this while other tasks can still run, it handles the
conversion races in the !crash case. But then it sets shared pages to
NP. What happens if another active task tries to write to one?

I guess we rely on the kernel_restart_prepare()->device_shutdown() to
clean up, which runs before native_machine_shutdown(). So there might
be conversions in progress when tdx_kexec_unshare_mem() is called, from
the allocator work queues. But the actual memory won't be accessed
during that operation.

But the console must be active? Or otherwise who can see these
warnings. It doesn't use a shared page? Or the KVM clock, which looks
to clean up at cpu tear down, which now happens after
tdx_kexec_unshare_mem()? So I wonder if there might be cases.

If so, maybe you could halt the conversions in
native_machine_shutdown(), then do the actual reset to private after
tasks can't schedule. I'd still wonder about if anything might try to
access a shared page triggered by the console output.


> +
>         /* Stop the cpus and apics */
>  #ifdef CONFIG_X86_IO_APIC
>         /*

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ