Message-ID: <334ae44077315e2b69529b6fef8d85ec55f80ecf.camel@infradead.org>
Date: Mon, 25 Nov 2024 09:32:14 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Ingo Molnar <mingo@...nel.org>
Cc: kexec@...ts.infradead.org, Thomas Gleixner <tglx@...utronix.de>, Ingo
 Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, Dave Hansen
 <dave.hansen@...ux.intel.com>, x86@...nel.org, "H. Peter Anvin"
 <hpa@...or.com>,  "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
 Kai Huang <kai.huang@...el.com>, Nikolay Borisov <nik.borisov@...e.com>,
 linux-kernel@...r.kernel.org, Simon Horman <horms@...nel.org>, Dave Young
 <dyoung@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
 jpoimboe@...nel.org
Subject: Re: [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG

On Mon, 2024-11-25 at 10:21 +0100, Ingo Molnar wrote:
> 
> * David Woodhouse <dwmw2@...radead.org> wrote:
> 
> > From: David Woodhouse <dwmw@...zon.co.uk>
> > 
> > Signed-off-by: David Woodhouse <dwmw@...zon.co.uk>
> > ---
> >  arch/x86/kernel/relocate_kernel_64.S | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
> > index 67f6853c7abe..ebbd76c9a3e9 100644
> > --- a/arch/x86/kernel/relocate_kernel_64.S
> > +++ b/arch/x86/kernel/relocate_kernel_64.S
> > @@ -14,6 +14,8 @@
> >  #include <asm/nospec-branch.h>
> >  #include <asm/unwind_hints.h>
> >  
> > +#define DEBUG
> > +
> >  /*
> >   * Must be relocatable PIC code callable as a C function, in particular
> >   * there must be a plain RET and not jump to return thunk.
> > @@ -191,6 +193,8 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
> >  	pushw	$0xff
> >  	lidt	(%rsp)
> >  	addq	$10, %rsp
> > +
> > +	int3
> >  #endif /* DEBUG */
> 
> That's a really nice piece of debugging code written in assembly, 
> combined with the exception handling feature that generates debug 
> output to begin with. Epic effort. :-)

Thanks :)

> Just curious: did you write this code to debug the series, or was there 
> some original hair-tearing regression that motivated you? Is there an 
> upstream fix to marvel at and be horrified about in equal measure?

https://lore.kernel.org/all/2ab14f6f-2690-056b-cf9e-38a12dafd728@amd.com/t/#u
is the upstream fix. It's all the more horrifying because it was
already *fixed* upstream before I lost weeks of my life to chasing it.
And the trigger which actually made it *happen*, and made our
production systems allocate memory within that dangerous 1MiB region
adjacent to the RMP table, was a tweak to the NMI watchdog period...
leading to an assumption that we were getting stray perf NMIs during
the kexec, and a *long* wild goose chase based on that false
assumption...

Once I'd written the debug code, I just wanted to clean it up a bit and
push it out for the benefit of others; that *was* the main point of
this series. All the rest of the cleanups are just yak shaving.

The realisation that we never even explicitly mapped the control code
page and always just got lucky because it happened to be in the same
2MiB or 1GiB superpage as something else that we did map... was just a
bonus :)

(That one is fixed in v3 which I'll post shortly, and is already in 
https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/kexec-debug
)
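
To illustrate the "got lucky" case above: a page that is never explicitly
mapped is still reachable if it falls inside a 2MiB or 1GiB superpage that was
mapped for some other address. A minimal sketch of that arithmetic follows. It
is not code from the series; the helper name same_superpage() and the SZ_2M
and SZ_1G macros are hypothetical and defined here only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative superpage sizes (hypothetical local definitions). */
#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/*
 * Return true if physical addresses a and b fall within the same
 * naturally aligned superpage of the given size. If they do, a
 * superpage mapping created for one of them also covers the other,
 * even though the other was never mapped explicitly.
 */
static bool same_superpage(uint64_t a, uint64_t b, uint64_t size)
{
	/* size must be a non-zero power of two for the mask to be valid */
	assert(size != 0 && (size & (size - 1)) == 0);

	return (a & ~(size - 1)) == (b & ~(size - 1));
}
```

For example, with made-up addresses, same_superpage(0x40201000, 0x403ff000,
SZ_2M) is true, while same_superpage(0x40201000, 0x40400000, SZ_2M) is false
and only becomes true again with SZ_1G. Whether the control code page ends up
covered therefore depends on where it happens to be allocated and on the
superpage size in use.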

> I'd argue that this debugging code probably needs a default-off Kconfig 
> option, even with the obvious hard-coded environmental limitations & 
> assumptions it has. Could be useful for very early debugging & would 
> preserve your effort without it bitrotting too obviously.

Yeah. In v3 I've made it a config option, and made it use the
early_printk serial console (as long as that's an I/O based 8250; we
can add others too later).
