linux-kernel - Re: [PATCH 6/5] x86/fault: Clean up the page fault oops decoder a bit

Message-ID: <CALCETrW00oTpXOONwbONHHuiyqp9QbNMe0gvVgf8X3_X0fidqw@mail.gmail.com>
Date:   Tue, 4 Dec 2018 11:22:25 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     "Christopherson, Sean J" <sean.j.christopherson@...el.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Andrew Lutomirski <luto@...nel.org>, X86 ML <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Yu-cheng Yu <yu-cheng.yu@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH 6/5] x86/fault: Clean up the page fault oops decoder a bit

On Tue, Nov 27, 2018 at 7:32 AM Sean Christopherson
<sean.j.christopherson@...el.com> wrote:
>
> On Thu, Nov 22, 2018 at 09:41:19AM +0100, Ingo Molnar wrote:
> >
> > * Andy Lutomirski <luto@...nel.org> wrote:
> >
> > > One of Linus' favorite hobbies seems to be looking at OOPSes and
> > > decoding the error code in his head.  This is not one of my favorite
> > > hobbies :)
> > >
> > > Teach the page fault OOPS handler to decode the error code.  If it's
> > > a !USER fault from user mode, print an explicit note to that effect
> > > and print out the addresses of various tables that might cause such
> > > an error.
> > >
> > > With this patch applied, if I intentionally point the LDT at 0x0 and
> > > run the x86 selftests, I get:
> > >
> > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> > > HW error: normal kernel read fault
> > > This was a system access from user code
> > > IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
> > > LDTR: 0x50 -- base=0x0 limit=0xfff7
> > > TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
> > > PGD 800000000456e067 P4D 800000000456e067 PUD 4623067 PMD 0
> > > SMP PTI
> > > CPU: 0 PID: 153 Comm: ldt_gdt_64 Not tainted 4.19.0+ #1317
> > > Hardware name: ...
> > > RIP: 0033:0x401454
> >
> > I've applied your series, with one small edit, the following message:
> >
> >   > HW error: normal kernel read fault
> >
> > will IMHO confuse the heck out of users, thinking that their hardware is
> > broken...
> >
> > Yes, the message is accurate, in MM pagefault language it's indeed the HW
> > error code, but it's a language very few people speak.
> >
> > So I edited it over to say '#PF error code'. I also applied a few other
> > minor cleanups - see the changelog below.
>
> I responded to the original thread a hair too late...
>
> What about something like this instead of manually handling the case
> where error_code==0 so that we get e.g. "[KERNEL] [READ]" instead of
> "normal kernel read fault"?  Getting "[PROT] [KERNEL] [READ]" seems
> useful.
>
> IMO "[normal kernel read fault]" followed by "This was a system access
> from user code" is still confusing.
>
> ---
> 8b29ee4351d5c625aa9ca2765f8da5e Mon Sep 17 00:00:00 2001
> From: Sean Christopherson <sean.j.christopherson@...el.com>
> Date: Tue, 27 Nov 2018 07:09:57 -0800
> Subject: [PATCH] x86/fault: Print "KERNEL" and "READ" for #PF error codes
>
> ...and explicitly state that it's a "code" that's being printed.
>
> Cc: Andy Lutomirski <luto@...nel.org>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: Dave Hansen <dave.hansen@...ux.intel.com>
> Cc: H. Peter Anvin <hpa@...or.com>
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Rik van Riel <riel@...riel.com>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Yu-cheng Yu <yu-cheng.yu@...el.com>
> Cc: linux-kernel@...r.kernel.org
> Cc: Ingo Molnar <mingo@...nel.org>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
> ---
>  arch/x86/mm/fault.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 2ff25ad33233..510e263c256b 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -660,8 +660,10 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
>         err_str_append(error_code, err_txt, X86_PF_RSVD,  "[RSVD]" );
>         err_str_append(error_code, err_txt, X86_PF_INSTR, "[INSTR]");
>         err_str_append(error_code, err_txt, X86_PF_PK,    "[PK]"   );
> -
> -       pr_alert("#PF error: %s\n", error_code ? err_txt : "[normal kernel read fault]");
> +       err_str_append(~error_code, err_txt, X86_PF_USER, "[KERNEL]");
> +       err_str_append(~error_code, err_txt, X86_PF_WRITE | X86_PF_INSTR,
> +                                                         "[READ]");
> +       pr_alert("#PF error code: %s\n", err_txt);
>

Seems generally nice, but I would suggest making the bit-not-set name
be another parameter to err_str_append().  I'm also slightly uneasy
about making "KERNEL" look like a bit, but I guess it doesn't bother
me too much.

Want to send a real patch?
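
For illustration only, here is a rough userspace sketch of the suggestion above: err_str_append() takes a second string that is used when the bit is not set, so callers no longer pass ~error_code. This is not code from the thread. The body of err_str_append(), the extra len and clear_txt parameters, the decode_pf_error() wrapper, and the "[WRITE]" and "[USER]" strings are all assumptions; only "[RSVD]", "[INSTR]", "[PK]", "[PROT]", "[KERNEL]" and "[READ]" appear in the quoted patch and discussion. The sketch also treats "[READ]" as "neither WRITE nor INSTR is set", which is one possible reading of the intent.

```c
#include <stdio.h>
#include <string.h>

/* x86 #PF error code bits (architectural bit positions). */
#define X86_PF_PROT	(1UL << 0)
#define X86_PF_WRITE	(1UL << 1)
#define X86_PF_USER	(1UL << 2)
#define X86_PF_RSVD	(1UL << 3)
#define X86_PF_INSTR	(1UL << 4)
#define X86_PF_PK	(1UL << 5)

/*
 * Hypothetical variant of err_str_append(): append set_txt if any bit in
 * mask is set in error_code, otherwise append clear_txt.  Either string
 * may be NULL, meaning "print nothing in that case".
 */
static void err_str_append(unsigned long error_code, char *buf, size_t len,
			   unsigned long mask, const char *set_txt,
			   const char *clear_txt)
{
	const char *txt = (error_code & mask) ? set_txt : clear_txt;
	size_t used;

	if (!txt)
		return;

	used = strlen(buf);
	if (used >= len)
		return;

	/* Separate entries with a single space; snprintf() truncates safely. */
	snprintf(buf + used, len - used, "%s%s", used ? " " : "", txt);
}

/* Hypothetical wrapper that decodes a whole error code into buf. */
static void decode_pf_error(unsigned long error_code, char *buf, size_t len)
{
	if (!len)
		return;
	buf[0] = '\0';

	err_str_append(error_code, buf, len, X86_PF_PROT,  "[PROT]",  NULL);
	err_str_append(error_code, buf, len, X86_PF_WRITE, "[WRITE]", NULL);
	err_str_append(error_code, buf, len, X86_PF_USER,  "[USER]",  "[KERNEL]");
	err_str_append(error_code, buf, len, X86_PF_RSVD,  "[RSVD]",  NULL);
	err_str_append(error_code, buf, len, X86_PF_INSTR, "[INSTR]", NULL);
	err_str_append(error_code, buf, len, X86_PF_PK,    "[PK]",    NULL);

	/* A read is reported only when neither WRITE nor INSTR is set. */
	err_str_append(error_code, buf, len, X86_PF_WRITE | X86_PF_INSTR,
		       NULL, "[READ]");
}
```

With this shape, decode_pf_error(0, buf, sizeof(buf)) produces "[KERNEL] [READ]" and an error code of X86_PF_PROT produces "[PROT] [KERNEL] [READ]", matching the output discussed in the quoted mail. A kernel write fault (X86_PF_WRITE alone) produces "[WRITE] [KERNEL]" with no "[READ]".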