linux-kernel - Re: [tip: x86/entry] x86/entry: Treat BUG/WARN as NMI-like entries

Message-ID: <CALCETrXisDDMb_eaPDq1DWrMuSqo1hDrOd14u7fSR4J_RxJu_A@mail.gmail.com>
Date:   Mon, 15 Jun 2020 15:46:00 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Andy Lutomirski <luto@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-tip-commits@...r.kernel.org,
        Thomas Gleixner <tglx@...utronix.de>, x86 <x86@...nel.org>
Subject: Re: [tip: x86/entry] x86/entry: Treat BUG/WARN as NMI-like entries

On Mon, Jun 15, 2020 at 3:23 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Mon, Jun 15, 2020 at 02:08:16PM -0700, Andy Lutomirski wrote:
>
> > > All !user exceptions really should be NMI-like. If you want to go
> > > overboard, I suppose you can look at IF and have them behave interrupt
> > > like when set, but why make things complicated.
> >
> > This entire rabbit hole opened because of #PF. So we at least need the
> > set of exceptions that are permitted to schedule if they came from
> > kernel mode to remain schedulable.
>
> What exception, other than #PF, actually needs to schedule from kernel?
>
> > Prior to the giant changes, all the non-IST *exceptions*, but not the
> > interrupts, were schedulable from kernel mode, assuming the original
> > context could schedule. Right now, interrupts can schedule, too, which
> > is nice if we ever want to fully clean up the Xen abomination. I
> > suppose we could make it so #PF opts in to special treatment again,
> > but we should decide that the result is simpler or otherwise better
> > before we do this.
> >
> > One possible justification would be that the schedulable entry variant
> > is more complicated, and most kernel exceptions except the ones with
> > fixups are bad news, and we want the oopses to succeed. But page
> > faults are probably the most common source of oopses, so this is a bit
> > weak, and we really want page faults to work even from nasty contexts.
>
> I think I'd prefer the argument of consistent failure.
>
> Do we ever want #UD to schedule? If not, then why allow it to sometimes
> schedule and sometimes fail, better to always fail.
>
> #DB is still a giant trainwreck in this regard as well.
>
> Something like this...
>
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -216,10 +216,25 @@ static inline void handle_invalid_op(str
>                       ILL_ILLOPN, error_get_trap_addr(regs));
>  }
>
> -DEFINE_IDTENTRY_RAW(exc_invalid_op)
> +static void handle_invalid_op_kernel(struct pt_regs *regs)
> +{
> +       if (is_valid_bugaddr(regs->ip) &&
> +           report_bug(regs->ip, regs) == BUG_TRAP_TYPE_WARN) {
> +               /* Skip the ud2. */
> +               regs->ip += LEN_UD2;
> +               return;
> +       }
> +
> +       handle_invalid_op(regs);
> +}
> +
> +static void handle_invalid_op_user(struct pt_regs *regs)
>  {
> -       bool rcu_exit;
> +       handle_invalid_op(regs);
> +}
>
> +DEFINE_IDTENTRY_RAW(exc_invalid_op)
> +{

Meh, I guess I'm okay with this.

In some sense, #UD and #PF are fundamentally different.  #PF wants to
be able to schedule in the kernel.  #UD wants to be as minimal as
possible in the kernel but probably still wants to do the nmi_enter()
dance in case it's an RCU warning and the warning handler code wants
to use RCU.

One solution would be to get rid of ud2 for warnings and replace it
with CALL warning_thunk :)  But I guess I'm okay with your patch.

--Andy
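
Editorial sketch (not part of the original message): the control flow discussed above can be modeled in plain userspace C. This is a toy under stated assumptions, not kernel code, and it does not reconstruct the truncated body of exc_invalid_op from the quoted patch. Every `sim_`-prefixed name is hypothetical, the "bug table" is two hard-coded addresses, and the NMI-like bracketing is reduced to a depth counter standing in for the nmi_enter()/nmi_exit() pair mentioned in the reply.

```c
#include <assert.h>
#include <stdbool.h>

/* ud2 is the two-byte opcode 0f 0b, so skipping it advances ip by 2. */
#define SIM_LEN_UD2 2

enum sim_bug_type { SIM_BUG_NONE, SIM_BUG_WARN, SIM_BUG_BUG };

/* Hypothetical stand-in for the saved register state. */
struct sim_regs {
    unsigned long ip;
    bool from_user;
};

static int sim_nmi_depth;      /* models being inside an NMI-like section */
static int sim_generic_calls;  /* times the generic #UD path was taken */

/* Assumption: pretend 0x1000 is a WARN site and 0x2000 is a BUG site. */
static enum sim_bug_type sim_report_bug(unsigned long ip)
{
    if (ip == 0x1000)
        return SIM_BUG_WARN;
    if (ip == 0x2000)
        return SIM_BUG_BUG;
    return SIM_BUG_NONE;
}

/* Generic path: in this model it only counts that it was reached. */
static void sim_handle_invalid_op(struct sim_regs *regs)
{
    (void)regs;
    sim_generic_calls++;
}

/* Kernel-mode #UD: a WARN skips the ud2 and resumes; anything else falls through. */
static void sim_handle_invalid_op_kernel(struct sim_regs *regs)
{
    if (sim_report_bug(regs->ip) == SIM_BUG_WARN) {
        regs->ip += SIM_LEN_UD2;
        return;
    }
    sim_handle_invalid_op(regs);
}

/* Dispatch: user-mode entries take the ordinary path, kernel-mode
 * entries are always bracketed as NMI-like, whatever the outcome. */
void sim_exc_invalid_op(struct sim_regs *regs)
{
    if (regs->from_user) {
        sim_handle_invalid_op(regs);
        return;
    }

    sim_nmi_depth++;                 /* stand-in for nmi_enter() */
    sim_handle_invalid_op_kernel(regs);
    sim_nmi_depth--;                 /* stand-in for nmi_exit() */
    assert(sim_nmi_depth >= 0);      /* enter/exit must stay balanced */
}
```

Calling sim_exc_invalid_op() on a kernel-mode entry at the pretend WARN address advances ip past the ud2 without reaching the generic path, while the same address from user mode leaves ip untouched and goes to the generic handler. The point of the model is the "consistent failure" argument: the kernel-mode path never has a variant that may schedule.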