Date:   Wed, 10 Aug 2022 01:18:30 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Borislav Petkov <bp@...en8.de>, ira.weiny@...el.com
Cc:     Rik van Riel <riel@...riel.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, kernel-team@...com,
        Frederic Weisbecker <frederic@...nel.org>,
        Juergen Gross <jgross@...e.com>,
        Mark Rutland <mark.rutland@....com>,
        Andrew Cooper <andrew.cooper3@...rix.com>
Subject: Re: [RFC PATCH 1/5] entry: Pass pt_regs to irqentry_exit_cond_resched()

On Mon, Aug 08 2022 at 12:38, Borislav Petkov wrote:
> On Fri, Aug 05, 2022 at 10:30:05AM -0700, ira.weiny@...el.com wrote:
>> ---
>>  arch/arm64/include/asm/preempt.h |  2 +-
>>  arch/arm64/kernel/entry-common.c |  4 ++--
>>  arch/x86/entry/common.c          |  2 +-
>>  include/linux/entry-common.h     | 17 ++++++++------
>>  kernel/entry/common.c            | 13 +++++++----
>>  kernel/sched/core.c              | 40 ++++++++++++++++----------------
>>  6 files changed, 43 insertions(+), 35 deletions(-)
>
> Why all this churn?
>
> Why can't you add a parameter to irqentry_exit():
>
>   noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state, bool cond_resched);
>
> and then have all callers except xen_pv_evtchn_do_upcall() pass in false
> and this way have all exit paths end up in irqentry_exit()?
>
> And, ofc, move the true case which is the body of
> raw_irqentry_exit_cond_resched() to irqentry_exit() and then get rid of
> former.
>
> xen_pv_evtchn_do_upcall() will, ofc, do:
>
>         if (inhcall && !WARN_ON_ONCE(state.exit_rcu)) {
>                 irqentry_exit(regs, state, true);
>                 instrumentation_end();
>                 restore_inhcall(inhcall);
>         } else {
>                 instrumentation_end();
>                 irqentry_exit(regs, state, false);
>

How is that less churn? Care to do:

    git grep 'irqentry_exit(' arch/

The real question is:

    Why is it required that irqentry_exit_cond_resched() handles any of
    the auxiliary pt regs space?
    
That's completely unanswered by the changelog and absolutely irrelevant
for the problem at hand, i.e. storing the CPU number on irq/exception
entry.

    So why is this patch in this series at all?

But even for future purposes it is more than questionable. Why?

Contrary to the claim of the changelog xen_pv_evtchn_do_upcall() is not
really a bypass of irqentry_exit(). The point is:

The hypercall is issued by the kernel from privcmd_ioctl_hypercall()
which does:

      xen_preemptible_hcall_begin();
      hypercall();
      xen_preemptible_hcall_end();

So any upcall from the hypervisor to the guest will semantically hit
regular interrupt enabled guest kernel space which means that if that
code would call irqentry_exit() then it would run through the kernel
exit code path:

	} else if (!regs_irqs_disabled(regs)) {

		instrumentation_begin();
		if (IS_ENABLED(CONFIG_PREEMPTION))
			irqentry_exit_cond_resched();

		/* Covers both tracing and lockdep */
		trace_hardirqs_on();
		instrumentation_end();
	} ....

Which would fail to invoke irqentry_exit_cond_resched() if
CONFIG_PREEMPTION=n.  But the whole point of this exercise is to allow
preemption from the upcall even when the kernel has CONFIG_PREEMPTION=n.
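
To make the difference concrete, here is a minimal userspace sketch of the two exit paths. It is not kernel code: struct sim_state and every sim_*() function are hypothetical stand-ins invented for illustration, and the only thing modelled is the gating on CONFIG_PREEMPTION described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant kernel state. */
struct sim_state {
	bool config_preemption;	/* models CONFIG_PREEMPTION=y/n */
	bool need_resched;	/* models a pending reschedule request */
	int resched_calls;	/* number of simulated reschedules */
};

/* Models the body of the conditional reschedule. */
void sim_cond_resched(struct sim_state *s)
{
	assert(s);
	if (s->need_resched) {
		s->resched_calls++;
		s->need_resched = false;
	}
}

/* Models the generic kernel exit path: gated on CONFIG_PREEMPTION. */
void sim_generic_exit(struct sim_state *s)
{
	if (s->config_preemption)
		sim_cond_resched(s);
}

/*
 * Models the Xen PV upcall exit: inside a preemptible hypercall the
 * reschedule check runs regardless of CONFIG_PREEMPTION.
 */
void sim_xenpv_exit(struct sim_state *s, bool inhcall)
{
	if (inhcall)
		sim_cond_resched(s);
	else
		sim_generic_exit(s);
}
```

With config_preemption set to false and need_resched set to true, sim_generic_exit() leaves the request pending, while sim_xenpv_exit() with inhcall == true services it.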

Staring at this XENPV code there are two issues:

  1) That code lacks a trace_hardirqs_on() after the call to
     irqentry_exit_cond_resched(). My bad.

  2) CONFIG_PREEMPT_DYNAMIC broke this mechanism.

     If the static call/key is disabled then the call becomes a NOP.

     Frederic?

Both clearly demonstrate how well tested this XEN_PV muck is which means
we should just delete it.
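
Issue 2) can be sketched in userspace as well. This is only a rough model under stated assumptions: the static call is represented by a plain function pointer, and all sim_*() names are hypothetical. It shows why going through the patchable call site loses the reschedule once that site is turned into a NOP, and why calling the raw function directly does not.

```c
#include <assert.h>
#include <stdbool.h>

typedef void (*sim_resched_fn)(int *counter);

/* Models raw_irqentry_exit_cond_resched(): the real work. */
void sim_raw_cond_resched(int *counter)
{
	assert(counter);
	(*counter)++;
}

/* Models a static call site that has been patched to a NOP. */
void sim_nop(int *counter)
{
	(void)counter;
}

/* Models the patchable call target; a function pointer is an assumption. */
sim_resched_fn sim_static_call = sim_raw_cond_resched;

/* Models switching the dynamic preemption mode at runtime. */
void sim_set_dynamic_preempt(bool enabled)
{
	sim_static_call = enabled ? sim_raw_cond_resched : sim_nop;
}

/* Upcall exit that goes through the patchable call site. */
void sim_upcall_exit_static(int *counter)
{
	sim_static_call(counter);
}

/* Upcall exit that calls the raw function directly. */
void sim_upcall_exit_raw(int *counter)
{
	sim_raw_cond_resched(counter);
}
```

After sim_set_dynamic_preempt(false), sim_upcall_exit_static() does nothing, whereas sim_upcall_exit_raw() still performs the reschedule check.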

Anyway. This wants the fix below.

If there is still a need to do anything about this XEN cond_resched()
muck for the PREEMPTION=n case for future auxregs usage then this can be
simply hidden in this new XEN helper and does not need any change to the
rest of the code.

I doubt that this is required, but if so then there needs to be a very
concise explanation and not just incomprehensible hand waving word
salad.

Thanks,

        tglx
---
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -283,9 +283,18 @@ static __always_inline void restore_inhc
 {
 	__this_cpu_write(xen_in_preemptible_hcall, inhcall);
 }
+
+static __always_inline void xenpv_irqentry_exit_cond_resched(void)
+{
+	instrumentation_begin();
+	raw_irqentry_exit_cond_resched();
+	trace_hardirqs_on();
+	instrumentation_end();
+}
 #else
 static __always_inline bool get_and_clear_inhcall(void) { return false; }
 static __always_inline void restore_inhcall(bool inhcall) { }
+static __always_inline void xenpv_irqentry_exit_cond_resched(void) { }
 #endif
 
 static void __xen_pv_evtchn_do_upcall(struct pt_regs *regs)
@@ -306,11 +315,11 @@ static void __xen_pv_evtchn_do_upcall(st
 
 	instrumentation_begin();
 	run_sysvec_on_irqstack_cond(__xen_pv_evtchn_do_upcall, regs);
+	instrumentation_end();
 
 	inhcall = get_and_clear_inhcall();
 	if (inhcall && !WARN_ON_ONCE(state.exit_rcu)) {
-		irqentry_exit_cond_resched();
-		instrumentation_end();
+		xenpv_irqentry_exit_cond_resched();
 		restore_inhcall(inhcall);
 	} else {
 		instrumentation_end();
