lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFA6WYOCz2uFU81YTA+GpWhq3k=bfjBKJhKST1VjnjW2GTVq3A@mail.gmail.com>
Date:   Mon, 9 May 2022 13:25:21 +0530
From:   Sumit Garg <sumit.garg@...aro.org>
To:     Daniel Thompson <daniel.thompson@...aro.org>
Cc:     linux-arm-kernel@...ts.infradead.org, dianders@...omium.org,
        will@...nel.org, liwei391@...wei.com, catalin.marinas@....com,
        mark.rutland@....com, mhiramat@...nel.org,
        jason.wessel@...driver.com, maz@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] arm64: entry: Skip single stepping interrupt handlers

Hi Daniel,

On Fri, 6 May 2022 at 21:36, Daniel Thompson <daniel.thompson@...aro.org> wrote:
>
> On Wed, Apr 13, 2022 at 12:24:57PM +0530, Sumit Garg wrote:
> > Current implementation allows single stepping into interrupt handlers
> > for interrupts that were received during single stepping. But interrupt
> > handlers aren't something that the user expect to debug. Moreover single
> > stepping interrupt handlers is risky as it may sometimes leads to
> > unbalanced locking when we resume from single-step debug.
>
> Why does single stepping interrupt handlers cause unbalanced locking
> (with the current code)?
>

I have to dig deeper to completely answer this question but this is
based on observation from following warning splats seen while single
stepping interrupt handlers:

[  300.328300] WARNING: bad unlock balance detected!
[  300.328608] 5.18.0-rc1-00016-g3e732ebf7316-dirty #6 Not tainted
[  300.329058] -------------------------------------
[  300.329298] sh/173 is trying to release lock (dbg_slave_lock) at:
[  300.329718] [<ffffd57c951c016c>] kgdb_cpu_enter+0x7ac/0x820
[  300.330029] but there are no more locks to release!
[  300.330265]
[  300.330265] other info that might help us debug this:
[  300.330668] 4 locks held by sh/173:
[  300.330891]  #0: ffff4f5e454d8438 (sb_writers#3){.+.+}-{0:0}, at:
vfs_write+0x98/0x204
[  300.331735]  #1: ffffd57c973bc2f0 (dbg_slave_lock){+.+.}-{2:2}, at:
kgdb_cpu_enter+0x5b4/0x820
[  300.332259]  #2: ffffd57c973a9460 (rcu_read_lock){....}-{1:2}, at:
kgdb_cpu_enter+0xe0/0x820
[  300.332717]  #3: ffffd57c973bc2a8 (dbg_master_lock){....}-{2:2},
at: kgdb_cpu_enter+0x1ec/0x820

>
> > Fix broken single-step implementation via skipping single-step over
> > interrupt handlers. The methodology is when we receive an interrupt from
> > EL1, check if we are single stepping (pstate.SS). If yes then we save
> > MDSCR_EL1.SS and clear the register bit if it was set. Then unmask only
> > D and leave I set. On return from the interrupt, set D and restore
> > MDSCR_EL1.SS. Along with this skip reschedule if we were stepping.
>
> Does this description really explains the motivation here.
>
> Isn't the goal to meet the user's expectation that when they step then
> the system will stop at PC+4 relative the instruction they are stepping
> (or PC+I for a branch that gets taken)?
>

Yeah this matches my understanding as well.

> To be clear, I've no objections to leaving interrupt handling enabled
> when single stepping (AFAIK it is similar to what x86 does) but I think
> this patch description will be difficult for future adventurers to
> comprehend.
>

Okay, so does the below patch description sound apt?

"
    Current implementation allows single stepping into interrupt handlers
    for interrupts that were received during single stepping. But interrupt
    handlers aren't something that the user expect to debug. Moreover single
    stepping interrupt handlers is risky as it may sometimes leads to
    unbalanced locking when we resume from single-step debug:

    [  300.328300] WARNING: bad unlock balance detected!
    [  300.328608] 5.18.0-rc1-00016-g3e732ebf7316-dirty #6 Not tainted
    [  300.329058] -------------------------------------
    [  300.329298] sh/173 is trying to release lock (dbg_slave_lock) at:
    [  300.329718] [<ffffd57c951c016c>] kgdb_cpu_enter+0x7ac/0x820
    [  300.330029] but there are no more locks to release!
    [  300.330265]
    [  300.330265] other info that might help us debug this:
    [  300.330668] 4 locks held by sh/173:
    [  300.330891]  #0: ffff4f5e454d8438 (sb_writers#3){.+.+}-{0:0},
at: vfs_write+0x98/0x204
    [  300.331735]  #1: ffffd57c973bc2f0 (dbg_slave_lock){+.+.}-{2:2},
at: kgdb_cpu_enter+0x5b4/0x820
    [  300.332259]  #2: ffffd57c973a9460 (rcu_read_lock){....}-{1:2},
at: kgdb_cpu_enter+0xe0/0x820
    [  300.332717]  #3: ffffd57c973bc2a8
(dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x1ec/0x820

    The common user's expectation while single stepping is that when they
    step then the system will stop at PC+4 or PC+I for a branch that gets
    taken relative to the instruction they are stepping. So, fix broken single
    step implementation via skipping single stepping interrupt handlers.

    The methodology is when we receive an interrupt from EL1, check if we
    are single stepping (pstate.SS). If yes then we save MDSCR_EL1.SS and
    clear the register bit if it was set. Then unmask only D and leave I set. On
    return from the interrupt, set D and restore MDSCR_EL1.SS. Along with this
    skip reschedule if we were stepping.
"

-Sumit

>
> Daniel.
>
>
> > Suggested-by: Will Deacon <will@...nel.org>
> > Signed-off-by: Sumit Garg <sumit.garg@...aro.org>
> > ---
> >  arch/arm64/kernel/entry-common.c | 18 +++++++++++++++++-
> >  1 file changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> > index 878c65aa7206..dd2d3af615de 100644
> > --- a/arch/arm64/kernel/entry-common.c
> > +++ b/arch/arm64/kernel/entry-common.c
> > @@ -458,19 +458,35 @@ static __always_inline void __el1_irq(struct pt_regs *regs,
> >       do_interrupt_handler(regs, handler);
> >       irq_exit_rcu();
> >
> > -     arm64_preempt_schedule_irq();
> > +     /* Don't reschedule in case we are single stepping */
> > +     if (!(regs->pstate & DBG_SPSR_SS))
> > +             arm64_preempt_schedule_irq();
> >
> >       exit_to_kernel_mode(regs);
> >  }
> > +
> >  static void noinstr el1_interrupt(struct pt_regs *regs,
> >                                 void (*handler)(struct pt_regs *))
> >  {
> > +     unsigned long reg;
> > +
> > +     /* Disable single stepping within interrupt handler */
> > +     if (regs->pstate & DBG_SPSR_SS) {
> > +             reg = read_sysreg(mdscr_el1);
> > +             write_sysreg(reg & ~DBG_MDSCR_SS, mdscr_el1);
> > +     }
> > +
> >       write_sysreg(DAIF_PROCCTX_NOIRQ, daif);
> >
> >       if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && !interrupts_enabled(regs))
> >               __el1_pnmi(regs, handler);
> >       else
> >               __el1_irq(regs, handler);
> > +
> > +     if (regs->pstate & DBG_SPSR_SS) {
> > +             write_sysreg(DAIF_PROCCTX_NOIRQ | PSR_D_BIT, daif);
> > +             write_sysreg(reg, mdscr_el1);
> > +     }
> >  }
> >
> >  asmlinkage void noinstr el1h_64_irq_handler(struct pt_regs *regs)
> > --
> > 2.25.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ