lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD=FV=WXoDvWuH=yjzCcqOZ5CeUtYun7C8zrtrBP4FC409GkqA@mail.gmail.com>
Date:   Fri, 1 Jul 2022 15:14:16 -0700
From:   Doug Anderson <dianders@...omium.org>
To:     Sumit Garg <sumit.garg@...aro.org>
Cc:     Daniel Thompson <daniel.thompson@...aro.org>,
        Will Deacon <will@...nel.org>, Wei Li <liwei391@...wei.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Mark Rutland <mark.rutland@....com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Jason Wessel <jason.wessel@...driver.com>,
        Marc Zyngier <maz@...nel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 0/2] arm64: Fix pending single-step debugging issues

Hi,

On Tue, May 10, 2022 at 11:05 PM Sumit Garg <sumit.garg@...aro.org> wrote:
>
> This patch-set reworks pending fixes from Wei's series [1] to make
> single-step debugging via kgdb/kdb on arm64 work as expected. There was
> a prior discussion on ML [2] regarding if we should keep the interrupts
> enabled during single-stepping. So patch #1 follows suggestion from Will
> [3] to not disable interrupts during single stepping but rather skip
> single stepping within interrupt handler.
>
> [1] https://lore.kernel.org/all/20200509214159.19680-1-liwei391@huawei.com/
> [2] https://lore.kernel.org/all/CAD=FV=Voyfq3Qz0T3RY+aYWYJ0utdH=P_AweB=13rcV8GDBeyQ@mail.gmail.com/
> [3] https://lore.kernel.org/all/20200626095551.GA9312@willie-the-truck/
>
> Changes in v3:
> - Reword commit descriptions as per Daniel's suggestions.
>
> Changes in v2:
> - Replace patch #1 to rather follow Will's suggestion.
>
> Sumit Garg (2):
>   arm64: entry: Skip single stepping into interrupt handlers
>   arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step
>
>  arch/arm64/include/asm/debug-monitors.h |  1 +
>  arch/arm64/kernel/debug-monitors.c      |  5 +++++
>  arch/arm64/kernel/entry-common.c        | 18 +++++++++++++++++-
>  arch/arm64/kernel/kgdb.c                |  2 ++
>  4 files changed, 25 insertions(+), 1 deletion(-)

Sorry it took so long for me to respond. I kept dreaming that I'd find
the time to really dig deep into this to understand it fully and I'm
finally giving up on it. I'm going to hope that Will and/or Catalin
knows this area of the code well and can give it a good review. If not
then I'll strive harder to make the time...

In any case, I poked around with this a bunch and it definitely
improved the stepping behavior a whole lot. I still got one case where
gdb hit an assertion while I was stepping, but I could believe that
was a problem with gdb? I couldn't reproduce it. Thus I can at least
give:

Tested-by: Douglas Anderson <dianders@...omium.org>

I'll also note that I _think_ I remember that with Wei's series that
the gdb function "call" started working. I tried that here and it
didn't seem so happy. To keep things simple, I created a dummy
function in my kernel that looked like:

void doug_test(void)
{
  pr_info("testing, 1 2 3\n");
}

I broke into the debugger by echoing "g" to /proc/sysrq-trigger and
then tried "call doug_test()". I guess my printout actually printed
but it wasn't so happy after that. Seems like it somehow ended up
returning to a bogus address after the call which then caused a crash.

  testing, 1 2 3
  BUG: sleeping function called from invalid context at
arch/arm64/mm/fault.c:593
  in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 3393, name: bash
  preempt_count: 0, expected: 0
  RCU nest depth: 1, expected: 0
  CPU: 6 PID: 3393 Comm: bash Not tainted 5.19.0-rc4+ #3
dbec0bdb8582e447bccdcf2e70d7fe04477b1aac
  Hardware name: Google Herobrine (rev1+) (DT)
  Call trace:
   dump_backtrace+0xf0/0x110
   show_stack+0x24/0x70
   dump_stack_lvl+0x64/0x7c
   dump_stack+0x18/0x38
   __might_resched+0x144/0x154
   __might_sleep+0x54/0x84
   do_page_fault+0x1d4/0x42c
   do_mem_abort+0x4c/0xb0
   el1_abort+0x3c/0x5c
   el1h_64_sync_handler+0x4c/0xc4
   el1h_64_sync+0x64/0x68
   0xffffffc008000000
   __handle_sysrq+0x15c/0x184
   write_sysrq_trigger+0x94/0x128
   proc_reg_write+0xbc/0xec
   vfs_write+0xf0/0x2c8
   ksys_write+0x84/0xf0
   __arm64_sys_write+0x28/0x34
   invoke_syscall+0x4c/0x120
   el0_svc_common+0x94/0xfc
   do_el0_svc+0x38/0xc0
   el0_svc+0x2c/0x7c
   el0t_64_sync_handler+0x48/0x114
   el0t_64_sync+0x18c/0x190
  Unable to handle kernel execute from non-executable memory at
virtual address ffffffc008000000
  Mem abort info:
    ESR = 0x000000008600000f
    EC = 0x21: IABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x0f: level 3 permission fault
  swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000082863000
  [ffffffc008000000] pgd=100000027ffff003, p4d=100000027ffff003,
pud=100000027ffff003, pmd=100000027fffe003, pte=00680001001c3703
  Internal error: Oops: 8600000f [#1] PREEMPT SMP

I'm not sure if that's a sign that something is missing with your patch or not.

-Doug

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ