linux-kernel - Re: [PATCH] arm64: Clear OS lock in enable_debug


Message-ID: <20220610063619.7921-1-mark-pk.tsai@mediatek.com>
Date:   Fri, 10 Jun 2022 14:36:19 +0800
From:   Mark-PK Tsai <mark-pk.tsai@...iatek.com>
To:     <will@...nel.org>, <stable@...r.kernel.org>
CC:     <alexandru.elisei@....com>, <catalin.marinas@....com>,
        <jean-philippe.brucker@....com>,
        <linux-arm-kernel@...ts.infradead.org>,
        <linux-kernel@...r.kernel.org>,
        <linux-mediatek@...ts.infradead.org>, <mark-pk.tsai@...iatek.com>,
        <matthias.bgg@...il.com>, <maz@...nel.org>,
        <yj.chiang@...iatek.com>
Subject: Re: [PATCH] arm64: Clear OS lock in enable_debug_monitors

> On Thu, Jun 09, 2022 at 11:33:18AM +0800, Mark-PK Tsai wrote:
> > Always clear the OS lock before enabling debug events.
> > 
> > Recent kernels clear the OS lock in the cpuhp ops,
> > but if a debug exception happens before that point
> > the kernel might crash, because enabling debug events
> > has no effect while the OS lock is held.
> > 
> > Below is a use case that hits this problem:
> > 
> > Register a kprobe in console_unlock and the kernel will
> > panic at secondary_start_kernel on a secondary core.
> > 
> > CPU: 1 PID: 0 Comm: swapper/1 Tainted: P
> > ...
> > pstate: 004001c5 (nzcv dAIF +PAN -UAO)
> > pc : do_undefinstr+0x5c/0x60
> > lr : do_undefinstr+0x2c/0x60
> > sp : ffffffc01338bc50
> > pmr_save: 000000f0
> > x29: ffffffc01338bc50 x28: ffffff8115e95a00
> > x27: ffffffc01258e000 x26: ffffff8115e95a00
> > x25: 00000000ffffffff x24: 0000000000000000
> > x23: 00000000604001c5 x22: ffffffc014015008
> > x21: 000000002232f000 x20: 00000000000000f0
> > x19: ffffffc01338bc70 x18: ffffffc0132ed040
> > x17: ffffffc01258eb48 x16: 0000000000000403
> > x15: 0000000000016480 x14: ffffffc01258e000
> > x13: 0000000000000006 x12: 0000000000006985
> > x11: 00000000d5300000 x10: 0000000000000000
> > x9 : 9f6c79217a8a0400 x8 : 00000000000000c5
> > x7 : 0000000000000000 x6 : ffffffc01338bc08
> > x5 : ffffffc01338bc08 x4 : 0000000000000002
> > x3 : 0000000000000000 x2 : 0000000000000004
> > x1 : 0000000000000000 x0 : 0000000000000001
> > Call trace:
> >  do_undefinstr+0x5c/0x60
> >  el1_undef+0x10/0xb4
> >  0xffffffc014015008
> >  vprintk_func+0x210/0x290
> >  printk+0x64/0x90
> >  cpuinfo_detect_icache_policy+0x80/0xe0
> >  __cpuinfo_store_cpu+0x150/0x160
> >  secondary_start_kernel+0x154/0x440
> > 
> > The root cause is that OSLSR_EL1.OSLK is reset
> > to 1 on a cold reset[1] and the firmware didn't
> > unlock it by default.
> > So the core didn't enter el1_dbg as expected after
> > kernel_enable_single_step and eret.
> 
> Hmm, I thought we didn't use hardware single-step for kprobes after
> 7ee31a3aa8f4 ("arm64: kprobes: Use BRK instead of single-step when executing
> instructions out-of-line"). What is triggering this exception?
> 
> Will

You're right.
This issue actually happened on 5.4 LTS, and the commit you mentioned
avoids the kernel panic by not using hardware single-step.

I think this commit should be applied to 5.4 LTS:

7ee31a3aa8f4 ("arm64: kprobes: Use BRK instead of single-step when executing instructions out-of-line")

Cc: stable@...r.kernel.org



I'm also not sure whether other use cases might have problems if the
kernel doesn't clear the OS lock in enable_debug_monitors every time.
Should we still do this to prevent others from hitting a similar issue?
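
To make the question concrete, here is a rough userspace model of the
behaviour described above. It is only a sketch, not kernel code: the
toy_* names are hypothetical, the struct fields merely mirror the
architectural bits (OSLSR_EL1.OSLK, MDSCR_EL1.KDE, MDSCR_EL1.SS), and
PSTATE.D, PSTATE.SS and exception routing are ignored. It assumes that a
software step exception is suppressed while the OS lock is held, which
matches the symptom reported in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-CPU debug state. Field names mirror architectural bits only. */
struct toy_cpu {
	bool oslk;      /* OSLSR_EL1.OSLK: set after a cold reset */
	bool mdscr_kde; /* MDSCR_EL1.KDE: kernel debug enable */
	bool mdscr_ss;  /* MDSCR_EL1.SS: software step enable */
};

/* Cold reset with firmware that does not unlock: OS lock held. */
static struct toy_cpu toy_cold_reset(void)
{
	struct toy_cpu cpu = { .oslk = true, .mdscr_kde = false, .mdscr_ss = false };
	return cpu;
}

/* Models enabling debug events without touching the OS lock. */
static void toy_enable_debug_monitors(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->mdscr_kde = true;
}

/* Hypothetical variant from the patch idea: clear the OS lock first. */
static void toy_enable_debug_monitors_unlock(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->oslk = false; /* stands in for writing 0 to OSLAR_EL1 */
	cpu->mdscr_kde = true;
}

/* Models requesting a hardware single-step. */
static void toy_enable_single_step(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->mdscr_ss = true;
}

/* True if the model would take a step exception after eret. */
static bool toy_step_exception_taken(const struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	return !cpu->oslk && cpu->mdscr_kde && cpu->mdscr_ss;
}
```

In this model, a CPU that comes out of toy_cold_reset() and goes through
toy_enable_debug_monitors() plus toy_enable_single_step() never takes the
step exception, while the _unlock variant does. Breakpoint instruction
(BRK) exceptions are not affected by the OS lock, which is consistent
with 7ee31a3aa8f4 avoiding the panic.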