linux-kernel - Re: [PATCH 2/2] x86/retpoline,kprobes: Avoid treating rethunk as an indirect jump

Message-Id: <20230707233913.976ecfb1a312d03f9b07a2b2@kernel.org>
Date:   Fri, 7 Jul 2023 23:39:13 +0900
From:   Masami Hiramatsu (Google) <mhiramat@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Petr Pavlu <petr.pavlu@...e.com>, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        hpa@...or.com, samitolvanen@...gle.com, x86@...nel.org,
        linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] x86/retpoline,kprobes: Avoid treating rethunk as an
 indirect jump

On Thu, 6 Jul 2023 13:34:03 +0200
Peter Zijlstra <peterz@...radead.org> wrote:

> On Thu, Jul 06, 2023 at 06:00:14PM +0900, Masami Hiramatsu wrote:
> > On Thu, 6 Jul 2023 09:17:05 +0200
> > Peter Zijlstra <peterz@...radead.org> wrote:
> > 
> > > On Thu, Jul 06, 2023 at 09:47:23AM +0900, Masami Hiramatsu wrote:
> > > 
> > > > > > If I understand correctly, all indirect jumps will be replaced with JMP_NOSPEC.
> > > > > > If you read insn_jump_into_range(), it only checks the jump code, not calls.
> > > > > > So functions that only have indirect calls still allow optprobe.
> > > > > 
> > > > > With the introduction of kCFI JMP_NOSPEC is no longer an equivalent to a
> > > > > C indirect jump.
> > > > 
> > > > If I understand correctly, kCFI is enabled by CFI_CLANG, and clang is not
> > > > using jump-tables by default, so we can focus on gcc. In that case
> > > > the current check still works, correct?
> > > 
> > > IIRC clang can use jump tables, but like GCC needs RETPOLINE=n and
> > > IBT=n, so effectively nobody has them.
> > 
> > So if it requires RETPOLINE=n, the current __indirect_thunk_start/end check
> > is not required, right? (that code is enclosed in "#ifdef CONFIG_RETPOLINE")
> 
> Correct.
> 
> > > 
> > > The reason I did mention kCFI though is that kCFI has a larger 'indirect
> > > jump' sequence, and I'm not sure we've thought about what can go
> > > sideways if that's optprobed.
> > 
> > If I understand correctly, kCFI only checks indirect function calls (checking
> > the pointer), so no jump tables. Or does it use an indirect 'jump'?
> 
> Yes, it's indirect function calls only.
> 
> Imagine our function (bar) doing an indirect call, it will (as clang
> always does) have the function pointer in r11:
> 
> bar:
> 	...
> 	movl	$(-0x12345678),%r10d
> 	addl	-15(%r11), %r10d
> 	je	1f
> 	ud2
> 1:	call	__x86_indirect_thunk_r11
> 
> 
> 
> And then the function it calls (foo) looks like:
> 
> __cfi_foo:
> 	movl	$0x12345678, %eax
> 	.skip	11, 0x90
> foo:
> 	endbr
> 	....
> 
> 
> 
> So if the caller (in bar) and the callee (foo) have the same hash value
> (0x12345678 in this case) then it will be equal and we continue on our
> merry way.
> 
> However, if they do not match, we'll trip that #UD and the
> handle_cfi_failure() will try and match the address to
> __{start,stop}__kcfi_traps[]. Additionally decode_cfi_insn() will try
> and decode that whole call sequence in order to obtain the target
> address and typeid (hash).

Thank you for the explanation! This helps me!
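
To make the quoted sequence concrete, here is a minimal userspace model of the caller-side check. It is only a sketch: kcfi_check_passes() is a hypothetical name, not a kernel function. The caller loads the negated hash, adds the 32-bit type id stored in the callee's preamble (the immediate of the movl, which sits 15 bytes before foo given the 5-byte movl plus 11 bytes of padding), and continues only if the sum wraps to zero.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the caller-side kCFI check (hypothetical helper, userspace only).
 * expected_hash is the value the caller was compiled with; callee_typeid is
 * the 32-bit value read from the callee's __cfi_ preamble.
 */
static bool kcfi_check_passes(uint32_t expected_hash, uint32_t callee_typeid)
{
	uint32_t r10d = (uint32_t)0 - expected_hash;	/* movl $(-hash), %r10d */

	r10d += callee_typeid;				/* addl -15(%r11), %r10d */
	return r10d == 0;				/* je 1f; otherwise ud2 */
}
```

A matching pair such as (0x12345678, 0x12345678) passes; any other type id leaves a non-zero sum, which corresponds to falling through to the ud2.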

> 
> optprobes might disturb this code.

So neither optprobes nor kprobes (nor any other text instrumentation) should
touch the __cfi_FUNC symbols right before FUNC.
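
One simple way to express that rule would be a check on the symbol name. The helper below is a hypothetical sketch, not existing kernel code; it assumes the preamble symbols always carry the "__cfi_" prefix shown in the example above.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if the symbol name looks like a kCFI
 * preamble symbol (__cfi_FUNC), which instrumentation should leave alone.
 */
static bool is_cfi_preamble_symbol(const char *name)
{
	static const char prefix[] = "__cfi_";

	/* sizeof(prefix) - 1 excludes the terminating NUL. */
	return strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}
```

A probe-registration path could reject an address whose resolved symbol satisfies this predicate.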

> 
> > > I suspect the UD2 that's in there will go 'funny' if it's relocated into
> > > an optprobe, as in, it'll not be recognised as a CFI fail.
> > 
> > UD2 can't be optprobed (nor kprobed) because that can change the dumped
> > BUG address...
> 
> Right, same problem here. But could the movl/addl be opt-probed? That
> would wreck decode_cfi_insn(). Then again, if decode_cfi_insn() fails,
> we'll get report_cfi_failure_noaddr(), which is less informative.

OK, so if that sequence is always expected, I can also prohibit probing it.
Or maybe it is better to generalize the API that kprobes uses to access the
original instruction, so that decode_cfi_insn() can get the original
(non-probed) insn.
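
As a rough illustration of the second option, the sketch below models "read the text as it was before probing" in userspace. All names here (struct probe_record, read_original_text()) are hypothetical and this is not the kernel API; it only handles a one-byte int3 breakpoint, not the longer jump that an optprobe writes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE 0xCC

/* Hypothetical record of one installed probe. */
struct probe_record {
	size_t offset;		/* offset of the patched byte in the text buffer */
	uint8_t saved_opcode;	/* original first byte of the probed instruction */
};

/*
 * Copy len bytes starting at text[start] into out, then undo any breakpoint
 * byte listed in probes[] that falls inside the copied window, so the caller
 * decodes the unprobed instruction bytes.
 */
static void read_original_text(const uint8_t *text, size_t start, size_t len,
			       const struct probe_record *probes, size_t nprobes,
			       uint8_t *out)
{
	memcpy(out, text + start, len);

	for (size_t i = 0; i < nprobes; i++) {
		size_t off = probes[i].offset;

		if (off >= start && off - start < len &&
		    out[off - start] == INT3_OPCODE)
			out[off - start] = probes[i].saved_opcode;
	}
}
```

A decoder in the style of decode_cfi_insn() would then parse the bytes in out rather than the live text, so a probe on the movl/addl would not change what it sees.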

> 
> So it looks like nothing too horrible happens...
> 
> 


Thank you,

-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>