Message-ID: <202508221149.F19A56772@keescook>
Date: Fri, 22 Aug 2025 12:02:01 -0700
From: Kees Cook <kees@...nel.org>
To: Qing Zhao <qing.zhao@...cle.com>
Cc: Andrew Pinski <andrew.pinski@....qualcomm.com>,
"gcc-patches@....gnu.org" <gcc-patches@....gnu.org>,
Joseph Myers <josmyers@...hat.com>,
Richard Biener <rguenther@...e.de>, Jan Hubicka <hubicka@....cz>,
Richard Earnshaw <richard.earnshaw@....com>,
Richard Sandiford <richard.sandiford@....com>,
Marcus Shawcroft <marcus.shawcroft@....com>,
Kyrylo Tkachov <kyrylo.tkachov@....com>,
Kito Cheng <kito.cheng@...il.com>,
Palmer Dabbelt <palmer@...belt.com>,
Andrew Waterman <andrew@...ive.com>,
Jim Wilson <jim.wilson.gcc@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
Dan Li <ashimida.1990@...il.com>,
"linux-hardening@...r.kernel.org" <linux-hardening@...r.kernel.org>
Subject: Re: [RFC PATCH 2/7] mangle: Introduce C typeinfo mangling API
On Fri, Aug 22, 2025 at 03:11:16PM +0000, Qing Zhao wrote:
> > On Aug 21, 2025, at 17:29, Kees Cook <kees@...nel.org> wrote:
> > For non-static functions, we cannot know if other compilation units may
> > make indirect calls to a given function, so those functions must always
> > have their kcfi preamble added. For static functions, if they are
> > address-taken by the current compilation unit, then they must get a kcfi
> > preamble added.
>
> Oh, yeah, I see. Without LTO or whole-program mode, we cannot determine
> whether an extern function is address-taken or not. Therefore, we have to
> treat ALL extern functions conservatively as address-taken.
>
> So, from my understanding, the complete list of places that need to compute the typeid from the function prototype is:
>
> - At indirect call sites
>   - all indirect call sites (at the call site)
> - At function preambles
>   - all address-taken static functions (at the function definition)
>   - all extern functions (at function declaration or function definition? Please see my question below)
For "extern functions", the logic is split as:
- "all extern function definitions get preamble"
- "all extern function declarations without a definition that are
address-taken get __kcfi_typeid_ symbol"
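A minimal sketch of that split, written as a toy model rather than actual GCC code. The struct and function names (fn_info, needs_kcfi_preamble, needs_kcfi_typeid_symbol, kcfi_rules_selfcheck) are hypothetical and exist only for illustration; the rules themselves are the ones stated above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one function as seen by one compilation unit. */
struct fn_info {
    bool is_static;      /* internal linkage */
    bool has_definition; /* body is present in this compilation unit */
    bool address_taken;  /* address is taken in this compilation unit */
};

/* A definition gets a kcfi preamble if it is non-static (other units may
 * call it indirectly) or if it is static and address-taken here. */
bool needs_kcfi_preamble(const struct fn_info *fn)
{
    if (!fn->has_definition)
        return false;
    if (!fn->is_static)
        return true;
    return fn->address_taken;
}

/* An extern declaration with no definition in this unit gets a
 * __kcfi_typeid_ symbol only if it is address-taken. */
bool needs_kcfi_typeid_symbol(const struct fn_info *fn)
{
    return !fn->is_static && !fn->has_definition && fn->address_taken;
}

/* Walk through the cases discussed in this thread. */
void kcfi_rules_selfcheck(void)
{
    const struct fn_info extern_def =
        { .is_static = false, .has_definition = true, .address_taken = false };
    const struct fn_info static_taken =
        { .is_static = true, .has_definition = true, .address_taken = true };
    const struct fn_info static_untaken =
        { .is_static = true, .has_definition = true, .address_taken = false };
    const struct fn_info decl_taken =
        { .is_static = false, .has_definition = false, .address_taken = true };
    const struct fn_info decl_untaken =
        { .is_static = false, .has_definition = false, .address_taken = false };

    assert(needs_kcfi_preamble(&extern_def));
    assert(needs_kcfi_preamble(&static_taken));
    assert(!needs_kcfi_preamble(&static_untaken));
    assert(!needs_kcfi_preamble(&decl_taken));

    assert(needs_kcfi_typeid_symbol(&decl_taken));
    assert(!needs_kcfi_typeid_symbol(&decl_untaken));
    assert(!needs_kcfi_typeid_symbol(&extern_def));
}
```

The two predicates never both hold for the same function in this model: a preamble requires a definition in the unit, while the __kcfi_typeid_ symbol is only for declarations without one.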
> > The other case is emitting the __kcfi_typeid_FUNC weak symbols, which is
> > used for link-time resolution with non-C code (i.e. raw .S assembly)
> > which doesn't have access to the C type system to calculate the hashes
> > on its own, and needs to have a way to build its own kcfi preambles.
>
> So, for such functions, there should be an extern function declaration in the C code.
> But the definition of such a function is not available in the C code we are compiling.
> Therefore the weak __kcfi_typeid_FUNC symbol is emitted at the function declaration
> point for such a function when we compile the C code?
>
> And the typeid (the hash value) for such routine is computed at the function declaration
> point too.
>
> Is the above understanding correct?
Correct, the kcfi_typeid symbol and value are emitted at function
declaration point, but only if such function is address-taken.
> Then for the other extern functions whose definition is in the C code of other modules that might
> be compiled later, should the typeid be computed at the declaration or the definition?
It is computed and emitted just for externs that are address-taken.
> > This
> > is how Linux constructs its assembly function entry points:
> >
> > #ifndef __CFI_TYPE
> > #define __CFI_TYPE(name)                           \
> >         .4byte __kcfi_typeid_##name
> > #endif
> >
> > #define SYM_TYPED_ENTRY(name, linkage, align...)   \
> >         linkage(name) ASM_NL                       \
> >         align ASM_NL                               \
> >         __CFI_TYPE(name) ASM_NL                    \
> >         name:
> >
> > That way all the asm functions can be indirect call targets without
> > knowing the hash value (which will be filled in at link time).
>
> Okay. I see. This is the case for the extern function whose definition is in the assembly file. (Not available in
> the C code)
Right, and sometimes we have to explicitly perform a no-op
address-taking to make sure a symbol gets generated:
/*
 * Force the compiler to emit 'sym' as a symbol, so that we can reference
 * it from inline assembler. Necessary in case 'sym' could be inlined
 * otherwise, or eliminated entirely due to lack of references that are
 * visible to the compiler.
 */
#define ___ADDRESSABLE(sym, __attrs)                                    \
        static void * __used __attrs                                    \
        __UNIQUE_ID(__PASTE(__addressable_,sym)) = (void *)(uintptr_t)&sym;

#define __ADDRESSABLE(sym) \
        ___ADDRESSABLE(sym, __section(".discard.addressable"))
$ git grep KCFI_REFERENCE
include/linux/compiler.h:#define KCFI_REFERENCE(sym) __ADDRESSABLE(sym)
arch/x86/include/asm/page_64.h:KCFI_REFERENCE(copy_page);
arch/x86/include/asm/string_64.h:KCFI_REFERENCE(__memset);
arch/x86/include/asm/string_64.h:KCFI_REFERENCE(__memmove);
arch/x86/kernel/alternative.c:KCFI_REFERENCE(__bpf_prog_runX);
arch/x86/kernel/alternative.c:KCFI_REFERENCE(__bpf_callback_fn);
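For readers outside the kernel tree, here is a simplified user-space sketch of the same no-op address-taking idiom. It assumes GCC or Clang (for __attribute__((used))), drops the kernel's __UNIQUE_ID and .discard.addressable section handling, and uses hypothetical names (ADDRESSABLE_SKETCH, asm_routine_stub, addressable_sketch_check). In the kernel the referenced routine lives in a .S file and is only declared in C; a C stub stands in for it here purely so the example links on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's __ADDRESSABLE(): store the address
 * of 'sym' in a static variable that the compiler must keep, so that 'sym'
 * counts as address-taken in this compilation unit. */
#define ADDRESSABLE_SKETCH(sym) \
    static void *addressable_##sym __attribute__((used)) = \
        (void *)(uintptr_t)&sym

/* Stub standing in for a routine that would really be written in assembly
 * and only declared (not defined) in C. */
int asm_routine_stub(int x)
{
    return x + 1;
}

/* The no-op reference: no call is made, only the address is taken. */
ADDRESSABLE_SKETCH(asm_routine_stub);

/* Confirm the recorded address is the function's address. */
void addressable_sketch_check(void)
{
    assert(addressable_asm_routine_stub ==
           (void *)(uintptr_t)&asm_routine_stub);
}
```

With a declaration-only extern in place of the stub, this reference is what makes the function address-taken, which is the condition described above for emitting its __kcfi_typeid_ symbol.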
> > I assume I just didn't see how yet. :) I wasn't able to identify nor
> > store the typeid for function definitions that ultimately end up getting
> > .s file output.
> So, the problem only exists for the external functions whose definition is NOT in the C code?
Yup!
> > I couldn't figure out how to find these during the GIMPLE pass. Oh,
> > perhaps I can do this with an IPA pass? That should let me walk all
> > functions including externs. I'll give it a try...
Adding the IPA pass to find all functions worked perfectly. I was able
to remove all the weird DECL reconstruction and just use the original
FUNCTION_TYPE info for the typeids.
-Kees
--
Kees Cook