Message-ID: <202509181218.FA966DA8F0@keescook>
Date: Thu, 18 Sep 2025 12:20:36 -0700
From: Kees Cook <kees@...nel.org>
To: Qing Zhao <qing.zhao@...cle.com>
Cc: Andrew Pinski <pinskia@...il.com>, Jakub Jelinek <jakub@...hat.com>,
Martin Uecker <uecker@...raz.at>,
Richard Biener <rguenther@...e.de>,
Joseph Myers <josmyers@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Jan Hubicka <hubicka@....cz>,
Richard Earnshaw <richard.earnshaw@....com>,
Richard Sandiford <richard.sandiford@....com>,
Marcus Shawcroft <marcus.shawcroft@....com>,
Kyrylo Tkachov <kyrylo.tkachov@....com>,
Kito Cheng <kito.cheng@...il.com>,
Palmer Dabbelt <palmer@...belt.com>,
Andrew Waterman <andrew@...ive.com>,
Jim Wilson <jim.wilson.gcc@...il.com>,
Dan Li <ashimida.1990@...il.com>,
Sami Tolvanen <samitolvanen@...gle.com>,
Ramon de C Valle <rcvalle@...gle.com>,
Joao Moreira <joao@...rdrivepizza.com>,
Nathan Chancellor <nathan@...nel.org>,
Bill Wendling <morbo@...gle.com>,
"gcc-patches@....gnu.org" <gcc-patches@....gnu.org>,
"linux-hardening@...r.kernel.org" <linux-hardening@...r.kernel.org>
Subject: Re: [PATCH v3 2/7] kcfi: Add core Kernel Control Flow Integrity
infrastructure
On Thu, Sep 18, 2025 at 06:48:03PM +0000, Qing Zhao wrote:
>
>
> > On Sep 18, 2025, at 14:20, Kees Cook <kees@...nel.org> wrote:
> >
> >>>>> +- External functions that are address-taken have a weak __kcfi_typeid_$func
> >>>>> + symbol added with the typeid value available so that the typeid can be
> >>>>> + referenced from assembly linkages, etc, where the typeid values cannot be
> >>>>> + calculated (i.e where C type information is missing):
> >>>>> +
> >>>>> + .weak __kcfi_typeid_$func
> >>>>> + .set __kcfi_typeid_$func, $typeid
> >>>>> +
> >>>>
> >>>> From my previous understanding, the above weak symbol is emitted for external functions
> >>>> that are address-taken AND do not have a definition in the compilation. So the weak symbol
> >>>> is emitted at the declaration site of the external function, is this true?
> >>>>
> >>>> If so, could you please clarify this in the above?
> >>>
> >>> Yes, this happens via assemble_external_real, which can be called under
> >>> a few conditions in gcc/varasm.cc.
> >>
> >> Okay. Please clarify this in the design doc.
> >
> > I mention it later in the "behavioral" section:
> >
> > - assemble_external_real calls kcfi_emit_typeid_symbol to add the
> > __kcfi_typeid_$func symbols.
> >
> > I had left off implementation details (i.e. "called from
> > assemble_external_real") in the "constraints" section. How would you
> > like this arranged?
>
> The original arrangement is good. -:)
>
> I guess I didn’t make myself clear in the beginning. The following is a modified version of
> your previous paragraph:
>
> +- An external function that is address-taken but does not have a definition has
> + a weak __kcfi_typeid_$func symbol added at the declaration site. This weak
> + symbol has the typeid value available so that the typeid can be
> + referenced from assembly linkages, etc, where the typeid values cannot be
> + calculated (i.e where C type information is missing):
> +
> + .weak __kcfi_typeid_$func
> + .set __kcfi_typeid_$func, $typeid
> +
>
> Is the above the correct understanding?
Ah! I see, yes, that's correct. I will update it. :)
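
To make the clarified rule concrete, here is a minimal sketch of the
condition (address-taken AND no definition) in standalone C. It is not
the patch's kcfi_emit_typeid_symbol: the helper name, its parameters,
the buffer-based output, and the hex formatting of the typeid are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for illustration only; NOT the patch's
   kcfi_emit_typeid_symbol.  Writes the two directives into BUF and
   returns true when FUNC is address-taken and has no definition in
   this compilation; otherwise leaves BUF empty and returns false.  */
static bool
toy_emit_typeid_symbol (char *buf, size_t size, const char *func,
                        uint32_t type_id, bool address_taken,
                        bool has_definition)
{
  if (size == 0)
    return false;
  buf[0] = '\0';

  /* The weak symbol is only wanted at the declaration site of an
     address-taken function that is not defined here.  */
  if (!address_taken || has_definition)
    return false;

  /* Hex formatting of the typeid is an assumption of this sketch.  */
  int n = snprintf (buf, size,
                    "\t.weak\t__kcfi_typeid_%s\n"
                    "\t.set\t__kcfi_typeid_%s, 0x%08x\n",
                    func, func, (unsigned int) type_id);
  if (n < 0 || (size_t) n >= size)
    {
      /* Output did not fit; report nothing rather than a partial line.  */
      buf[0] = '\0';
      return false;
    }
  return true;
}
```

A function that is defined in the same compilation, or whose address is
never taken, produces no output in this sketch.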
>
> >>>
> >>>>> +static uint32_t
> >>>>> +kcfi_get_type_id (tree fn_type)
> >>>>> +{
> >>>>> + uint32_t type_id;
> >>>>> +
> >>>>> + /* Cache the attribute identifier. */
> >>>>> + if (!kcfi_type_id_attr)
> >>>>> + kcfi_type_id_attr = get_identifier ("kcfi_type_id");
> >>>>> +
> >>>>> + tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
> >>>>> + TYPE_ATTRIBUTES (fn_type));
> >>>>
> >>>> The above can be simplified as:
> >>>> + tree attr = lookup_attribute ("kcfi_type_id", TYPE_ATTRIBUTES (fn_type));
> >>>
> >>> Ugh, I totally misunderstood the examples I saw of this. I thought they
> >>> were caching the string lookup, but now that I look more closely, I see:
> >>>
> >>> #define IDENTIFIER_POINTER(NODE) \
> >>> ((const char *) IDENTIFIER_NODE_CHECK (NODE)->identifier.id.str)
> >>>
> >>> it's just returning the string!
> >>>
> >>> I will throw away the "caching" I was doing. I thought it would actually
> >>> look up the attribute using the tree returned by get_identifier, but I
> >>> see there is no overloaded lookup_attribute that takes a tree argument.
> >>>
> >>> *face palm*
> >>
> >> -:)
> >
> > Okay, so I tried to remove this and remembered that it's actually cached
> > not for lookup_attribute, but for build_tree_list call case:
> >
> > tree attr = build_tree_list (kcfi_type_id_attr, type_id_tree);
> >
> > TYPE_ATTRIBUTES (fn_type) = chainon (TYPE_ATTRIBUTES (fn_type), attr);
> >
> > For _that_, I need a "tree" argument. So instead of building it each
> > time, I have it built already, and I can get at its string for
> > lookup_attribute too. So I think this code is good as-is.
>
> Right, the kcfi_type_id_attr is still needed for the purpose of new type_id attribute.
>
> But, for the following
>
> > + tree attr = lookup_attribute (IDENTIFIER_POINTER (kcfi_type_id_attr),
> > + TYPE_ATTRIBUTES (fn_type));
>
> The above can be simplified as:
> + tree attr = lookup_attribute ("kcfi_type_id", TYPE_ATTRIBUTES (fn_type));
>
> No need to call IDENTIFIER_POINTER (kcfi_type_id_attr) as the first argument for the above call.
>
> Hope this is clear.
Right, I did this because it seemed weird to me to open-code the same
literal string twice.
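
For what it's worth, here is a toy model of that pattern in standalone
C: one cached identifier supplies the string for the lookup and the
node used when attaching a new attribute, so the literal appears only
once. Every toy_* name and type is a stand-in invented for this sketch;
none of it is GCC's tree API, and the lookup-then-attach flow is only
my rough approximation of what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for illustration only; these are NOT GCC's tree types.  */
struct toy_ident { const char *str; };
struct toy_attr
{
  const struct toy_ident *purpose;  /* attribute name, like TREE_PURPOSE */
  uint32_t value;                   /* attribute payload, like TREE_VALUE */
  struct toy_attr *next;            /* next list element, like TREE_CHAIN */
};

/* The single place the literal string lives.  */
static const struct toy_ident toy_kcfi_ident = { "kcfi_type_id" };

/* Cached identifier, playing the role of kcfi_type_id_attr.  */
static const struct toy_ident *toy_kcfi_type_id_attr;

static const struct toy_ident *
toy_get_cached_ident (void)
{
  if (!toy_kcfi_type_id_attr)
    toy_kcfi_type_id_attr = &toy_kcfi_ident;
  return toy_kcfi_type_id_attr;
}

/* String-keyed lookup, mirroring the shape of lookup_attribute.  */
static struct toy_attr *
toy_lookup_attribute (const char *name, struct toy_attr *list)
{
  for (; list; list = list->next)
    if (strcmp (list->purpose->str, name) == 0)
      return list;
  return NULL;
}

/* Append NODE to the end of LIST, mirroring the shape of chainon.  */
static struct toy_attr *
toy_chainon (struct toy_attr *list, struct toy_attr *node)
{
  if (!list)
    return node;
  struct toy_attr *tail = list;
  while (tail->next)
    tail = tail->next;
  tail->next = node;
  return list;
}

/* Return the type id already recorded on *ATTRS.  If there is none,
   record FRESH_ID using caller-provided STORAGE and return it.  */
static uint32_t
toy_get_type_id (struct toy_attr **attrs, struct toy_attr *storage,
                 uint32_t fresh_id)
{
  const struct toy_ident *id = toy_get_cached_ident ();

  /* The lookup needs a string: take it from the cached identifier.  */
  struct toy_attr *found = toy_lookup_attribute (id->str, *attrs);
  if (found)
    return found->value;

  /* Attaching needs the identifier node itself.  */
  storage->purpose = id;
  storage->value = fresh_id;
  storage->next = NULL;
  *attrs = toy_chainon (*attrs, storage);
  return fresh_id;
}
```

The first call on an empty list attaches the attribute; a second call
finds it by name and returns the stored value without attaching again.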
--
Kees Cook