lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALvbMcAyh0cvx-465sBh-g9xX_B2nQ-7+5gmWEJ3-uG19faOig@mail.gmail.com>
Date: Thu, 21 Aug 2025 09:24:59 -0700
From: Andrew Pinski <andrew.pinski@....qualcomm.com>
To: Kees Cook <kees@...nel.org>
Cc: Qing Zhao <qing.zhao@...cle.com>, gcc-patches@....gnu.org,
        Joseph Myers <josmyers@...hat.com>, Richard Biener <rguenther@...e.de>,
        Jan Hubicka <hubicka@....cz>,
        Richard Earnshaw <richard.earnshaw@....com>,
        Richard Sandiford <richard.sandiford@....com>,
        Marcus Shawcroft <marcus.shawcroft@....com>,
        Kyrylo Tkachov <kyrylo.tkachov@....com>,
        Kito Cheng <kito.cheng@...il.com>, Palmer Dabbelt <palmer@...belt.com>,
        Andrew Waterman <andrew@...ive.com>,
        Jim Wilson <jim.wilson.gcc@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Dan Li <ashimida.1990@...il.com>, linux-hardening@...r.kernel.org
Subject: Re: [RFC PATCH 2/7] mangle: Introduce C typeinfo mangling API

On Thu, Aug 21, 2025 at 9:16 AM Kees Cook <kees@...nel.org> wrote:
>
> On Thu, Aug 21, 2025 at 12:59:06AM -0700, Andrew Pinski wrote:
> > On Thu, Aug 21, 2025 at 12:41 AM Kees Cook <kees@...nel.org> wrote:
> >
> > > To support the KCFI type-id which needs to convert unique function
> > > prototypes into unique 32-bit values, add a subset of the Itanium C++
> > > mangling ABI for C typeinfo of function prototypes. This gets us to the
> > > first step: a string representation of the function prototype.
> > >
> >
> > Can you explain why this is needed? Also it seems like the code is very
>
> As you suggest later, this could just twiddle hash bits instead of
> building up a string, but I left this as a C mangler in case it might
> be useful in the future, and for debugging. Trying to figure out why
> certain prototypes got mismatched was super frustrating, and looking at
> a hash value was pretty meaningless. :)
>
> > sensitive to buffer overflows. Especially the way it is currently written.
> > The C++ front-end version uses obstack_grow to overcome the buffer overflow
> > issue.
>
> Oh nice, yes, obstack looks perfect for this. (Or as you suggest below,
> std::string).
>
> If keeping the mangling string output is unwanted, maybe I could use
> obstack when the dumpfile is active and report the string+hash at the
> end of mangling, but make the actual output be the hash value?
>
> > > +    case BOOLEAN_TYPE:
> > > +      **p = 'b';
> > > +      (*p)++;
> > >
> >
> > I am not 100% sure this is always correct. because there could be a boolean
> > type with a precision non 1.
>
> AIUI, the non-1 case isn't possible on the C end of things and since
> this is strictly for C type representation, it's good as-is. But maybe
> I'm missing something?
>
> > > +      else
> > > +       {
> > > +         /* Fallback for other real types.  */
> > > +         *p += snprintf (*p, end - *p, "f%d", TYPE_PRECISION (type));
> > > +       }
> > > +      break;
> > >
> >
> > You definitely miss NULLPTR_TYPE.
>
> Isn't nullptr C++-only? I don't think I can create a C type that ends up
> getting internally mapped to NULLPTR_TYPE.

NO, it is part of C23.
`typeof(nullptr)` in C23 program will have a NULLPTR_TYPE.
I just posted some fixes for the C front-end in the area of
NULLPTR_TYPE which is why I know it exists.

>
> > > +
> > > +    default:
> > > +      /* Unknown builtin type - this should never happen in a well-formed
> > > C program.  */
> > > +      error ("mangle: Unknown builtin type with %<TREE_CODE%> %d",
> > > TREE_CODE (type));
> > > +      error ("mangle: %<TYPE_MODE%> = %d, %<TYPE_PRECISION%> = %d",
> > > TYPE_MODE (type), TYPE_PRECISION (type));
> > > +      error ("mangle: Please report this as a bug with the above
> > > diagnostic information");
> > > +      gcc_unreachable ();
> > >
> >
> > This is wrong way of doing this.
> > It should be a sorry or a fatal_error instead. Maybe with an inform
> > beforehand.
> > You might also want to use get_tree_code_name, GET_MODE_NAME instead of
> > printing out a number because the number tells nothing.
> > Maybe even use print_tree or print_generic_expr to print out the tree
> > instead of manually printing it.
>
> Ah, perfect; thanks!
>
> > > +    case ARRAY_TYPE:
> > > +      /* Array type: 'A' + size + '_' + element type (simplified).  */
> > > +      **p = 'A';
> > > +      (*p)++;
> > > +      if (TYPE_DOMAIN (type) && TYPE_MAX_VALUE (TYPE_DOMAIN (type)))
> > > +       {
> > > +         HOST_WIDE_INT size = tree_to_shwi (TYPE_MAX_VALUE (TYPE_DOMAIN
> > > (type))) + 1;
> > >
> >
> > No check to make sure array type is not a VLA. So this will ICE.
>
> Oops, yes. Thanks!
>
> > > +  else if (TREE_CODE (fntype_or_fndecl) == FUNCTION_DECL)
> > > +    {
> > > +      tree fndecl = fntype_or_fndecl;
> > > +      tree base_fntype = TREE_TYPE (fndecl);
> > > +
> > > +      /* For FUNCTION_DECL, build a synthetic function type using
> > > DECL_ARGUMENTS
> > > +        if available to preserve typedef information.  */
> > >
> >
> > Why do the building? Seems like you could just do that work here. Also
> > doesn't FUNCTION_DECL's type have exactly what you need?
>
> I need the function prototype in three places:
>
> - address-taken extern functions
> - function preambles
> - indirect call sites
>
> At indirect call sites (during the early GIMPLE pass), I had a
> FUNCTION_TYPE available that still had the full typedef information,
> and I could use it fine. For the other two, it's later on and the
> TREE_TYPE(fndecl)'s FUNCTION_TYPE had lost the typedef information (which
> I need to be able to examine in cases where the typedef name was needed
> for the mangling vs looking at the underlying types).

What typedef information do you need that is missing? struct names? or
something different?

So from the looks of it there is a big issue with attack vector where
you have 2 different structs with the same name. Seems like the hash
should have included the fields of the struct instead of just a
mangled name of the struct.  Can we break the hash for GCC's KFCI from
clang to include this extra hashing info? and maybe even fix up clang
later on?

Thanks,
Andrew Pinski

>
> > > +      tree parm = DECL_ARGUMENTS (fndecl);
> > > +      if (parm)
> > > +       {
> > > +         /* Build parameter type list from DECL_ARGUMENTS.  */
> > > +         tree param_list = NULL_TREE;
> > > +         tree *param_tail = &param_list;
> > > +
> > > +         for (; parm; parm = DECL_CHAIN (parm))
> > > +           {
> > > +             tree parm_type = TREE_TYPE (parm);
> > > +             *param_tail = tree_cons (NULL_TREE, parm_type, NULL_TREE);
> > > +             param_tail = &TREE_CHAIN (*param_tail);
> > > +           }
> > > +
> > > +         /* Add void_type_node sentinel if the function takes no
> > > parameters.  */
> > > +         if (!param_list)
> > > +           param_list = tree_cons (NULL_TREE, void_type_node, NULL_TREE);
> > > +
> > > +         /* Build synthetic function type with preserved parameter
> > > types.  */
> > > +         fntype = build_function_type (TREE_TYPE (base_fntype),
> > > param_list);
> > > +       }
> > > +      else
> > > +       {
> > > +         /* No DECL_ARGUMENTS - use the standard function type.  */
> > > +         fntype = base_fntype;
> > > +       }
> > > +    }
> > > +  else
> > > +    {
> > > +      /* Must only be called with FUNCTION_DECL or FUNCTION_TYPE.  */
> > > +      gcc_unreachable ();
> > > +    }
>
> Thanks for looking through all this!
>
> -Kees
>
> --
> Kees Cook

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ