linux-kernel - Re: [PATCH v2 08/19] gendwarfksyms: Expand subroutine

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <9978884e-87c8-4c20-b9ff-b4571bce01ce@suse.com>
Date: Tue, 3 Sep 2024 17:11:57 +0200
From: Petr Pavlu <petr.pavlu@...e.com>
To: Sami Tolvanen <samitolvanen@...gle.com>
Cc: Masahiro Yamada <masahiroy@...nel.org>,
 Luis Chamberlain <mcgrof@...nel.org>, Miguel Ojeda <ojeda@...nel.org>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Matthew Maurer <mmaurer@...gle.com>, Alex Gaynor <alex.gaynor@...il.com>,
 Wedson Almeida Filho <wedsonaf@...il.com>, Gary Guo <gary@...yguo.net>,
 Petr Pavlu <petr.pavlu@...e.com>, Neal Gompa <neal@...pa.dev>,
 Hector Martin <marcan@...can.st>, Janne Grunau <j@...nau.net>,
 Asahi Linux <asahi@...ts.linux.dev>, linux-kbuild@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-modules@...r.kernel.org,
 rust-for-linux@...r.kernel.org
Subject: Re: [PATCH v2 08/19] gendwarfksyms: Expand subroutine_type

On 8/15/24 19:39, Sami Tolvanen wrote:
> Add support for expanding DW_TAG_subroutine_type and the parameters
> in DW_TAG_formal_parameter. Use this to also expand subprograms.
> 
> Example output with --debug:
> 
>   subprogram(
>     formal_parameter base_type usize byte_size(8),
>     formal_parameter base_type usize byte_size(8),
>   )
>   -> base_type void;
> 
> Signed-off-by: Sami Tolvanen <samitolvanen@...gle.com>
> ---
>  scripts/gendwarfksyms/dwarf.c         | 57 ++++++++++++++++++++++++++-
>  scripts/gendwarfksyms/gendwarfksyms.h |  1 +
>  2 files changed, 57 insertions(+), 1 deletion(-)
> 
> diff --git a/scripts/gendwarfksyms/dwarf.c b/scripts/gendwarfksyms/dwarf.c
> index 82185737fa2a..c81652426be8 100644
> --- a/scripts/gendwarfksyms/dwarf.c
> +++ b/scripts/gendwarfksyms/dwarf.c
> [...]
>  
> +static int __process_subroutine_type(struct state *state, struct die *cache,
> +				     Dwarf_Die *die, const char *type)
> +{
> +	check(process(state, cache, type));
> +	check(process(state, cache, "("));
> +	check(process_linebreak(cache, 1));
> +	/* Parameters */
> +	check(process_die_container(state, cache, die, process_type,
> +				    match_formal_parameter_type));
> +	check(process_linebreak(cache, -1));
> +	check(process(state, cache, ")"));
> +	process_linebreak(cache, 0);
> +	/* Return type */
> +	check(process(state, cache, "-> "));
> +	return check(process_type_attr(state, cache, die));
> +}

If I understand correctly, this formatting logic also affects the
symtypes output. Looking at its format, I would like to propose a few
minor changes.

Example of the current symtypes output:
kprobe_event_cmd_init subprogram( formal_parameter pointer_type <unnamed> { s#dynevent_cmd } byte_size(8), formal_parameter pointer_type <unnamed> { base_type char byte_size(1) encoding(8) } byte_size(8), formal_parameter base_type int byte_size(4) encoding(5),  ) -> base_type void

Proposed changes:
kprobe_event_cmd_init subprogram ( formal_parameter pointer_type <unnamed> { s#dynevent_cmd } byte_size(8) , formal_parameter pointer_type <unnamed> { base_type char byte_size(1) encoding(8) } byte_size(8) , formal_parameter base_type int byte_size(4) encoding(5) ) -> base_type void
                                ^- (1)                                                                    ^- (2)                                                                                                                                                       ^- (3)

(1) "subprogram(" is split to "subprogram (".
(2) A space is added prior to ",".
(3) String ", " is removed after the last parameter.

Separating each token with a whitespace matches the current genksyms
format, makes the data trivially parsable and easy to pretty-print by
additional tools. If some tokens are merged, as in "subprogram(", then
such a tool needs to have extra logic to parse each word and split it
into tokens.

For attributes with one value, such as "byte_size(4)", I think the
current format is probably ok.

-- 
Thanks,
Petr