[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAK7LNASjR=1S_H+r=QEX5DbkQ4MJ4-MMtB-UQWAvCeSbhbMOqA@mail.gmail.com>
Date: Sun, 18 Aug 2024 15:40:36 +0900
From: Masahiro Yamada <masahiroy@...nel.org>
To: Kris Van Hees <kris.van.hees@...cle.com>
Cc: linux-kernel@...r.kernel.org, linux-kbuild@...r.kernel.org,
linux-modules@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
Nick Alcock <nick.alcock@...cle.com>, Alan Maguire <alan.maguire@...cle.com>,
Steven Rostedt <rostedt@...dmis.org>, Luis Chamberlain <mcgrof@...nel.org>,
Masami Hiramatsu <mhiramat@...nel.org>, Nick Desaulniers <ndesaulniers@...gle.com>,
Jiri Olsa <olsajiri@...il.com>, Elena Zannoni <elena.zannoni@...cle.com>
Subject: Re: [PATCH v6 3/4] scripts: add verifier script for builtin module
range data
On Fri, Aug 16, 2024 at 12:04 AM Kris Van Hees <kris.van.hees@...cle.com> wrote:
>
> The modules.builtin.ranges offset range data for builtin modules is
> generated at compile time based on the list of built-in modules and
> the vmlinux.map and vmlinux.o.map linker maps. This data can be used
> to determine whether a symbol at a particular address belongs to
> module code that was configured to be compiled into the kernel proper
> as a built-in module (rather than as a standalone module).
>
> This patch adds a script that uses the generated modules.builtin.ranges
> data to annotate the symbols in the System.map with module names if
> their address falls within a range that belongs to one or more built-in
> modules.
>
> It then processes the vmlinux.map (and if needed, vmlinux.o.map) to
> verify the annotation:
>
> - For each top-level section:
> - For each object in the section:
> - Determine whether the object is part of a built-in module
> (using modules.builtin and the .*.cmd file used to compile
> the object as suggested in [0])
> - For each symbol in that object, verify that the built-in
> module association (or lack thereof) matches the annotation
> given to the symbol.
>
> Signed-off-by: Kris Van Hees <kris.van.hees@...cle.com>
> Reviewed-by: Nick Alcock <nick.alcock@...cle.com>
> Reviewed-by: Alan Maguire <alan.maguire@...cle.com>
> ---
> Changes since v5:
> - Added optional 6th argument to specify kernel build directory.
> - Report error and exit if .*.o.cmd files cannot be read.
>
> Changes since v4:
> - New patch in the series
> ---
> scripts/verify_builtin_ranges.awk | 365 ++++++++++++++++++++++++++++++
> 1 file changed, 365 insertions(+)
> create mode 100755 scripts/verify_builtin_ranges.awk
>
> diff --git a/scripts/verify_builtin_ranges.awk b/scripts/verify_builtin_ranges.awk
> new file mode 100755
> index 000000000000..b82cf0a0fbeb
> --- /dev/null
> +++ b/scripts/verify_builtin_ranges.awk
> @@ -0,0 +1,365 @@
> +#!/usr/bin/gawk -f
> +# SPDX-License-Identifier: GPL-2.0
> +# verify_builtin_ranges.awk: Verify address range data for builtin modules
> +# Written by Kris Van Hees <kris.van.hees@...cle.com>
> +#
> +# Usage: verify_builtin_ranges.awk modules.builtin.ranges System.map \
> +# modules.builtin vmlinux.map vmlinux.o.map \
> +# [ <build-dir> ]
> +#
> +
> +# Return the module name(s) (if any) associated with the given object.
> +#
> +# If we have seen this object before, return information from the cache.
> +# Otherwise, retrieve it from the corresponding .cmd file.
> +#
> +function get_module_info(fn, mod, obj, mfn, s) {
> + if (fn in omod)
> + return omod[fn];
> +
> + if (match(fn, /\/[^/]+$/) == 0)
> + return "";
> +
> + obj = fn;
> + mod = "";
> + mfn = "";
> + fn = kdir "/" substr(fn, 1, RSTART) "." substr(fn, RSTART + 1) ".cmd";
> + if (getline s <fn == 1) {
> + if (match(s, /DKBUILD_MODFILE=['"]+[^'"]+/) > 0) {
> + mfn = substr(s, RSTART + 16, RLENGTH - 16);
> + gsub(/['"]/, "", mfn);
> +
> + mod = mfn;
> + gsub(/([^/ ]*\/)+/, "", mod);
> + gsub(/-/, "_", mod);
> + }
> + } else {
> + print "ERROR: Failed to read: " fn "\n\n" \
> + " Invalid kernel build directory (" kdir ")\n" \
> + " or its content does not match " ARGV[1] >"/dev/stderr";
> + close(fn);
> + total = 0;
> + exit(1);
> + }
> + close(fn);
> +
> + # A single module (common case) also reflects objects that are not part
> + # of a module. Some of those objects have names that are also a module
> + # name (e.g. core). We check the associated module file name, and if
> + # they do not match, the object is not part of a module.
> + if (mod !~ / /) {
> + if (!(mod in mods))
> + return "";
> + if (mods[mod] != mfn)
> + return "";
> + }
> +
> + # At this point, mod is a single (valid) module name, or a list of
> + # module names (that do not need validation).
> + omod[obj] = mod;
> + close(fn);
> +
> + return mod;
> +}
>
This code is copy-paste from scripts/generate_builtin_ranges.awk
So, my comments in 2/4 can apply to this patch, too.
Instead of adding a separate script,
we could add a "verify mode" option.
scripts/generate_builtin_ranges.awk --verify ...
But, I do not know how much cleaner it will become.
I am not good at reviewing AWK code, but this
is how you go.
If this script were written in Python,
it would be easy and readable to
split logically-related code chunks into functions,
as follows:
def parse_module_builtin():
...
def parse_vmlinux_map_lld():
...
def parse_vmlinux_map_bfd():
...
def parse_vmlinux_o_map():
...
--
Best Regards
Masahiro Yamada
Powered by blists - more mailing lists