lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAK7LNASjR=1S_H+r=QEX5DbkQ4MJ4-MMtB-UQWAvCeSbhbMOqA@mail.gmail.com>
Date: Sun, 18 Aug 2024 15:40:36 +0900
From: Masahiro Yamada <masahiroy@...nel.org>
To: Kris Van Hees <kris.van.hees@...cle.com>
Cc: linux-kernel@...r.kernel.org, linux-kbuild@...r.kernel.org, 
	linux-modules@...r.kernel.org, linux-trace-kernel@...r.kernel.org, 
	Nick Alcock <nick.alcock@...cle.com>, Alan Maguire <alan.maguire@...cle.com>, 
	Steven Rostedt <rostedt@...dmis.org>, Luis Chamberlain <mcgrof@...nel.org>, 
	Masami Hiramatsu <mhiramat@...nel.org>, Nick Desaulniers <ndesaulniers@...gle.com>, 
	Jiri Olsa <olsajiri@...il.com>, Elena Zannoni <elena.zannoni@...cle.com>
Subject: Re: [PATCH v6 3/4] scripts: add verifier script for builtin module
 range data

On Fri, Aug 16, 2024 at 12:04 AM Kris Van Hees <kris.van.hees@...cle.com> wrote:
>
> The modules.builtin.ranges offset range data for builtin modules is
> generated at compile time based on the list of built-in modules and
> the vmlinux.map and vmlinux.o.map linker maps.  This data can be used
> to determine whether a symbol at a particular address belongs to
> module code that was configured to be compiled into the kernel proper
> as a built-in module (rather than as a standalone module).
>
> This patch adds a script that uses the generated modules.builtin.ranges
> data to annotate the symbols in the System.map with module names if
> their address falls within a range that belongs to one or more built-in
> modules.
>
> It then processes the vmlinux.map (and if needed, vmlinux.o.map) to
> verify the annotation:
>
>   - For each top-level section:
>      - For each object in the section:
>         - Determine whether the object is part of a built-in module
>           (using modules.builtin and the .*.cmd file used to compile
>            the object as suggested in [0])
>         - For each symbol in that object, verify that the built-in
>           module association (or lack thereof) matches the annotation
>           given to the symbol.
>
> Signed-off-by: Kris Van Hees <kris.van.hees@...cle.com>
> Reviewed-by: Nick Alcock <nick.alcock@...cle.com>
> Reviewed-by: Alan Maguire <alan.maguire@...cle.com>
> ---
>     Changes since v5:
>      - Added optional 6th argument to specify kernel build directory.
>      - Report error and exit if .*.o.cmd files cannot be read.
>
>     Changes since v4:
>      - New patch in the series
> ---
>  scripts/verify_builtin_ranges.awk | 365 ++++++++++++++++++++++++++++++
>  1 file changed, 365 insertions(+)
>  create mode 100755 scripts/verify_builtin_ranges.awk
>
> diff --git a/scripts/verify_builtin_ranges.awk b/scripts/verify_builtin_ranges.awk
> new file mode 100755
> index 000000000000..b82cf0a0fbeb
> --- /dev/null
> +++ b/scripts/verify_builtin_ranges.awk
> @@ -0,0 +1,365 @@
> +#!/usr/bin/gawk -f
> +# SPDX-License-Identifier: GPL-2.0
> +# verify_builtin_ranges.awk: Verify address range data for builtin modules
> +# Written by Kris Van Hees <kris.van.hees@...cle.com>
> +#
> +# Usage: verify_builtin_ranges.awk modules.builtin.ranges System.map \
> +#                                 modules.builtin vmlinux.map vmlinux.o.map \
> +#                                 [ <build-dir> ]
> +#
> +
> +# Return the module name(s) (if any) associated with the given object.
> +#
> +# If we have seen this object before, return information from the cache.
> +# Otherwise, retrieve it from the corresponding .cmd file.
> +#
> +function get_module_info(fn, mod, obj, mfn, s) {
> +       if (fn in omod)
> +               return omod[fn];
> +
> +       if (match(fn, /\/[^/]+$/) == 0)
> +               return "";
> +
> +       obj = fn;
> +       mod = "";
> +       mfn = "";
> +       fn = kdir "/" substr(fn, 1, RSTART) "." substr(fn, RSTART + 1) ".cmd";
> +       if (getline s <fn == 1) {
> +               if (match(s, /DKBUILD_MODFILE=['"]+[^'"]+/) > 0) {
> +                       mfn = substr(s, RSTART + 16, RLENGTH - 16);
> +                       gsub(/['"]/, "", mfn);
> +
> +                       mod = mfn;
> +                       gsub(/([^/ ]*\/)+/, "", mod);
> +                       gsub(/-/, "_", mod);
> +               }
> +       } else {
> +               print "ERROR: Failed to read: " fn "\n\n" \
> +                     "  Invalid kernel build directory (" kdir ")\n" \
> +                     "  or its content does not match " ARGV[1] >"/dev/stderr";
> +               close(fn);
> +               total = 0;
> +               exit(1);
> +       }
> +       close(fn);
> +
> +       # A single module (common case) also reflects objects that are not part
> +       # of a module.  Some of those objects have names that are also a module
> +       # name (e.g. core).  We check the associated module file name, and if
> +       # they do not match, the object is not part of a module.
> +       if (mod !~ / /) {
> +               if (!(mod in mods))
> +                       return "";
> +               if (mods[mod] != mfn)
> +                       return "";
> +       }
> +
> +       # At this point, mod is a single (valid) module name, or a list of
> +       # module names (that do not need validation).
> +       omod[obj] = mod;
> +       close(fn);
> +
> +       return mod;
> +}
>



This code is copy-paste from scripts/generate_builtin_ranges.awk
So, my comments in 2/4 can apply to this patch, too.


Instead of adding a separate script,
we could add a "verify mode" option.


 scripts/generate_builtin_ranges.awk --verify ...


But, I do not know how much cleaner it will become.

I am not good at reviewing AWK code, but this
is how you go.




If this script were written in Python,
it would be easy and readable to
split logically-related code chunks into functions,
as follows:


def parse_module_builtin():
    ...


def parse_vmlinux_map_lld():
    ...


def parse_vmlinux_map_bfd():
    ...


def parse_vmlinux_o_map():
    ...



--
Best Regards
Masahiro Yamada

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ