Message-ID: <ZkTnwwyTF0WSMmqI@oracle.com>
Date: Wed, 15 May 2024 12:50:11 -0400
From: Kris Van Hees <kris.van.hees@...cle.com>
To: Masahiro Yamada <masahiroy@...nel.org>
Cc: Kris Van Hees <kris.van.hees@...cle.com>, linux-kernel@...r.kernel.org,
        linux-kbuild@...r.kernel.org, linux-modules@...r.kernel.org,
        linux-trace-kernel@...r.kernel.org,
        Steven Rostedt <rostedt@...dmis.org>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Jiri Olsa <olsajiri@...il.com>,
        Elena Zannoni <elena.zannoni@...cle.com>
Subject: Re: [PATCH v2 0/6] Generate address range data for built-in modules

On Mon, May 13, 2024 at 01:43:15PM +0900, Masahiro Yamada wrote:
> On Sun, May 12, 2024 at 7:42 AM Kris Van Hees <kris.van.hees@...cle.com> wrote:
> >
> > Especially for tracing applications, it is convenient to be able to
> > refer to a symbol using a <module name, symbol name> pair and to be able
> > to translate an address into a <module name, symbol name> pair.  But
> > that does not work if the module is built into the kernel because the
> > object files that comprise the built-in module implementation are simply
> > linked into the kernel image along with all other kernel object files.
> >
> > This is especially visible when providing tracing scripts for support
> > purposes, where the developer of the script targets a particular kernel
> > version, but does not have control over whether the target system has
> > a particular module as loadable module or built-in module.  When tracing
> > symbols within a module, referring to them by <module name, symbol name>
> > pairs is both convenient and aids symbol lookup.  But that naming will
> > not work when the module is built into the kernel on the target system,
> > because the module name information is then lost.
> >
> > Earlier work addressing this loss of information for built-in modules
> > involved adding module name information to the kallsyms data, but that
> > required more invasive code in the kernel proper.  This work never did
> > get merged into the kernel tree.
> >
> > All that is really needed is knowing whether a given address belongs to
> > a particular module (or multiple modules if they share an object file).
> > Or in other words, whether that address falls within an address range
> > that is associated with one or more modules.
> >
> > This patch series is based on Luis Chamberlain's patch to generate
> > modules.builtin.objs, associating built-in modules with their object
> > files.  Using this data, vmlinux.o.map and vmlinux.map can be parsed in
> > a single pass to generate a modules.builtin.ranges file with offset range
> > information (relative to the base address of the associated section) for
> > built-in modules.  The file gets installed along with the other
> > modules.builtin.* files.
> 
> 
> 
> I still do not want to see modules.builtin.objs.
> 
> 
> During the vmlinux.o.map parse, every time an object path
> is encountered, you can open the corresponding .cmd file.
> 
> 
> 
> Let's say, you have the following in vmlinux.o.map:
> 
> .text          0x00000000007d4fe0     0x46c8 drivers/i2c/i2c-core-base.o
> 
> 
> 
> You can check drivers/i2c/.i2c-core-base.o.cmd
> 
> 
> $ cat drivers/i2c/.i2c-core-base.o.cmd | tr ' ' '\n' | grep KBUILD_MODFILE
> -DKBUILD_MODFILE='"drivers/i2c/i2c-core"'
> 
> 
> Now you know this object is part of drivers/i2c/i2c-core
> (that is, its modname is "i2c-core")
> 
> 
> 
> 
> Next, you will get the following:
> 
>  .text          0x00000000007dc550     0x13c4 drivers/i2c/i2c-core-acpi.o
> 
> 
> $ cat drivers/i2c/.i2c-core-acpi.o.cmd | tr ' ' '\n' | grep KBUILD_MODFILE
> -DKBUILD_MODFILE='"drivers/i2c/i2c-core"'
> 
> 
> This one is also a part of drivers/i2c/i2c-core
>
> 
> You will get the address range of "i2c-core" without changing Makefiles.

Thank you for this suggestion.  I have this approach now implemented, making
use of both KBUILD_MODFILE and KBUILD_MODNAME (both are needed to conclusively
determine that an object belongs to a module).
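
For illustration only (this is not the code from the patch series), pulling
both defines out of the contents of a .cmd file could be sketched in Python
as below.  The function name modinfo_from_cmd, the regular expression and the
two shortened command lines are assumptions made for this example; the sketch
relies only on the -DKBUILD_MODFILE='"..."' and -DKBUILD_MODNAME='"..."' form
shown in this thread, and leaves the decision of how to combine the two
values to the caller.

```python
import re

# Matches -DKBUILD_MODFILE=... and -DKBUILD_MODNAME=..., tolerating the
# '"..."' quoting seen in the .cmd examples in this thread.
_DEFINE_RE = re.compile(r"-DKBUILD_(MODFILE|MODNAME)=['\"]*([^'\"\s]+)")

def modinfo_from_cmd(cmd_text):
    """Return (modfile, modname) found in cmd_text; either may be None."""
    found = {}
    for key, value in _DEFINE_RE.findall(cmd_text):
        found.setdefault(key, value)  # keep the first occurrence of each
    return found.get("MODFILE"), found.get("MODNAME")

# Hypothetical, heavily shortened command lines (not real .cmd contents):
# one for a C object and one for an object built from assembler source.
c_cmd = ("gcc -c -DKBUILD_MODFILE='\"arch/x86/boot/cmdline\"' "
         "-DKBUILD_MODNAME='\"cmdline\"' -o cmdline.o cmdline.c")
asm_cmd = "gcc -c -D__ASSEMBLY__ -o entry.o entry.S"

print(modinfo_from_cmd(c_cmd))    # both values present
print(modinfo_from_cmd(asm_cmd))  # neither define present
```

In this sketch, a command line that carries neither define simply yields two
None values, so the caller has to decide how to treat such objects.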

However, this is not catching objects that are compiled from assembler source,
because modfile_flags and modname_flags are not added to the assembler flags,
and thus KBUILD_MODFILE and KBUILD_MODNAME are not present in the .cmd file
for those objects.

It would seem that it is harmless to add those flags to assembler flags, so
would that be an acceptable solution?  It definitely would provide consistency
with non-asm objects.  And we already pass modfile and modname flags to the
non-asm builds for objects that most certainly do not belong in modules anyway,
e.g.

$ cat arch/x86/boot/.cmdline.o.cmd| tr ' ' '\n' | grep -- -DKBUILD_MOD
-DKBUILD_MODFILE='"arch/x86/boot/cmdline"'
-DKBUILD_MODNAME='"cmdline"'
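
Once an object can be mapped to a module name, the vmlinux.o.map lines quoted
earlier can be folded into per-module address ranges.  The following Python
sketch is only an illustration under simplifying assumptions: it handles just
the single-line "section address size object" form shown in this thread,
merges runs of consecutive objects that map to the same module within one
section, and takes the object-to-module lookup as a caller-supplied function
(here a plain dict stands in for the .cmd lookup).  The names module_ranges,
sample_lines and sample_objs are hypothetical.

```python
import re

# One input-section line in the form quoted in this thread, e.g.
#   .text  0x00000000007d4fe0  0x46c8 drivers/i2c/i2c-core-base.o
_MAP_RE = re.compile(
    r"^\s*(\.\S+)\s+0x([0-9a-fA-F]+)\s+0x([0-9a-fA-F]+)\s+(\S+\.o)\s*$")

def module_ranges(map_lines, module_of):
    """Yield (section, module, start, end) for runs of consecutive objects.

    module_of(path) returns a module name, or None when the object is not
    part of a module.  The end address is exclusive.
    """
    current = None  # [section, module, start, end] of the run being built
    for line in map_lines:
        m = _MAP_RE.match(line)
        if not m:
            continue  # skip anything that is not a simple object line
        section = m.group(1)
        addr = int(m.group(2), 16)
        size = int(m.group(3), 16)
        mod = module_of(m.group(4))
        if current and current[0] == section and current[1] == mod:
            current[3] = addr + size  # extend the current run
            continue
        if current and current[1] is not None:
            yield tuple(current)
        current = [section, mod, addr, addr + size]
    if current and current[1] is not None:
        yield tuple(current)

# The two map lines quoted in this thread, with a dict as stand-in lookup.
sample_lines = [
    ".text          0x00000000007d4fe0     0x46c8 drivers/i2c/i2c-core-base.o",
    " .text          0x00000000007dc550     0x13c4 drivers/i2c/i2c-core-acpi.o",
]
sample_objs = {
    "drivers/i2c/i2c-core-base.o": "i2c-core",
    "drivers/i2c/i2c-core-acpi.o": "i2c-core",
}

for sec, mod, start, end in module_ranges(sample_lines, sample_objs.get):
    print(f"{sec} {start:#x}-{end:#x} {mod}")
```

The sketch reports absolute addresses as they appear in the map; turning them
into offsets relative to the section base, as the cover letter describes,
would be an additional step.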

> You still need to modify scripts/Makefile.vmlinux(_o)
> but you can implement everything else in your script,
> although I did not fully understand the gawk script.
> 
> 
> Now, you can use Python if you like:
> 
>   https://lore.kernel.org/lkml/20240512-python-version-v2-1-382870a1fa1d@linaro.org/
> 
> Presumably, python code will be more readable for many people.
> 
> 
> GNU awk is not documented in Documentation/process/changes.rst
> If you insist on using gawk, you need to add it to the doc.
> 
> 
> 
> 
> 
> Having said that, I often hope to filter traced functions
> by an object path instead of a modname because modname
> filtering is only useful for tristate code.
> For example, filter by "path:drivers/i2c/" or "path:drivers/i2c/i2c-core*"
> rather than "mod:i2c-core"
> 
> <object path, symbol name> reference will be useful for always-builtin code.
> 
> 
> 
> 
> >
> > The impact on the kernel build is minimal because everything is done
> > using a single-pass AWK script.  The generated data size is minimal as
> > well, (depending on the exact kernel configuration) usually in the range
> > of 500-700 lines, with a file size of 20-40KB.
> >
> > Changes since v1:
> >  - Renamed CONFIG_BUILTIN_RANGES to CONFIG_BUILTIN_MODULE_RANGES
> >  - Moved the config option to the tracers section
> >  - 2nd arg to generate_builtin_ranges.awk should be vmlinux.map
> >
> > Kris Van Hees (5):
> >   trace: add CONFIG_BUILTIN_MODULE_RANGES option
> >   kbuild: generate a linker map for vmlinux.o
> >   module: script to generate offset ranges for builtin modules
> >   kbuild: generate modules.builtin.ranges when linking the kernel
> >   module: add install target for modules.builtin.ranges
> >
> > Luis Chamberlain (1):
> >   kbuild: add modules.builtin.objs
> >
> >  .gitignore                          |   2 +-
> >  Documentation/dontdiff              |   2 +-
> >  Documentation/kbuild/kbuild.rst     |   5 ++
> >  Makefile                            |   8 +-
> >  include/linux/module.h              |   4 +-
> >  kernel/trace/Kconfig                |  17 ++++
> >  scripts/Makefile.lib                |   5 +-
> >  scripts/Makefile.modinst            |  11 ++-
> >  scripts/Makefile.vmlinux            |  17 ++++
> >  scripts/Makefile.vmlinux_o          |  18 ++++-
> >  scripts/generate_builtin_ranges.awk | 149 ++++++++++++++++++++++++++++++++++++
> >  11 files changed, 228 insertions(+), 10 deletions(-)
> >  create mode 100755 scripts/generate_builtin_ranges.awk
> >
> >
> > base-commit: dd5a440a31fae6e459c0d6271dddd62825505361
> > --
> > 2.42.0
> >
> >
> 
> 
> -- 
> Best Regards
> Masahiro Yamada
