linux-kernel - Re: [PATCH v3] selftests: add new kallsyms selftests

Message-ID: <CAMuHMdVG3Z63BruhrnQtSadCnaKZ+hpwFDJDnitXST8fRNYoLQ@mail.gmail.com>
Date: Thu, 28 Nov 2024 15:10:34 +0100
From: Geert Uytterhoeven <geert@...ux-m68k.org>
To: Luis Chamberlain <mcgrof@...nel.org>
Cc: linux-modules@...r.kernel.org, linux-kernel@...r.kernel.org, 
	petr.pavlu@...e.com, samitolvanen@...gle.com, da.gomez@...sung.com, 
	masahiroy@...nel.org, deller@....de, linux-arch@...r.kernel.org, 
	live-patching@...r.kernel.org, kris.van.hees@...cle.com
Subject: Re: [PATCH v3] selftests: add new kallsyms selftests

Hi Luis,

On Mon, Oct 21, 2024 at 9:33 PM Luis Chamberlain <mcgrof@...nel.org> wrote:
> We lack find_symbol() selftests, so add one. This lets us stress test
> improvements easily on find_symbol() or optimizations. It also inherently
> allows us to test the limits of kallsyms on Linux today.
>
> We test a pathological use case for kallsyms by introducing modules
> which are automatically written for us with a larger number of symbols.
> We have 4 kallsyms test modules:
>
> A: has KALLSYSMS_NUMSYMS exported symbols
> B: uses one of A's symbols
> C: adds KALLSYMS_SCALE_FACTOR * KALLSYSMS_NUMSYMS exported
> D: adds 2 * the symbols than C
>
> By using anything much larger than KALLSYSMS_NUMSYMS as 10,000 and
> KALLSYMS_SCALE_FACTOR of 8 we segfault today. So we're capped at
> around 160000 symbols somehow today. We can inspect that issue at
> our leisure later, but for now the real value to this test is that
> this will easily allow us to test improvements on find_symbol().
>
> We want to enable this test on allyesmodconfig builds so we can't
> use this combination, so instead just use a safe value for now and
> be informative on the Kconfig symbol documentation about where our
> thresholds are for testers. We default then to KALLSYSMS_NUMSYMS of
> just 100 and KALLSYMS_SCALE_FACTOR of 8.
>
> On x86_64 we can use perf, for other architectures we just use 'time'
> and allow for customizations. For example a future enhancement could
> be done for parisc to check for unaligned accesses, which trigger
> special exception handler assembler code inside the kernel.
> The negative impact on performance is so large on parisc that it
> keeps track of its accesses on /proc/cpuinfo as UAH:
>
> IRQ:       CPU0       CPU1
> 3:       1332          0         SuperIO  ttyS0
> 7:    1270013          0         SuperIO  pata_ns87415
> 64:  320023012  320021431             CPU  timer
> 65:   17080507   20624423             CPU  IPI
> UAH:   10948640      58104   Unaligned access handler traps
>
> While at it, this tidies up lib/ test modules to allow us to have
> a new directory for them. The amount of test modules under lib/
> is insane.
>
> This should also hopefully showcase how to start doing basic
> self module writing code, which may be more useful for more complex
> cases later in the future.
>
> Signed-off-by: Luis Chamberlain <mcgrof@...nel.org>

Thanks for your patch, which is now commit 84b4a51fce4ccc66
("selftests: add new kallsyms selftests") upstream.

> @@ -2903,6 +2903,111 @@ config TEST_KMOD
>
>           If unsure, say N.
>
> +config TEST_RUNTIME
> +       bool
> +
> +config TEST_RUNTIME_MODULE
> +       bool
> +
> +config TEST_KALLSYMS
> +       tristate "module kallsyms find_symbol() test"
> +       depends on m
> +       select TEST_RUNTIME
> +       select TEST_RUNTIME_MODULE
> +       select TEST_KALLSYMS_A
> +       select TEST_KALLSYMS_B
> +       select TEST_KALLSYMS_C
> +       select TEST_KALLSYMS_D
> +       help
> +         This allows us to stress test find_symbol() through the kallsyms
> +         used to place symbols on the kernel ELF kallsyms and modules kallsyms
> +         where we place kernel symbols such as exported symbols.
> +
> +         We have four test modules:
> +
> +         A: has KALLSYSMS_NUMSYMS exported symbols
> +         B: uses one of A's symbols
> +         C: adds KALLSYMS_SCALE_FACTOR * KALLSYSMS_NUMSYMS exported
> +         D: adds 2 * the symbols than C
> +
> +         We stress test find_symbol() through two means:
> +
> +         1) Upon load of B it will trigger simplify_symbols() to look for the
> +         one symbol it uses from the module A with tons of symbols. This is an
> +         indirect way for us to have B call resolve_symbol_wait() upon module
> +         load. This will eventually call find_symbol() which will eventually
> +         try to find the symbols used with find_exported_symbol_in_section().
> +         find_exported_symbol_in_section() uses bsearch() so a binary search
> +         for each symbol. Binary search will at worst be O(log(n)) so the
> +         larger TEST_MODULE_KALLSYSMS the worse the search.
> +
> +         2) The selftests should load C first, before B. Upon B's load towards
> +         the end right before we call module B's init routine we get
> +         complete_formation() called on the module. That will first check
> +         for duplicate symbols with the call to verify_exported_symbols().
> +         That is when we'll force iteration on module C's insane symbol list.
> +         Since it has 10 * KALLSYMS_NUMSYMS it means we can first test
> +         just loading B without C. The amount of time it takes to load C Vs
> +         B can give us an idea of the impact growth of the symbol space and
> +         give us projection. Module A only uses one symbol from B so to allow
> +         this scaling in module C to be proportional, if it used more symbols
> +         then the first test would be doing more and increasing just the
> +         search space would be slightly different. The last module, module D
> +         will just increase the search space by twice the number of symbols in
> +         C so to allow for full projects.
> +
> +         tools/testing/selftests/module/find_symbol.sh
> +
> +         The current defaults will incur a build delay of about 7 minutes
> +         on an x86_64 with only 8 cores. Enable this only if you want to
> +         stress test find_symbol() with thousands of symbols. At the same
> +         time this is also useful to test building modules with thousands of
> +         symbols, and if BTF is enabled this also stress tests adding BTF
> +         information for each module. Currently enabling many more symbols
> +         will segfault the build system.

Despite the warning, I gave this a try on m68k (cross-compiled on i7 ;-).
However, I didn't notice any extraordinary build times.

Also, when running the test manually on ARAnyM, everything runs
in the blink of an eye.  I didn't use the script, but ran all commands
manually.  I tried insmodding a/b/c/d, c/a/b, a/c/d/b.

Is this expected?
Thanks!

$ wc -l lib/tests/module/test_kallsyms_*.c
   233 lib/tests/module/test_kallsyms_a.c
    22 lib/tests/module/test_kallsyms_a.mod.c
    35 lib/tests/module/test_kallsyms_b.c
    21 lib/tests/module/test_kallsyms_b.mod.c
  1633 lib/tests/module/test_kallsyms_c.c
    21 lib/tests/module/test_kallsyms_c.mod.c
  3233 lib/tests/module/test_kallsyms_d.c
    21 lib/tests/module/test_kallsyms_d.mod.c
  5219 total

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds