Message-ID: <YdSI9LmZE+FZAi1K@archlinux-ax161>
Date:   Tue, 4 Jan 2022 10:50:44 -0700
From:   Nathan Chancellor <nathan@...nel.org>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "David S. Miller" <davem@...emloft.net>,
        Ard Biesheuvel <ardb@...nel.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        Al Viro <viro@...iv.linux.org.uk>, llvm@...ts.linux.dev
Subject: Re: [PATCH 0000/2297] [ANNOUNCE, RFC] "Fast Kernel Headers" Tree
 -v1: Eliminate the Linux kernel's "Dependency Hell"

On Tue, Jan 04, 2022 at 11:47:30AM +0100, Ingo Molnar wrote:
> 
> * Nathan Chancellor <nathan@...nel.org> wrote:
> 
> > Hi Ingo,
> > 
> > On Sun, Jan 02, 2022 at 10:57:35PM +0100, Ingo Molnar wrote:
> > > Before going into details about how this tree solves 'dependency hell' 
> > > exactly, here's the current kernel build performance gain with 
> > > CONFIG_FAST_HEADERS=y enabled (and with CONFIG_KALLSYMS_FAST=y enabled as 
> > > well - see below), using a stock x86 Linux distribution's .config with all 
> > > modules built into the vmlinux:
> > > 
> > >   #
> > >   # Performance counter stats for 'make -j96 vmlinux' (3 runs):
> > >   #
> > >   # (Elapsed time in seconds):
> > >   #
> > > 
> > >   v5.16-rc7:            231.34 +- 0.60 secs, 15.5 builds/hour    # [ vanilla baseline ]
> > >   -fast-headers-v1:     129.97 +- 0.51 secs, 27.7 builds/hour    # +78.0% improvement
> > 
> > This is really impressive; as someone who constantly builds large
> > kernels for test coverage, I am excited about getting results sooner.
> > Testing on an 80-core arm64 server (the fastest machine I have access to
> > at the moment) with LLVM, I can see anywhere from 18% to 35% improvement.
> > 
> > 
> > Benchmark 1: ARCH=arm64 defconfig (linux)
> >   Time (mean ± σ):     97.159 s ±  0.246 s    [User: 4828.383 s, System: 611.256 s]
> >   Range (min … max):   96.900 s … 97.648 s    10 runs
> > 
> > Benchmark 2: ARCH=arm64 defconfig (linux-fast-headers)
> >   Time (mean ± σ):     76.300 s ±  0.107 s    [User: 3149.986 s, System: 436.487 s]
> >   Range (min … max):   76.117 s … 76.467 s    10 runs
> 
> That looks good, thanks for giving it a test, and thanks for all the fixes! 
> :-)
> 
> Note that on ARM64 the elapsed time improvement is 'only' 18-35%, because 
> the triple-linking of vmlinux serializes much of the build & ARM64 
> doesn't have the kallsyms-objtool feature yet.
> 
> But we can already see how much faster it became, from the user+system time 
> spent building the kernel:
> 
>            vanilla: 4828.383 s + 611.256 s = 5439.639 s
>   -fast-headers-v1: 3149.986 s + 436.487 s = 3586.473 s
> 
> That's a +51% speedup. :-)
> 
> With CONFIG_KALLSYMS_FAST=y on x86, the final link gets faster by about 
> 60%-70%, so the header improvements will more directly show up in elapsed 
> time as well.
> 
> Plus I spent more time looking at x86 header bloat than at ARM64 header 
> bloat. In the end I think the improvement could probably be moved into the 
> broad 60-70% range that I see on x86.
> 
> All the other ARM64 tests show a 37%-43% improvement in CPU time used:
> 
> > Benchmark 1: ARCH=arm64 allmodconfig (linux)
> >   Time (mean ± σ):     390.106 s ±  0.192 s    [User: 23893.382 s, System: 2802.413 s]
> >   Range (min … max):   389.942 s … 390.513 s    7 runs
> > 
> > Benchmark 2: ARCH=arm64 allmodconfig (linux-fast-headers)
> >   Time (mean ± σ):     288.066 s ±  0.621 s    [User: 16436.098 s, System: 2117.352 s]
> >   Range (min … max):   287.131 s … 288.982 s    7 runs
> 
> # (23893.382+2802.413)/(16436.098+2117.352) = +43% in throughput.
> 
> 
> > Benchmark 1: ARCH=arm64 allyesconfig (linux)
> >   Time (mean ± σ):     557.752 s ±  1.019 s    [User: 21227.404 s, System: 2226.121 s]
> >   Range (min … max):   555.833 s … 558.775 s    7 runs
> > 
> > Benchmark 2: ARCH=arm64 allyesconfig (linux-fast-headers)
> >   Time (mean ± σ):     473.815 s ±  1.793 s    [User: 15351.991 s, System: 1689.630 s]
> >   Range (min … max):   471.542 s … 476.830 s    7 runs
> 
> # (21227.404+2226.121)/(15351.991+1689.630) = +37%
> 
> 
> > Benchmark 1: ARCH=x86_64 defconfig (linux)
> >   Time (mean ± σ):     41.122 s ±  0.190 s    [User: 1700.206 s, System: 205.555 s]
> >   Range (min … max):   40.966 s … 41.515 s    7 runs
> > 
> > Benchmark 2: ARCH=x86_64 defconfig (linux-fast-headers)
> >   Time (mean ± σ):     36.357 s ±  0.183 s    [User: 1134.252 s, System: 152.396 s]
> >   Range (min … max):   35.983 s … 36.534 s    7 runs
> 
> 
> # (1700.206+205.555)/(1134.252+152.396) = +48%
> 
> > Summary
> >   'ARCH=x86_64 defconfig (linux-fast-headers)' ran
> >     1.13 ± 0.01 times faster than 'ARCH=x86_64 defconfig (linux)'
> 
> Now this x86-defconfig result you got is a bit weird - it *should* have 
> been around 50% faster on x86 in terms of elapsed time too.
> 
> Here's how x86-64 defconfig looks on my system - with 128 GB RAM & 
> fast NVDIMMs and 64 CPUs:
> 
>    #
>    # -v5.16-rc8:
>    #
> 
>    $ perf stat --repeat 3 -e instructions,cycles,cpu-clock --sync --pre "make clean >/dev/null" make -j96 vmlinux >/dev/null
> 
>    Performance counter stats for 'make -j96 vmlinux' (3 runs):
> 
>    4,906,953,379,372      instructions              #    0.90  insn per cycle           ( +-  0.00% )
>    5,475,163,448,391      cycles                    #    3.898 GHz                      ( +-  0.01% )
>         1,404,614.64 msec cpu-clock                 #   45.864 CPUs utilized            ( +-  0.01% )
> 
>              30.6258 +- 0.0337 seconds time elapsed  ( +-  0.11% )
> 
>    #
>    # -fast-headers-v1:
>    #
> 
>    $ make defconfig
>    $ grep KALLSYMS_FAST .config
>    CONFIG_KALLSYMS_FAST=y
> 
>    $ perf stat --repeat 3 -e instructions,cycles,cpu-clock --sync --pre "make clean >/dev/null" make -j96 vmlinux >/dev/null
> 
>     Performance counter stats for 'make -j96 vmlinux' (3 runs):
> 
>      3,500,079,269,120      instructions              #    0.90  insn per cycle           ( +-  0.00% )
>      3,872,081,278,824      cycles                    #    3.895 GHz                      ( +-  0.10% )
>             993,448.13 msec cpu-clock                 #   47.306 CPUs utilized            ( +-  0.10% )
> 
>              21.0004 +- 0.0265 seconds time elapsed  ( +-  0.13% )
> 
> That's a +45.8% speedup in elapsed time, and a +41.4% improvement in 
> cpu-clock utilization.
> 
> I'm wondering whether your system has some sort of bottleneck?

Yes, it is entirely possible. That testing was done on Equinix's
c3.large.arm server and I have noticed at times that single-threaded
tasks seem to take a little bit longer than on my x86_64 box.

https://metal.equinix.com/product/servers/c3-large-arm/

The all{mod,yes}config tests on that box had a much more noticeable
improvement, along the lines of what you were expecting:


Benchmark 1: ARCH=x86_64 allmodconfig (linux)
  Time (mean ± σ):     387.575 s ±  0.288 s    [User: 23916.296 s, System: 2814.850 s]
  Range (min … max):   387.252 s … 388.295 s    10 runs

Benchmark 2: ARCH=x86_64 allmodconfig (linux-fast-headers)
  Time (mean ± σ):     255.934 s ±  0.972 s    [User: 15130.494 s, System: 2095.091 s]
  Range (min … max):   254.655 s … 257.357 s    10 runs

Summary
  'ARCH=x86_64 allmodconfig (linux-fast-headers)' ran
    1.51 ± 0.01 times faster than 'ARCH=x86_64 allmodconfig (linux)'

# (23916.296+2814.850)/(15130.494+2095.091) = +55.18%


Benchmark 1: ARCH=x86_64 allyesconfig (linux)
  Time (mean ± σ):     568.027 s ±  1.071 s    [User: 21985.096 s, System: 2357.516 s]
  Range (min … max):   566.769 s … 569.801 s    10 runs

Benchmark 2: ARCH=x86_64 allyesconfig (linux-fast-headers)
  Time (mean ± σ):     381.248 s ±  0.919 s    [User: 14916.766 s, System: 1728.218 s]
  Range (min … max):   379.746 s … 382.852 s    10 runs

Summary
  'ARCH=x86_64 allyesconfig (linux-fast-headers)' ran
    1.49 ± 0.00 times faster than 'ARCH=x86_64 allyesconfig (linux)'

# (21985.096+2357.516)/(14916.766+1728.218) = +46.25%
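
For reference, the tables above are hyperfine output; the invocation is
along these lines - a sketch from memory rather than a transcript, with
the prepare step and tree paths as assumptions:

   $ hyperfine -r 10 \
       -p 'make -s -C linux clean && make -s -C linux-fast-headers clean' \
       -n 'ARCH=x86_64 allmodconfig (linux)' \
       'make -s -j$(nproc) -C linux ARCH=x86_64 all' \
       -n 'ARCH=x86_64 allmodconfig (linux-fast-headers)' \
       'make -s -j$(nproc) -C linux-fast-headers ARCH=x86_64 all'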

> One thing I do when running benchmarks, though, is to switch the cpufreq 
> governor to 'performance', via something like:
> 
>    NR_CPUS=$(nproc --all)
> 
>    curr=$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor)
>    next=performance
> 
>    echo "# setting all $NR_CPUS CPUs from '"$curr"' to the '"$next"' governor"
> 
>    for ((cpu=0; cpu<$NR_CPUS; cpu++)); do
>      G=/sys/devices/system/cpu/cpu$cpu/cpufreq/scaling_governor
>      [ -f $G ] && echo $next > $G
>    done
> 
> This minimizes the amount of noise across iterations and makes the results 
> more dependable:
> 
>              30.6258 +- 0.0337 seconds time elapsed  ( +-  0.11% )
>              21.0004 +- 0.0265 seconds time elapsed  ( +-  0.13% )

Good point. With my main box (AMD EPYC 7502P), with the performance governor...
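
FWIW, where the cpupower utility is available, that whole loop collapses
into a one-liner that flips the same sysfs knob:

   $ sudo cpupower frequency-set -g performance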

GCC:

Benchmark 1: ARCH=x86_64 defconfig (linux)
  Time (mean ± σ):     48.685 s ±  0.049 s    [User: 1969.835 s, System: 204.166 s]
  Range (min … max):   48.620 s … 48.782 s    10 runs

Benchmark 2: ARCH=x86_64 defconfig (linux-fast-headers)
  Time (mean ± σ):     46.797 s ±  0.119 s    [User: 1403.854 s, System: 154.336 s]
  Range (min … max):   46.620 s … 47.052 s    10 runs

Summary
  'ARCH=x86_64 defconfig (linux-fast-headers)' ran
    1.04 ± 0.00 times faster than 'ARCH=x86_64 defconfig (linux)'

LLVM:

Benchmark 1: ARCH=x86_64 defconfig (linux)
  Time (mean ± σ):     51.816 s ±  0.079 s    [User: 2208.577 s, System: 200.410 s]
  Range (min … max):   51.671 s … 51.900 s    10 runs

Benchmark 2: ARCH=x86_64 defconfig (linux-fast-headers)
  Time (mean ± σ):     46.806 s ±  0.062 s    [User: 1438.972 s, System: 154.846 s]
  Range (min … max):   46.696 s … 46.917 s    10 runs

Summary
  'ARCH=x86_64 defconfig (linux-fast-headers)' ran
    1.11 ± 0.00 times faster than 'ARCH=x86_64 defconfig (linux)'

$ rg KALLSYMS .config
246:CONFIG_KALLSYMS=y
247:# CONFIG_KALLSYMS_ALL is not set
248:CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
249:CONFIG_KALLSYMS_BASE_RELATIVE=y
250:CONFIG_KALLSYMS_FAST=y
706:CONFIG_HAVE_OBJTOOL_KALLSYMS=y

It seems like everything is working right, but maybe the build is so
short that there is not much room for the difference to show? The
user+system time still drops substantially (2174 s down to 1558 s with
GCC, about 28% less CPU time), so presumably the serial tail of a
defconfig build, such as the final link, dominates the elapsed time here.

> > > With the fast-headers kernel that's down to ~36,000 lines of code, 
> > > almost a factor of 3 reduction:
> > > 
> > >   # fast-headers-v1:
> > >   kepler:~/mingo.tip.git> wc -l kernel/pid.i
> > >   35941 kernel/pid.i
> > 
> > Coming from someone who often has to reduce a preprocessed kernel source 
> > file with creduce/cvise to report compiler bugs, this will be a very 
> > welcomed change, as those tools will have to do less work, and I can get 
> > my reports done faster.
> 
> That's nice, didn't think of that side effect.
> 
> Could you perhaps measure this too, to see how much of a benefit it is?

Yes, next time that I run into a bug that I have to use those tools on,
I will see if I can benchmark the difference!
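
The measurement itself should be easy - something like the following in
each tree, with 'repro.sh' standing in as a hypothetical interestingness
test that checks for the compiler bug:

   $ time cvise repro.sh kernel/pid.i

Given that the fast-headers pid.i is roughly a third the size (per the
numbers quoted above), the reduction should have correspondingly less
text to chew through.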

> > ########################################################################
> > 
> > I took the series for a spin with clang and GCC on arm64 and x86_64 and
> > I found a few warnings/errors.
> 
> Thank you!
> 
> > 1. Position of certain attributes
> > 
> > In some commits, you move the cacheline_aligned attributes from after
> > the closing brace on structures to before the struct keyword, which
> > causes clang to warn (and error with CONFIG_WERROR):
> > 
> > In file included from arch/arm64/kernel/asm-offsets.c:9:
> > In file included from arch/arm64/kernel/../../../kernel/sched/per_task_area_struct.h:33:
> > In file included from ./include/linux/perf_event_api.h:17:
> > In file included from ./include/linux/perf_event_types.h:41:
> > In file included from ./include/linux/ftrace.h:18:
> > In file included from ./arch/arm64/include/asm/ftrace.h:53:
> > In file included from ./include/linux/compat.h:11:
> > ./include/linux/fs_types.h:997:1: error: attribute '__aligned__' is ignored, place it after "struct" to apply attribute to type declaration [-Werror,-Wignored-attributes]
> > ____cacheline_aligned
> > ^
> > ./include/linux/cache.h:41:46: note: expanded from macro '____cacheline_aligned'
> > #define ____cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES)))
> 
> Yeah, so this is a *really* stupid warning from Clang.
> 
> Putting the attribute at the tail of the struct definition risks 
> hard-to-track-down bugs when a <linux/cache.h> inclusion is missing, a 
> scenario I pointed out in 
> this commit:
> 
>     headers/deps: dcache: Move the ____cacheline_aligned attribute to the head of the definition
>     
>     When changing <linux/dcache.h> I removed the <linux/spinlock_api.h> header,
>     which caused a couple hundred mysterious, somewhat obscure link-time errors:
>     
>       ld: net/sctp/tsnmap.o:(.bss+0x0): multiple definition of `____cacheline_aligned_in_smp'; init/do_mounts_rd.o:(.bss+0x0): first defined here
>       ld: net/sctp/tsnmap.o:(.bss+0x40): multiple definition of `____cacheline_aligned'; init/do_mounts_rd.o:(.bss+0x40): first defined here
>       ld: net/sctp/debug.o:(.bss+0x0): multiple definition of `____cacheline_aligned_in_smp'; init/do_mounts_rd.o:(.bss+0x0): first defined here
>       ld: net/sctp/debug.o:(.bss+0x40): multiple definition of `____cacheline_aligned'; init/do_mounts_rd.o:(.bss+0x40): first defined here
>     
>     After a bit of head-scratching, what happened is that 'struct dentry_operations'
>     has the ____cacheline_aligned attribute at the tail of the type definition -
>     which turned into a stray variable definition when <linux/cache.h> was no
>     longer included - it had previously been pulled into <linux/dcache.h>
>     indirectly via <linux/spinlock_api.h>.
>     
>     There were no compile time errors, only link time errors.
>     
>     Move the attribute to the head of the definition, in which case
>     a missing <linux/cache.h> inclusion creates an immediate build failure:
>     
>       In file included from ./include/linux/fs.h:9,
>                        from ./include/linux/fsverity.h:14,
>                        from fs/verity/fsverity_private.h:18,
>                        from fs/verity/read_metadata.c:8:
>       ./include/linux/dcache.h:132:22: error: expected ‘;’ before ‘struct’
>         132 | ____cacheline_aligned
>             |                      ^
>             |                      ;
>         133 | struct dentry_operations {
>             | ~~~~~~
>     
>     No change in functionality.
>     
>     Signed-off-by: Ingo Molnar <mingo@...nel.org>
> 
> Can this Clang warning be disabled?

I'll comment on this in the other thread.
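
For anyone who wants to poke at the trade-off outside the kernel, here is
a toy reconstruction of the failure mode from the commit message above (my
own sketch, not kernel code):

   $ cat >tu.c <<'EOF'
   /* Tail position, with the <linux/cache.h> inclusion "forgotten" so that
    * ____cacheline_aligned is NOT defined as a macro: this compiles, but
    * silently declares a global variable named ____cacheline_aligned. */
   struct ops { int x; } ____cacheline_aligned;
   EOF
   $ cp tu.c tu2.c
   $ echo 'int main(void) { return 0; }' >main.c
   $ gcc -fno-common tu.c tu2.c main.c
   # -> ld: multiple definition of `____cacheline_aligned' - the obscure
   #    link-time failure, with no compile-time diagnostic at all

   $ cat >head.c <<'EOF'
   /* Head position with the macro missing: an immediate parse error, as
    * in the commit message above. (When the macro IS defined, Clang warns
    * that the attribute is ignored in this position - the conflict being
    * discussed here.) */
   ____cacheline_aligned
   struct ops { int x; };
   EOF
   $ gcc -c head.c
   # -> error: expected ';' before 'struct'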

> > 2. Error with CONFIG_SHADOW_CALL_STACK
> 
> So this feature depends on Clang:
> 
>  # Supported by clang >= 7.0
>  config CC_HAVE_SHADOW_CALL_STACK
>          def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
> 
> No way to activate it under my GCC cross-build toolchain, right?
> 
> But ... I hacked the build mode on with GCC using this patch:
> 
> From: Ingo Molnar <mingo@...nel.org>
> Date: Tue, 4 Jan 2022 11:26:09 +0100
> Subject: [PATCH] DO NOT MERGE: Enable SHADOW_CALL_STACK on GCC builds, for build testing
> 
> NOT-Signed-off-by: Ingo Molnar <mingo@...nel.org>
> ---
>  Makefile           | 2 +-
>  arch/Kconfig       | 2 +-
>  arch/arm64/Kconfig | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 16d7f83ac368..bbab462e7509 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -888,7 +888,7 @@ LDFLAGS_vmlinux += --gc-sections
>  endif
>  
>  ifdef CONFIG_SHADOW_CALL_STACK
> -CC_FLAGS_SCS	:= -fsanitize=shadow-call-stack
> +CC_FLAGS_SCS	:=
>  KBUILD_CFLAGS	+= $(CC_FLAGS_SCS)
>  export CC_FLAGS_SCS
>  endif
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 4e56f66fdbcf..2103d9da4fe1 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -605,7 +605,7 @@ config ARCH_SUPPORTS_SHADOW_CALL_STACK
>  
>  config SHADOW_CALL_STACK
>  	bool "Clang Shadow Call Stack"
> -	depends on CC_IS_CLANG && ARCH_SUPPORTS_SHADOW_CALL_STACK
> +	depends on ARCH_SUPPORTS_SHADOW_CALL_STACK
>  	depends on DYNAMIC_FTRACE_WITH_REGS || !FUNCTION_GRAPH_TRACER
>  	help
>  	  This option enables Clang's Shadow Call Stack, which uses a
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index c4207cf9bb17..952f3e56e0a7 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1183,7 +1183,7 @@ config ARCH_HAS_FILTER_PGPROT
>  
>  # Supported by clang >= 7.0
>  config CC_HAVE_SHADOW_CALL_STACK
> -	def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
> +	def_bool y
>  
>  config PARAVIRT
>  	bool "Enable paravirtualization code"
> 
> 
> And I was able to trigger at least some of the build errors you saw:
> 
>   In file included from kernel/scs.c:15:
>   ./include/linux/scs.h: In function 'scs_task_reset':
>   ./include/linux/scs.h:26:34: error: implicit declaration of function 'task_thread_info' [-Werror=implicit-function-declaration]
> 
> This is fixed with:
> 
> diff --git a/kernel/scs.c b/kernel/scs.c
> index ca9e707049cb..719ab53adc8a 100644
> --- a/kernel/scs.c
> +++ b/kernel/scs.c
> @@ -5,6 +5,7 @@
>   * Copyright (C) 2019 Google LLC
>   */
>  
> +#include <linux/sched/thread_info_api.h>
>  #include <linux/sched.h>
>  #include <linux/mm_page_address.h>
>  #include <linux/mm_api.h>
> 
> 
> Then there's the build failure in init/main.c:
> 
> > It looks like on mainline, init_shadow_call_stack is defined and used 
> > in init/init_task.c, but now it is used in init/main.c with no
> > declaration to allow the compiler to find the definition. I guess moving
> > init_shadow_call_stack out of init/init_task.c to somewhere more common
> > would fix this, but it depends on SCS_SIZE, which is defined in
> > include/linux/scs.h, and as soon as I tried to include that in another
> > file, the build broke further... Any ideas you have would be appreciated
> > :) For benchmarking purposes, I just disabled CONFIG_SHADOW_CALL_STACK.
> 
> So I see:
> 
> In file included from ./include/linux/thread_info.h:63,
>                  from ./arch/arm64/include/asm/smp.h:32,
>                  from ./include/linux/smp_api.h:15,
>                  from ./include/linux/percpu.h:6,
>                  from ./include/linux/softirq.h:8,
>                  from init/main.c:17:
> init/main.c: In function 'init_per_task_early':
> ./arch/arm64/include/asm/thread_info.h:113:27: error: 'init_shadow_call_stack' undeclared (first use in this function)
>   113 |         .scs_base       = init_shadow_call_stack,                       \
>       |                           ^~~~~~~~~~~~~~~~~~~~~~
> 
> This looks pretty straightforward - does this patch solve it?
> 
>  include/linux/scs.h | 3 +++
>  init/main.c         | 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/include/linux/scs.h b/include/linux/scs.h
> index 18122d9e17ff..863932a9347a 100644
> --- a/include/linux/scs.h
> +++ b/include/linux/scs.h
> @@ -8,6 +8,7 @@
>  #ifndef _LINUX_SCS_H
>  #define _LINUX_SCS_H
>  
> +#include <linux/sched/thread_info_api.h>
>  #include <linux/gfp.h>
>  #include <linux/poison.h>
>  #include <linux/sched.h>
> @@ -25,6 +26,8 @@
>  #define task_scs(tsk)		(task_thread_info(tsk)->scs_base)
>  #define task_scs_sp(tsk)	(task_thread_info(tsk)->scs_sp)
>  
> +extern unsigned long init_shadow_call_stack[SCS_SIZE / sizeof(long)];
> +
>  void *scs_alloc(int node);
>  void scs_free(void *s);
>  void scs_init(void);
> diff --git a/init/main.c b/init/main.c
> index c9eb3ecbe18c..74ccad445009 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -12,6 +12,7 @@
>  
>  #define DEBUG		/* Enable initcall_debug */
>  
> +#include <linux/scs.h>
>  #include <linux/workqueue_api.h>
>  #include <linux/sysctl.h>
>  #include <linux/softirq.h>
> 
> I've applied these fixes; with that, CONFIG_SHADOW_CALL_STACK=y builds fine 
> on ARM64 - but I performed no runtime testing.
> 
> I've backmerged this into:
> 
>     headers/deps: per_task, arm64, x86: Convert task_struct::thread to a per_task() field
> 
> where this bug originated from.
> 
> I.e. I think the fix was simply to make main.c aware of the array, now that 
> the INIT_THREAD initialization is done there.

Yes, that seems right.

Unfortunately, while the kernel now builds, it does not boot in QEMU. I
tried checking out 9006a48618cc0cacd3f59ff053e6509a9af5cc18 to see if
I could reproduce that breakage there, but the build errors out at that
change (I do see notes of bisection breakage in some of the commits), so
I assume that is expected.

There is no output, even with earlycon, so it seems like something is
going wrong in early boot code. I am not very familiar with the SCS code
so I will see if I can debug this with gdb later (I'll try to see if it
is reproducible with GCC as well; as Nick mentions, there is support
being added to it and I don't mind building from source).
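
For the record, my plan is the usual QEMU gdbstub setup - roughly the
following, with the machine flags and kernel path as placeholders for
whatever the config at hand needs:

   $ qemu-system-aarch64 -M virt -cpu max -nographic \
       -kernel arch/arm64/boot/Image -append "earlycon" -s -S
   # -s listens for gdb on :1234, -S halts the CPUs until gdb connects
   $ gdb vmlinux -ex 'target remote :1234'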

> We could move over the init_shadow_call_stack[] array there and make it 
> static to begin with? I don't think anything truly relies on it being a 
> global symbol.

That is what I thought as well... I'll ping Sami to see if there is any
reason not to do that.

> > 3. Nested function in arch/x86/kernel/asm-offsets.c
> 
> > diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> > index ff3f8ed5d0a2..a6d56f4697cd 100644
> > --- a/arch/x86/kernel/asm-offsets.c
> > +++ b/arch/x86/kernel/asm-offsets.c
> > @@ -35,10 +35,10 @@
> >  # include "asm-offsets_64.c"
> >  #endif
> > 
> > -static void __used common(void)
> > -{
> >  #include "../../../kernel/sched/per_task_area_struct_defs.h"
> > 
> > +static void __used common(void)
> > +{
> >         BLANK();
> >         DEFINE(TASK_threadsp, offsetof(struct task_struct, per_task_area) +
> >                               offsetof(struct task_struct_per_task, thread) +
> 
> Ha, that code is bogus, it's a merge bug of mine. Super interesting that 
> GCC still managed to include the header ...
> 
> I've applied your fix.
> 
> > 4. Build error in kernel/gcov/clang.c
> 
> > 8 errors generated.
> > 
> > I resolved this with:
> > 
> > diff --git a/kernel/gcov/clang.c b/kernel/gcov/clang.c
> > index 6ee385f6ad47..29f0899ba209 100644
> > --- a/kernel/gcov/clang.c
> > +++ b/kernel/gcov/clang.c
> > @@ -52,6 +52,7 @@
> >  #include <linux/ratelimit.h>
> >  #include <linux/slab.h>
> >  #include <linux/mm.h>
> > +#include <linux/string.h>
> >  #include "gcov.h"
> 
> Thank you - applied!
> 
> >  typedef void (*llvm_gcov_callback)(void);
> > 
> > 
> > 5. BPF errors
> > 
> > With Arch Linux's config (https://github.com/archlinux/svntogit-packages/raw/packages/linux/trunk/config),
> > I see the following errors:
> > 
> > kernel/bpf/preload/iterators/iterators.c:3:10: fatal error: 'linux/sched/signal.h' file not found
> > #include <linux/sched/signal.h>
> >          ^~~~~~~~~~~~~~~~~~~~~~
> > 1 error generated.
> > 
> > kernel/bpf/sysfs_btf.c:21:2: error: implicitly declaring library function 'memcpy' with type 'void *(void *, const void *, unsigned long)' [-Werror,-Wimplicit-function-declaration]
> >         memcpy(buf, __start_BTF + off, len);
> >         ^
> > kernel/bpf/sysfs_btf.c:21:2: note: include the header <string.h> or explicitly provide a declaration for 'memcpy'
> > 1 error generated.
> > 
> > The second error is obviously fixed by just including string.h as above.
> 
> Applied.
> 
> > I am not sure what is wrong with the first one; the includes all appear
> > to be userland headers, rather than kernel ones, so maybe an -I flag is
> > not present that should be? To work around it, I disabled
> > CONFIG_BPF_PRELOAD.
> 
> Yeah, this should be fixed by simply removing the two stray dependencies 
> that found their way into this user-space code:
> 
>  kernel/bpf/preload/iterators/iterators.bpf.c | 1 -
>  kernel/bpf/preload/iterators/iterators.c     | 1 -
>  2 files changed, 2 deletions(-)
> 
> diff --git a/kernel/bpf/preload/iterators/iterators.bpf.c b/kernel/bpf/preload/iterators/iterators.bpf.c
> index 41ae00edeecf..03af863314ea 100644
> --- a/kernel/bpf/preload/iterators/iterators.bpf.c
> +++ b/kernel/bpf/preload/iterators/iterators.bpf.c
> @@ -1,6 +1,5 @@
>  // SPDX-License-Identifier: GPL-2.0
>  /* Copyright (c) 2020 Facebook */
> -#include <linux/seq_file.h>
>  #include <linux/bpf.h>
>  #include <bpf/bpf_helpers.h>
>  #include <bpf/bpf_core_read.h>
> diff --git a/kernel/bpf/preload/iterators/iterators.c b/kernel/bpf/preload/iterators/iterators.c
> index d702cbf7ddaf..5d872a705470 100644
> --- a/kernel/bpf/preload/iterators/iterators.c
> +++ b/kernel/bpf/preload/iterators/iterators.c
> @@ -1,6 +1,5 @@
>  // SPDX-License-Identifier: GPL-2.0
>  /* Copyright (c) 2020 Facebook */
> -#include <linux/sched/signal.h>
>  #include <errno.h>
>  #include <stdio.h>
>  #include <stdlib.h>

Yes, that resolves the error for me.

> > 6. resolve_btfids warning
> > 
> > After working around the above errors, with either GCC or clang, I see
> > the following warnings with Arch Linux's configuration:
> > 
> > WARN: multiple IDs found for 'task_struct': 103, 23549 - using 103
> > WARN: multiple IDs found for 'path': 1166, 23551 - using 1166
> > WARN: multiple IDs found for 'inode': 997, 23561 - using 997
> > WARN: multiple IDs found for 'file': 714, 23566 - using 714
> > WARN: multiple IDs found for 'seq_file': 1120, 23673 - using 1120
> > 
> > Which appears to come from symbols_resolve() in
> > tools/bpf/resolve_btfids/main.c.
> 
> Hm, is this perhaps related to CONFIG_KALLSYMS_FAST=y? If so, turning 
> it off might help.
> 
> I don't really know this area of BPF all that much, maybe someone else can 
> see what the problem is? The error message is not self-explanatory.

It does not seem related, as I disabled that configuration and still see
the warnings.

I am equally ignorant about BPF, so enlisting the BPF folks' help would be good.
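
If anyone from the BPF side wants a head start: I assume the duplicated
types can be eyeballed in the generated BTF with something along these
lines (bpftool invocation from memory):

   $ bpftool btf dump file vmlinux | grep "STRUCT 'task_struct'"
   # two hits here would be consistent with task_struct being encoded twice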

> > 
> > ########################################################################
> > 
> > I am very excited to see where this goes; it is a herculean effort, but I
> > think it will be worth it in the long run. Let me know if there is any
> > more information or input that I can provide, cheers!
> 
> Your testing & patch-sending efforts are much appreciated!! You'd help me 
> most by continuing on the same path with new fast-headers releases as well, 
> whenever you find the time. :-)
> 
> BTW., you can always pick up my latest Work-In-Progress branch from:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/mingo/tip.git sched/headers
> 
> The 'master' branch will carry the release.
> 
> The sched/headers branch is already rebased to -rc8 and has some other 
> changes as well. It should normally work, with less testing than the main 
> releasees, but will at times have fixes at the tail waiting to be 
> backmerged in a bisect-friendly way.

Sure thing, I will continue to follow this and test it as much as I can
to make sure everything continues to work well!

Cheers,
Nathan
