Message-ID: <20241003232938.GA1663252@thelio-3990X>
Date: Thu, 3 Oct 2024 16:29:38 -0700
From: Nathan Chancellor <nathan@...nel.org>
To: Wentao Zhang <wentaoz5@...inois.edu>
Cc: Matt.Kelly2@...ing.com, akpm@...ux-foundation.org,
	andrew.j.oppelt@...ing.com, anton.ivanov@...bridgegreys.com,
	ardb@...nel.org, arnd@...db.de, bhelgaas@...gle.com, bp@...en8.de,
	chuck.wolber@...ing.com, dave.hansen@...ux.intel.com,
	dvyukov@...gle.com, hpa@...or.com, jinghao7@...inois.edu,
	johannes@...solutions.net, jpoimboe@...nel.org,
	justinstitt@...gle.com, kees@...nel.org, kent.overstreet@...ux.dev,
	linux-arch@...r.kernel.org, linux-efi@...r.kernel.org,
	linux-kbuild@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-trace-kernel@...r.kernel.org, linux-um@...ts.infradead.org,
	llvm@...ts.linux.dev, luto@...nel.org, marinov@...inois.edu,
	masahiroy@...nel.org, maskray@...gle.com,
	mathieu.desnoyers@...icios.com, matthew.l.weber3@...ing.com,
	mhiramat@...nel.org, mingo@...hat.com, morbo@...gle.com,
	ndesaulniers@...gle.com, oberpar@...ux.ibm.com, paulmck@...nel.org,
	peterz@...radead.org, richard@....at, rostedt@...dmis.org,
	samitolvanen@...gle.com, samuel.sarkisian@...ing.com,
	steven.h.vanderleest@...ing.com, tglx@...utronix.de,
	tingxur@...inois.edu, tyxu@...inois.edu, x86@...nel.org
Subject: Re: [PATCH v2 0/4] Enable measuring the kernel's Source-based Code
 Coverage and MC/DC with Clang

Hi Wentao,

On Wed, Oct 02, 2024 at 01:42:52AM -0500, Wentao Zhang wrote:
> Thanks for all the comments!

You're welcome, sorry it took me some time to give them initially.

> On 2024-10-01 23:53, Nathan Chancellor wrote:
> > I took this series for a spin on next-20241001 with LLVM 19.1.0 using a
> > distribution configuration tailored for a local development VM using
> > QEMU. You'll notice this when you rebase on 6.12-rc1, but there is a small
> > conflict in kernel/Makefile due to commit 0e8b67982b48 ("mm: move
> > kernel/numa.c to mm/").
> >
> > I initially did the build on one of my test machines which has 16
> > threads with 32GB of RAM and ld.lld got killed while linking vmlinux.o.
> > Is your comment in the MC/DC patch "more memory is consumed if larger
> > decisions are getting counted" relevant here or is that talking about
> > runtime memory on the target device? I assume the latter but I figured I
> 
> Yes the build process (linking particularly) is quite memory-intensive if
> the whole kernel is instrumented with source-based code coverage, with or
> without MC/DC. What you've observed is expected. (Although the quoted
> message was referring to runtime overhead.)

Okay, thanks for clarifying!

> On the last slide of [8] I had some earlier data regarding full-kernel
> build- and run-time overhead. In our GitHub Actions builds [9], I have
> been keeping track of "/usr/bin/time -v make ..." output and the results
> can be found in step => "4. Build the kernel" => "Print kernel build
> resource usage". You may want to check them.

Ah thanks for the suggestion of using time -v, I have included some of
my statistics below as well.

> I am not aware of neat ways of alleviating this overhead fundamentally so I
> would love any advice on it. And perhaps now the more recommended way of
> using the proposed feature is to instrument and measure the kernel on a
> per-component basis.

I think the overhead is probably fine. Maybe this sort of thing could go
in the Kconfig text of LLVM_COV_PROFILE_ALL, both the note on the
overhead itself and the alternative of instrumenting particular bits of
code, which would potentially help with people who want to use this for
writing tests and such.

> [8] https://lpc.events/event/18/contributions/1895/attachments/1643/3462/LPC'24%20Source%20based%20(short).pdf
> [9] https://github.com/xlab-uiuc/linux-mcdc/actions
> 
> > would make sure. If not, it might be worth a comment somewhere that this
> > can also require some heftier build resources possibly? If that is not
> 
> Sure.
> 
> > expected, I am happy to help look into why it is happening.
> >
> > I was able to successfully build that same configuration and setup with
> > my primary workstation, which is much beefier. Unfortunately, the
> > resulting kernel did not boot with my usual VM testing setup. I will see
> > if I can narrow down a particular configuration option that causes this
> > tomorrow because I did a test with defconfig +
> > CONFIG_LLVM_COV_PROFILE_ALL and it booted fine. Perhaps some other
> > option that is not compatible with this? I'll follow up with more
> > information as I have it.
> 
> Good to hear that you've run it and thanks for reporting the booting issue.
> You may send me the config if appropriate and I'll also take a look.

I seem to have narrowed it down to a few different configurations on top
of x86_64_defconfig but I will include the full bad configuration as an
attachment just in case anything else is relevant.

$ echo 'CONFIG_LLVM_COV_KERNEL=y
CONFIG_LLVM_COV_PROFILE_ALL=y' >kernel/configs/llvm_cov.config

$ echo CONFIG_FORTIFY_SOURCE=y >kernel/configs/fortify_source.config

$ echo CONFIG_AMD_MEM_ENCRYPT=y >arch/x86/configs/amd_mem_encrypt.config

$ /usr/bin/time -v make -skj"$(nproc)" ARCH=x86_64 LLVM=1 mrproper {def,amd_mem_encrypt.,fortify_source.,llvm_cov.}config bzImage
...
vmlinux.o: warning: objtool: __sev_es_nmi_complete+0x6e: call to kasan_check_write() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_syscall_64+0x141: call to lockdep_hardirqs_off() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_int80_emulation+0x138: call to lockdep_hardirqs_off() leaves .noinstr.text section
vmlinux.o: warning: objtool: handle_bug+0x5: call to kmsan_unpoison_entry_regs() leaves .noinstr.text section
vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare+0x105: call to lockdep_hardirqs_off() leaves .noinstr.text section
vmlinux.o: warning: objtool: syscall_exit_to_user_mode+0x73: call to user_enter_irqoff() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_enter_from_user_mode+0x105: call to lockdep_hardirqs_off() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_exit_to_user_mode+0x62: call to user_enter_irqoff() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_enter+0x45: call to lockdep_hardirqs_off() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_exit+0x4a: call to lockdep_hardirqs_on() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_nmi_enter+0x4: call to lockdep_off() leaves .noinstr.text section
vmlinux.o: warning: objtool: irqentry_nmi_exit+0x67: call to lockdep_on() leaves .noinstr.text section
vmlinux.o: warning: objtool: enter_s2idle_proper+0xb5: call to lockdep_hardirqs_off() leaves .noinstr.text section
vmlinux.o: warning: objtool: cpuidle_enter_state+0x113: call to lockdep_hardirqs_off() leaves .noinstr.text section
vmlinux.o: warning: objtool: default_idle_call+0xad: call to lockdep_hardirqs_on() leaves .noinstr.text section
vmlinux.o: warning: objtool: cpu_idle_poll+0x29: call to lockdep_hardirqs_on() leaves .noinstr.text section
vmlinux.o: warning: objtool: acpi_idle_enter_bm+0x118: call to lockdep_hardirqs_on() leaves .noinstr.text section
vmlinux.o: warning: objtool: acpi_idle_do_entry+0x4: call to perf_lopwr_cb() leaves .noinstr.text section
...
        User time (seconds): 670.86
        System time (seconds): 459.05
        Percent of CPU this job got: 169%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 11:06.15
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 38644844
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 18694
        Minor (reclaiming a frame) page faults: 23068856
        Voluntary context switches: 32215431
        Involuntary context switches: 46422
        Swaps: 0
        File system inputs: 0
        File system outputs: 40127696
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
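
Before booting a kernel built from several merged fragments, it can help to confirm that every requested option actually ended up in the final .config, since Kconfig dependencies can silently drop a symbol. The following is a minimal sketch of such a check; the function name check_config_opts is hypothetical and not part of the kernel tree or this series.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given CONFIG_ symbols are set
# to "y" in a kernel .config. Returns non-zero if any symbol is missing.
# Usage: check_config_opts <config-file> <symbol>...
# Example (run from the top of the build tree):
#   check_config_opts .config CONFIG_LLVM_COV_KERNEL \
#       CONFIG_LLVM_COV_PROFILE_ALL CONFIG_FORTIFY_SOURCE \
#       CONFIG_AMD_MEM_ENCRYPT
check_config_opts() {
    config_file=$1
    shift
    missing=0
    for sym in "$@"; do
        # -x matches the whole line, -F treats the pattern as a fixed string
        if grep -qxF "${sym}=y" "$config_file"; then
            echo "ok: ${sym}"
        else
            echo "MISSING: ${sym}"
            missing=1
        fi
    done
    return "$missing"
}
```

A symbol that was requested in a fragment but shows up as MISSING would point to an unmet dependency rather than a problem with the coverage instrumentation itself.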

$ curl -LSs https://github.com/ClangBuiltLinux/boot-utils/releases/download/20230707-182910/x86_64-rootfs.cpio.zst | zstd -d >rootfs.cpio

$ qemu-system-x86_64 \
    -display none \
    -nodefaults \
    -M q35 \
    -d unimp,guest_errors \
    -append 'console=ttyS0 earlycon=uart8250,io,0x3f8' \
    -kernel arch/x86/boot/bzImage \
    -initrd rootfs.cpio \
    -cpu host \
    -enable-kvm \
    -m 8G \
    -smp 8 \
    -serial mon:stdio
<hangs with no output>
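
A boot that hangs with no output, as above, has to be interrupted by hand unless the command is given a time limit. The following is a minimal sketch that turns a silent hang into a non-zero exit status, assuming GNU coreutils timeout is available; the wrapper name run_with_limit and the 60-second limit in the example are illustrative choices, not something taken from this thread.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with a time limit so that a silent
# hang is reported as a failure instead of blocking forever.
# Usage: run_with_limit <seconds> <command> [args...]
# Example:
#   run_with_limit 60 qemu-system-x86_64 -display none ... -serial mon:stdio
run_with_limit() {
    limit=$1
    shift
    status=0
    # --foreground lets the command keep using the terminal, which matters
    # for a serial console attached to stdio.
    timeout --foreground "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # GNU timeout exits with 124 when the time limit was reached.
        echo "timed out after ${limit}s (possible boot hang)" >&2
    fi
    return "$status"
}
```

This makes the good and bad configurations distinguishable by exit status, which is convenient when narrowing down options in a loop.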

Without llvm_cov.config, everything works fine:

$ /usr/bin/time -v make -skj"$(nproc)" ARCH=x86_64 LLVM=1 mrproper {def,amd_mem_encrypt.,fortify_source.}config bzImage
...
        User time (seconds): 3441.79
        System time (seconds): 558.98
        Percent of CPU this job got: 5796%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:09.02
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 898960
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 632
        Minor (reclaiming a frame) page faults: 66061122
        Voluntary context switches: 177952
        Involuntary context switches: 104860
        Swaps: 0
        File system inputs: 16
        File system outputs: 2632152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

$ qemu-system-x86_64 \
    -display none \
    -nodefaults \
    -M q35 \
    -d unimp,guest_errors \
    -append 'console=ttyS0 earlycon=uart8250,io,0x3f8' \
    -kernel arch/x86/boot/bzImage \
    -initrd rootfs.cpio \
    -cpu host \
    -enable-kvm \
    -m 8G \
    -smp 8 \
    -serial mon:stdio
[    0.000000] Linux version 6.12.0-rc1-next-20241001-00005-g9918a23d3014 (nathan@...lio-3990X) (ClangBuiltLinux clang version 19.1.1 (https://github.com/llvm/llvm-project.git d401987fe349a87c53fe25829215b080b70c0c1a), ClangBuiltLinux LLD 19.1.1 (https://github.com/llvm/llvm-project.git d401987fe349a87c53fe25829215b080b70c0c1a)) #1 SMP PREEMPT_DYNAMIC Thu Oct  3 16:21:14 MST 2024
...
<shuts down successfully>
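
The two "Maximum resident set size" values reported above (38644844 and 898960 kbytes) are easier to compare in GiB. The following is a minimal sketch that extracts the field from "/usr/bin/time -v" output and converts it; the function name max_rss_gib is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `/usr/bin/time -v` output on stdin and print
# the "Maximum resident set size" in GiB (1 GiB = 1048576 kbytes).
max_rss_gib() {
    awk -F': ' '/Maximum resident set size/ { printf "%.1f\n", $2 / 1048576 }'
}

# Example with the value from the instrumented build:
#   printf 'Maximum resident set size (kbytes): 38644844\n' | max_rss_gib
#   prints 36.9
```

With the numbers above this gives roughly 36.9 GiB for the instrumented build against 0.9 GiB without llvm_cov.config, which would be consistent with ld.lld being killed on the machine with 32GB of RAM.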

> > On the integration front, I think the -mm tree, run by Andrew Morton,
> > would probably be the best place to land this with Acks from the -tip
> > folks for the x86 bits? Once the issue above has been understood, I
> > think you can send v3 with any of the comments I made addressed and a
> > potential fix for the above issue if necessary directly to him, instead
> > of just on cc, so that it gets his attention. Other maintainers are free
> > to argue that it should go through their trees instead but I think it
> > would be good to decide on that sooner rather than later so this
> > patchset is not stuck in limbo.
> 
> Yeah -mm tree sounds good to me. Let me work on v3 while we address the
> booting issue and wait for others' opinions if any.

Another thing I noticed with this series is there are no entries added to
MAINTAINERS. Who is going to be responsible for maintaining this code?

Cheers,
Nathan

View attachment "config-bad" of type "text/plain" (173964 bytes)
