[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aAB9sUllq/xR/Maf@hu-jiangenj-sha.qualcomm.com>
Date: Thu, 17 Apr 2025 12:04:01 +0800
From: Joey Jiao <quic_jiangenj@...cinc.com>
To: Alexander Potapenko <glider@...gle.com>
CC: <linux-kernel@...r.kernel.org>, <kasan-dev@...glegroups.com>,
Aleksandr
Nogikh <nogikh@...gle.com>,
Andrey Konovalov <andreyknvl@...il.com>,
Borislav
Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Dmitry
Vyukov <dvyukov@...gle.com>, Ingo Molnar <mingo@...hat.com>,
Josh Poimboeuf
<jpoimboe@...nel.org>, Marco Elver <elver@...gle.com>,
Peter Zijlstra
<peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 0/7] RFC: coverage deduplication for KCOV
On Wed, Apr 16, 2025 at 10:54:38AM +0200, Alexander Potapenko wrote:
> As mentioned by Joey Jiao in [1], the current kcov implementation may
> suffer from certain syscalls overflowing the userspace coverage buffer.
>
> According to our measurements, among 24 syzkaller instances running
> upstream Linux, 5 had a coverage overflow in at least 50% of executed
> programs. The median percentage of programs with overflows across those 24
> instances was 8.8%.
>
> One way to mitigate this problem is to increase the size of the kcov buffer
> in the userspace application using kcov. But right now syzkaller already
> uses 4Mb per each of up to 32 threads to store the coverage, and increasing
> it further would result in reduction in the number of executors on a single
> machine. Replaying the same program with an increased buffer size in the
> case of overflow would also lead to fewer executions being possible.
>
> When executing a single system call, excessive coverage usually stems from
> loops, which write the same PCs into the output buffer repeatedly. Although
> collecting precise traces may give us some insights into e.g. the number of
> loop iterations and the branches being taken, the fuzzing engine does not
> take advantage of these signals, and recording only unique PCs should be
> just as practical.
>
> In [1] Joey Jiao suggested using a hash table to deduplicate the coverage
> signal on the kernel side. While being universally applicable to all types
> of data collected by kcov, this approach adds another layer of complexity,
> requiring dynamically growing the map. Another problem is potential hash
> collisions, which can as well lead to lost coverage. Hash maps are also
> unavoidably sparse, which potentially requires more memory.
>
> The approach proposed in this patch series is to assign a unique (and
> almost) sequential ID to each of the coverage callbacks in the kernel. Then
> we carve out a fixed-sized bitmap from the userspace trace buffer, and on
> every callback invocation we:
>
> - obtain the callback_ID;
> - if bitmap[callback_ID] is set, append the PC to the trace buffer;
> - set bitmap[callback_ID] to true.
>
> LLVM's -fsanitize-coverage=trace-pc-guard replaces every coverage callback
> in the kernel with a call to
> __sanitizer_cov_trace_pc_guard(&guard_variable) , where guard_variable is a
> 4-byte global that is unique for the callsite.
>
> This allows us to lazily allocate sequential numbers just for the callbacks
> that have actually been executed, using a lock-free algorithm.
>
> This patch series implements a new config, CONFIG_KCOV_ENABLE_GUARDS, which
> utilizes the mentioned LLVM flag for coverage instrumentation. In addition
> to the existing coverage collection modes, it introduces
> ioctl(KCOV_UNIQUE_ENABLE), which splits the existing kcov buffer into the
> bitmap and the trace part for a particular fuzzing session, and collects
> only unique coverage in the trace buffer.
>
> To reset the coverage between runs, it is now necessary to set trace[0] to
> 0 AND clear the entire bitmap. This is still considered feasible, based on
> the experimental results below.
>
> The current design does not address the deduplication of KCOV_TRACE_CMP
> comparisons; however, the number of kcov overflows during the hints
> collection process is insignificant compared to the overflows of
> KCOV_TRACE_PC.
>
> In addition to the mentioned changes, this patch adds support for
> R_X86_64_REX_GOTPCRELX to objtool and arch/x86/kernel/module.c. It turned
> out that Clang leaves such relocations in the linked modules for the
> __start___sancov_guards and __stop___sancov_guards symbols. Because
> resolving them does not require a .got section, it can be done at module
> load time.
>
> Experimental results.
>
> We've conducted an experiment running syz-testbed [3] on 10 syzkaller
> instances for 24 hours. Out of those 10 instances, 5 were enabling the
> kcov_deduplicate flag from [4], which makes use of the KCOV_UNIQUE_ENABLE
> ioctl, reserving 4096 words (262144 bits) for the bitmap and leaving 520192
> words for the trace collection.
>
> Below are the average stats from the runs.
Is there test without trace collection? Is bitmap only enough?
>
> kcov_deduplicate=false:
> corpus: 52176
> coverage: 302658
> cover overflows: 225288
> comps overflows: 491
> exec total: 1417829
> max signal: 318894
>
> kcov_deduplicate=true:
> corpus: 52581
> coverage: 304344
> cover overflows: 986
> comps overflows: 626
> exec total: 1484841
> max signal: 322455
>
> [1] https://lore.kernel.org/linux-arm-kernel/20250114-kcov-v1-5-004294b931a2@quicinc.com/T/
> [2] https://clang.llvm.org/docs/SanitizerCoverage.html
> [3] https://github.com/google/syzkaller/tree/master/tools/syz-testbed
> [4] https://github.com/ramosian-glider/linux/pull/7
>
>
> Alexander Potapenko (7):
> kcov: apply clang-format to kcov code
> kcov: factor out struct kcov_state
> kcov: x86: introduce CONFIG_KCOV_ENABLE_GUARDS
> kcov: add `trace` and `trace_size` to `struct kcov_state`
> kcov: add ioctl(KCOV_UNIQUE_ENABLE)
> x86: objtool: add support for R_X86_64_REX_GOTPCRELX
> mm/kasan: define __asan_before_dynamic_init, __asan_after_dynamic_init
>
> Documentation/dev-tools/kcov.rst | 43 +++
> MAINTAINERS | 1 +
> arch/x86/include/asm/elf.h | 1 +
> arch/x86/kernel/module.c | 8 +
> arch/x86/kernel/vmlinux.lds.S | 1 +
> arch/x86/um/asm/elf.h | 1 +
> include/asm-generic/vmlinux.lds.h | 14 +-
> include/linux/kcov-state.h | 46 +++
> include/linux/kcov.h | 60 ++--
> include/linux/sched.h | 16 +-
> include/uapi/linux/kcov.h | 1 +
> kernel/kcov.c | 453 +++++++++++++++++++-----------
> lib/Kconfig.debug | 16 ++
> mm/kasan/generic.c | 18 ++
> mm/kasan/kasan.h | 2 +
> scripts/Makefile.kcov | 4 +
> scripts/module.lds.S | 23 ++
> tools/objtool/arch/x86/decode.c | 1 +
> tools/objtool/check.c | 1 +
> 19 files changed, 508 insertions(+), 202 deletions(-)
> create mode 100644 include/linux/kcov-state.h
>
> --
> 2.49.0.604.gff1f9ca942-goog
>
Powered by blists - more mailing lists