lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250416085446.480069-1-glider@google.com>
Date: Wed, 16 Apr 2025 10:54:38 +0200
From: Alexander Potapenko <glider@...gle.com>
To: glider@...gle.com
Cc: quic_jiangenj@...cinc.com, linux-kernel@...r.kernel.org, 
	kasan-dev@...glegroups.com, Aleksandr Nogikh <nogikh@...gle.com>, 
	Andrey Konovalov <andreyknvl@...il.com>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, Dmitry Vyukov <dvyukov@...gle.com>, 
	Ingo Molnar <mingo@...hat.com>, Josh Poimboeuf <jpoimboe@...nel.org>, Marco Elver <elver@...gle.com>, 
	Peter Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>
Subject: [PATCH 0/7] RFC: coverage deduplication for KCOV

As mentioned by Joey Jiao in [1], the current kcov implementation may
suffer from certain syscalls overflowing the userspace coverage buffer.

According to our measurements, among 24 syzkaller instances running
upstream Linux, 5 had a coverage overflow in at least 50% of executed
programs. The median percentage of programs with overflows across those 24
instances was 8.8%.

One way to mitigate this problem is to increase the size of the kcov buffer
in the userspace application using kcov. But right now syzkaller already
uses 4Mb per each of up to 32 threads to store the coverage, and increasing
it further would result in reduction in the number of executors on a single
machine.  Replaying the same program with an increased buffer size in the
case of overflow would also lead to fewer executions being possible.

When executing a single system call, excessive coverage usually stems from
loops, which write the same PCs into the output buffer repeatedly. Although
collecting precise traces may give us some insights into e.g. the number of
loop iterations and the branches being taken, the fuzzing engine does not
take advantage of these signals, and recording only unique PCs should be
just as practical.

In [1] Joey Jiao suggested using a hash table to deduplicate the coverage
signal on the kernel side. While being universally applicable to all types
of data collected by kcov, this approach adds another layer of complexity,
requiring dynamically growing the map. Another problem is potential hash
collisions, which can as well lead to lost coverage. Hash maps are also
unavoidably sparse, which potentially requires more memory.

The approach proposed in this patch series is to assign a unique (and
almost) sequential ID to each of the coverage callbacks in the kernel. Then
we carve out a fixed-sized bitmap from the userspace trace buffer, and on
every callback invocation we:

- obtain the callback_ID;
- if bitmap[callback_ID] is set, append the PC to the trace buffer;
- set bitmap[callback_ID] to true.

LLVM's -fsanitize-coverage=trace-pc-guard replaces every coverage callback
in the kernel with a call to
__sanitizer_cov_trace_pc_guard(&guard_variable) , where guard_variable is a
4-byte global that is unique for the callsite.

This allows us to lazily allocate sequential numbers just for the callbacks
that have actually been executed, using a lock-free algorithm.

This patch series implements a new config, CONFIG_KCOV_ENABLE_GUARDS, which
utilizes the mentioned LLVM flag for coverage instrumentation. In addition
to the existing coverage collection modes, it introduces
ioctl(KCOV_UNIQUE_ENABLE), which splits the existing kcov buffer into the
bitmap and the trace part for a particular fuzzing session, and collects
only unique coverage in the trace buffer.

To reset the coverage between runs, it is now necessary to set trace[0] to
0 AND clear the entire bitmap. This is still considered feasible, based on
the experimental results below.

The current design does not address the deduplication of KCOV_TRACE_CMP
comparisons; however, the number of kcov overflows during the hints
collection process is insignificant compared to the overflows of
KCOV_TRACE_PC.

In addition to the mentioned changes, this patch adds support for
R_X86_64_REX_GOTPCRELX to objtool and arch/x86/kernel/module.c.  It turned
out that Clang leaves such relocations in the linked modules for the
__start___sancov_guards and __stop___sancov_guards symbols. Because
resolving them does not require a .got section, it can be done at module
load time.

Experimental results.

We've conducted an experiment running syz-testbed [3] on 10 syzkaller
instances for 24 hours.  Out of those 10 instances, 5 were enabling the
kcov_deduplicate flag from [4], which makes use of the KCOV_UNIQUE_ENABLE
ioctl, reserving 4096 words (262144 bits) for the bitmap and leaving 520192
words for the trace collection.

Below are the average stats from the runs.

kcov_deduplicate=false:
  corpus: 52176
  coverage: 302658
  cover overflows: 225288
  comps overflows: 491
  exec total: 1417829
  max signal: 318894

kcov_deduplicate=true:
  corpus: 52581
  coverage: 304344
  cover overflows: 986
  comps overflows: 626
  exec total: 1484841
  max signal: 322455

[1] https://lore.kernel.org/linux-arm-kernel/20250114-kcov-v1-5-004294b931a2@quicinc.com/T/
[2] https://clang.llvm.org/docs/SanitizerCoverage.html
[3] https://github.com/google/syzkaller/tree/master/tools/syz-testbed
[4] https://github.com/ramosian-glider/linux/pull/7 


Alexander Potapenko (7):
  kcov: apply clang-format to kcov code
  kcov: factor out struct kcov_state
  kcov: x86: introduce CONFIG_KCOV_ENABLE_GUARDS
  kcov: add `trace` and `trace_size` to `struct kcov_state`
  kcov: add ioctl(KCOV_UNIQUE_ENABLE)
  x86: objtool: add support for R_X86_64_REX_GOTPCRELX
  mm/kasan: define __asan_before_dynamic_init, __asan_after_dynamic_init

 Documentation/dev-tools/kcov.rst  |  43 +++
 MAINTAINERS                       |   1 +
 arch/x86/include/asm/elf.h        |   1 +
 arch/x86/kernel/module.c          |   8 +
 arch/x86/kernel/vmlinux.lds.S     |   1 +
 arch/x86/um/asm/elf.h             |   1 +
 include/asm-generic/vmlinux.lds.h |  14 +-
 include/linux/kcov-state.h        |  46 +++
 include/linux/kcov.h              |  60 ++--
 include/linux/sched.h             |  16 +-
 include/uapi/linux/kcov.h         |   1 +
 kernel/kcov.c                     | 453 +++++++++++++++++++-----------
 lib/Kconfig.debug                 |  16 ++
 mm/kasan/generic.c                |  18 ++
 mm/kasan/kasan.h                  |   2 +
 scripts/Makefile.kcov             |   4 +
 scripts/module.lds.S              |  23 ++
 tools/objtool/arch/x86/decode.c   |   1 +
 tools/objtool/check.c             |   1 +
 19 files changed, 508 insertions(+), 202 deletions(-)
 create mode 100644 include/linux/kcov-state.h

-- 
2.49.0.604.gff1f9ca942-goog


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ