Date:	Tue, 12 Jan 2016 11:27:50 -0800
From:	Kees Cook <keescook@...gle.com>
To:	Dmitry Vyukov <dvyukov@...gle.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	David Drysdale <drysdale@...gle.com>,
	Quentin Casasnovas <quentin.casasnovas@...cle.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	Vegard Nossum <vegard.nossum@...cle.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Eric Dumazet <edumazet@...gle.com>,
	Tavis Ormandy <taviso@...gle.com>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	syzkaller <syzkaller@...glegroups.com>,
	Kostya Serebryany <kcc@...gle.com>,
	Alexander Potapenko <glider@...gle.com>,
	Andrey Ryabinin <ryabinin.a.a@...il.com>
Subject: Re: [PATCH] kernel: add kcov code coverage

On Tue, Jan 12, 2016 at 11:19 AM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
> On Tue, Jan 12, 2016 at 6:31 PM, Kees Cook <keescook@...gle.com> wrote:
>> On Tue, Jan 12, 2016 at 8:15 AM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
>>> kcov provides code coverage collection for coverage-guided fuzzing
>>> (randomized testing). Coverage-guided fuzzing is a testing technique
>>> that uses coverage feedback to determine new interesting inputs to a
>>> system. A notable user-space example is AFL
>>> (http://lcamtuf.coredump.cx/afl/). However, this technique is not
>>> widely used for kernel testing due to missing compiler and kernel
>>> support.
>>>
>>> kcov does not aim to collect as much coverage as possible. It aims
>>> to collect more or less stable coverage that is a function of syscall
>>> inputs. To achieve this goal it does not collect coverage in
>>> soft/hard interrupts, and instrumentation of some inherently
>>> non-deterministic or non-interesting parts of the kernel is disabled
>>> (e.g. scheduler, locking).
>>>
>>> Currently there is a single coverage collection mode (tracing),
>>> but the API anticipates additional collection modes.
>>> Initially I also implemented a second mode which exposes
>>> coverage in a fixed-size hash table of counters (what Quentin
>>> used in his original patch). I've dropped the second mode for
>>> simplicity.
>>>
>>> This patch adds the necessary support on the kernel side.
>>> The complementary compiler support was added in gcc revision 231296.
>>>
>>> We've used this support to build the syzkaller system call fuzzer,
>>> which has found 90 kernel bugs in just 2 months:
>>> https://github.com/google/syzkaller/wiki/Found-Bugs
>>> We've also found 30+ bugs in our internal systems with syzkaller.
>>> Another (yet unexplored) direction where kcov coverage would greatly
>>> help is more traditional "blob mutation". For example, mounting
>>> a random blob as a filesystem, or receiving a random blob over wire.
>>>
>>> Why not gcov? A typical fuzzing loop looks as follows: (1) reset
>>> coverage, (2) execute a bit of code, (3) collect coverage, repeat.
>>> Typical coverage can be just a dozen basic blocks (e.g. for an
>>> invalid input). In such a context gcov becomes prohibitively expensive,
>>> as the reset/collect steps depend on the total number of basic
>>> blocks/edges in the program (about 2M for the kernel). The cost of
>>> kcov depends only on the number of executed basic blocks/edges. On top
>>> of that, the kernel requires per-thread coverage because there are
>>> always background threads and unrelated processes that also produce
>>> coverage. With inlined gcov instrumentation, per-thread coverage is not
>>> possible.
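
To make the cost argument above concrete, here is a minimal user-space sketch, not kernel code. The names counter_cov and trace_cov are hypothetical: the first models a gcov-style array with one counter per basic block, the second models the kcov-style trace buffer from this patch (word 0 holds the number of recorded PCs). Each reset function returns the number of words it touches, which is the quantity the paragraph compares.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical gcov-style state: one counter per basic block in the program. */
struct counter_cov {
	uint64_t *counters;
	size_t nblocks;
};

/* Reset clears every counter, so the work is O(total blocks). */
static size_t counter_cov_reset(struct counter_cov *c)
{
	memset(c->counters, 0, c->nblocks * sizeof(c->counters[0]));
	return c->nblocks;
}

/* Hypothetical kcov-style state: area[0] is the number of PCs that follow. */
struct trace_cov {
	uint32_t *area;
	size_t size;	/* total words in area, including the counter word */
};

/* Reset is a single store, independent of program size. */
static size_t trace_cov_reset(struct trace_cov *t)
{
	assert(t->size > 0);
	t->area[0] = 0;
	return 1;
}

/* Collect copies only the PCs that were actually recorded. */
static size_t trace_cov_collect(const struct trace_cov *t, uint32_t *out,
				size_t max)
{
	size_t n, i;

	assert(t->size > 0);
	n = t->area[0];
	if (n > t->size - 1)
		n = t->size - 1;
	if (n > max)
		n = max;
	for (i = 0; i < n; i++)
		out[i] = t->area[i + 1];
	return n;
}
```

With about 2M blocks, counter_cov_reset() touches about 2M words per iteration, while trace_cov_reset() touches one and trace_cov_collect() touches only as many as were executed.
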
>>>
>>> Based on a patch by Quentin Casasnovas.
>>> Signed-off-by: Dmitry Vyukov <dvyukov@...gle.com>
>>
>> Reviewed-by: Kees Cook <keescook@...omium.org>
>>
>>> ---
>>> Anticipating reasonable questions regarding usage of this feature.
>>> Quentin Casasnovas and Vegard Nossum also plan to use kcov for
>>> coverage-guided fuzzing. Currently they use a custom kernel patch
>>> for their fuzzer and have found several dozen bugs.
>>> There is also interest from Intel 0-DAY kernel test infrastructure.
>>>
>>> Based on commit 03891f9c853d5c4473224478a1e03ea00d70ff8d.
>>> ---
>>>  Documentation/kcov.txt            | 111 +++++++++++++++
>>>  Makefile                          |  10 +-
>>>  arch/x86/Kconfig                  |   1 +
>>>  arch/x86/boot/Makefile            |   6 +
>>>  arch/x86/boot/compressed/Makefile |   2 +
>>>  arch/x86/entry/vdso/Makefile      |   2 +
>>>  arch/x86/kernel/Makefile          |   5 +
>>>  arch/x86/kernel/apic/Makefile     |   4 +
>>>  arch/x86/kernel/cpu/Makefile      |   4 +
>>>  arch/x86/lib/Makefile             |   3 +
>>>  arch/x86/mm/Makefile              |   3 +
>>>  arch/x86/realmode/rm/Makefile     |   2 +
>>>  include/linux/kcov.h              |  19 +++
>>>  include/linux/sched.h             |  10 ++
>>>  include/uapi/linux/kcov.h         |  10 ++
>>>  kernel/Makefile                   |   9 ++
>>>  kernel/exit.c                     |   2 +
>>>  kernel/fork.c                     |   3 +
>>>  kernel/kcov/Makefile              |   5 +
>>>  kernel/kcov/kcov.c                | 287 ++++++++++++++++++++++++++++++++++++++
>>>  kernel/locking/Makefile           |   3 +
>>>  kernel/rcu/Makefile               |   4 +
>>>  kernel/sched/Makefile             |   4 +
>>>  lib/Kconfig.debug                 |  27 ++++
>>>  lib/Makefile                      |   9 ++
>>>  mm/Makefile                       |  15 ++
>>>  mm/kasan/Makefile                 |   1 +
>>>  scripts/Makefile.lib              |   6 +
>>>  28 files changed, 566 insertions(+), 1 deletion(-)
>>>  create mode 100644 Documentation/kcov.txt
>>>  create mode 100644 include/linux/kcov.h
>>>  create mode 100644 include/uapi/linux/kcov.h
>>>  create mode 100644 kernel/kcov/Makefile
>>>  create mode 100644 kernel/kcov/kcov.c
>>>
>>> diff --git a/Documentation/kcov.txt b/Documentation/kcov.txt
>>> new file mode 100644
>>> index 0000000..1fa6a3d
>>> --- /dev/null
>>> +++ b/Documentation/kcov.txt
>>> @@ -0,0 +1,111 @@
>>> +kcov: code coverage for fuzzing
>>> +===============================
>>> +
>>> +kcov exposes kernel code coverage information in a form suitable for coverage-
>>> +guided fuzzing (randomized testing). Coverage data of a running kernel is
>>> +exported via the "kcov" debugfs file. Coverage collection is enabled on a task
>>> +basis, and thus it can capture precise coverage of a single system call.
>>> +
>>> +Note that kcov does not aim to collect as much coverage as possible. It aims
>>> +to collect more or less stable coverage that is a function of syscall inputs.
>>> +To achieve this goal it does not collect coverage in soft/hard interrupts
>>> +and instrumentation of some inherently non-deterministic parts of kernel is
>>> +disabled (e.g. scheduler, locking).
>>> +
>>> +Usage:
>>> +======
>>> +
>>> +Configure kernel with:
>>> +
>>> +        CONFIG_KCOV=y
>>> +        CONFIG_DEBUG_FS=y
>>> +
>>> +CONFIG_KCOV requires gcc built on revision 231296 or later.
>>> +Profiling data will only become accessible once debugfs has been mounted:
>>> +
>>> +        mount -t debugfs none /sys/kernel/debug
>>> +
>>> +The following program demonstrates kcov usage from within a test program:
>>> +
>>> +#include <stdio.h>
>>> +#include <stddef.h>
>>> +#include <stdint.h>
>>> +#include <sys/types.h>
>>> +#include <sys/stat.h>
>>> +#include <sys/ioctl.h>
>>> +#include <sys/mman.h>
>>> +#include <fcntl.h>
>>> +
>>> +#define KCOV_INIT_TRACE                        _IOR('c', 1, unsigned long)
>>> +#define KCOV_ENABLE                    _IO('c', 100)
>>> +#define KCOV_DISABLE                   _IO('c', 101)
>>> +#define COVER_SIZE                     (64<<10)
>>> +
>>> +int main(int argc, char **argv)
>>> +{
>>> +       int fd;
>>> +       uint32_t *cover, n, i;
>>> +
>>> +       /* A single file descriptor allows coverage collection on a single
>>> +        * thread.
>>> +        */
>>> +       fd = open("/sys/kernel/debug/kcov", O_RDWR);
>>> +       if (fd == -1)
>>> +               perror("open");
>>> +       /* Setup trace mode and trace size. */
>>> +       if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
>>> +               perror("ioctl");
>>> +       /* Mmap buffer shared between kernel- and user-space. */
>>> +       cover = (uint32_t*)mmap(NULL, COVER_SIZE * sizeof(uint32_t),
>>> +                               PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
>>> +       if ((void*)cover == MAP_FAILED)
>>> +               perror("mmap");
>>> +       /* Enable coverage collection on the current thread. */
>>> +       if (ioctl(fd, KCOV_ENABLE, 0))
>>> +               perror("ioctl");
>>> +       /* Reset coverage from the tail of the ioctl() call. */
>>> +       __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);
>>> +       /* This is the target syscall. */
>>> +       read(-1, NULL, 0);
>>> +       /* Read number of PCs collected. */
>>> +       n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
>>> +       /* PCs are shortened to uint32_t, so we need to restore the upper part. */
>>> +       for (i = 0; i < n; i++)
>>> +               printf("0xffffffff%08lx\n", (unsigned long)cover[i + 1]);
>>> +       /* Disable coverage collection for the current thread. After this call
>>> +        * coverage can be enabled for a different thread.
>>> +        */
>>> +       if (ioctl(fd, KCOV_DISABLE, 0))
>>> +               perror("ioctl");
>>> +       /* Free resources. */
>>> +       if (munmap(cover, COVER_SIZE * sizeof(uint32_t)))
>>> +               perror("munmap");
>>> +       if (close(fd))
>>> +               perror("close");
>>> +       return 0;
>>> +}
>>> +
>>> +After piping through addr2line, the output of the program looks as follows:
>>> +
>>> +SyS_read
>>> +fs/read_write.c:562
>>> +__fdget_pos
>>> +fs/file.c:774
>>> +__fget_light
>>> +fs/file.c:746
>>> +__fget_light
>>> +fs/file.c:750
>>> +__fget_light
>>> +fs/file.c:760
>>> +__fdget_pos
>>> +fs/file.c:784
>>> +SyS_read
>>> +fs/read_write.c:562
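
The PC restoration step in the sample program can be factored out as below. This is only a sketch: kcov_restore_pc() and kcov_decode() are hypothetical helper names, and the assumption that the upper 32 bits of every kernel text address are all ones is taken from the documentation's own printf(), not verified for other configurations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rebuild a 64-bit PC from its truncated low half.
 * Assumption: the upper 32 bits are all ones, as in the sample program.
 */
static uint64_t kcov_restore_pc(uint32_t low)
{
	return 0xffffffff00000000ull | low;
}

/* Decode a trace buffer: cover[0] is the PC count, PCs follow.
 * Returns the number of PCs written to out.
 */
static size_t kcov_decode(const uint32_t *cover, size_t cover_words,
			  uint64_t *out, size_t out_len)
{
	size_t n, i;

	assert(cover_words > 0);
	n = cover[0];
	/* Never read past the buffer, even if the count looks corrupted. */
	if (n > cover_words - 1)
		n = cover_words - 1;
	if (n > out_len)
		n = out_len;
	for (i = 0; i < n; i++)
		out[i] = kcov_restore_pc(cover[i + 1]);
	return n;
}
```

Printing each decoded value in hex, one per line, gives input that can be fed to addr2line with the matching vmlinux to get function and file:line pairs like the ones quoted above.
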
>>> +
>>> +If a program needs to collect coverage from several threads (independently),
>>> +it needs to open /sys/kernel/debug/kcov in each thread separately.
>>> +
>>> +The interface is fine-grained to allow efficient forking of test processes.
>>> +That is, a parent process opens /sys/kernel/debug/kcov, enables trace mode,
>>> +mmaps coverage buffer and then forks child processes in a loop. Child processes
>>> +only need to enable coverage (disable happens automatically on thread end).
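
The fork pattern described above can be modelled without a kcov-enabled kernel. The sketch below is a simulation under stated assumptions: an anonymous MAP_SHARED mapping stands in for the mmapped kcov buffer, and the child appends PCs by hand at the point where a real test process would call ioctl(fd, KCOV_ENABLE, 0) and run the syscall under test. The helper names alloc_shared_cover() and run_one_input() are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for mmap() on the kcov fd: a buffer shared across fork(). */
static uint32_t *alloc_shared_cover(size_t words)
{
	void *p = mmap(NULL, words * sizeof(uint32_t), PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : (uint32_t *)p;
}

/* Run one "input" in a child and return the number of PCs it recorded,
 * or -1 if fork()/waitpid() failed.
 */
static long run_one_input(uint32_t *cover, size_t words,
			  const uint32_t *pcs, size_t npcs)
{
	pid_t pid;
	int status;
	size_t i;

	assert(words > 0);
	/* Parent resets coverage before each child. */
	__atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: a real test would enable kcov and issue syscalls
		 * here; this simulation appends the given PCs instead.
		 */
		for (i = 0; i < npcs; i++) {
			uint32_t pos = cover[0] + 1;

			if (pos < words) {
				cover[pos] = pcs[i];
				cover[0] = pos;
			}
		}
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	/* Parent reads what the child left in the shared buffer. */
	return (long)__atomic_load_n(&cover[0], __ATOMIC_RELAXED);
}
```

The parent pays for open/init/mmap once and then calls run_one_input() in a loop; each child only has to enable coverage, which is the efficiency the text refers to.
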
>>> diff --git a/Makefile b/Makefile
>>> index 70dea02..9fe404a 100644
>>> --- a/Makefile
>>> +++ b/Makefile
>>> @@ -365,6 +365,7 @@ LDFLAGS_MODULE  =
>>>  CFLAGS_KERNEL  =
>>>  AFLAGS_KERNEL  =
>>>  CFLAGS_GCOV    = -fprofile-arcs -ftest-coverage
>>> +CFLAGS_KCOV    = -fsanitize-coverage=trace-pc
>>>
>>>
>>>  # Use USERINCLUDE when you must reference the UAPI directories only.
>>> @@ -411,7 +412,7 @@ export MAKE AWK GENKSYMS INSTALLKERNEL PERL PYTHON UTS_MACHINE
>>>  export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
>>>
>>>  export KBUILD_CPPFLAGS NOSTDINC_FLAGS LINUXINCLUDE OBJCOPYFLAGS LDFLAGS
>>> -export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_GCOV CFLAGS_KASAN
>>> +export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE CFLAGS_GCOV CFLAGS_KCOV CFLAGS_KASAN
>>>  export KBUILD_AFLAGS AFLAGS_KERNEL AFLAGS_MODULE
>>>  export KBUILD_AFLAGS_MODULE KBUILD_CFLAGS_MODULE KBUILD_LDFLAGS_MODULE
>>>  export KBUILD_AFLAGS_KERNEL KBUILD_CFLAGS_KERNEL
>>> @@ -667,6 +668,13 @@ endif
>>>  endif
>>>  KBUILD_CFLAGS += $(stackp-flag)
>>>
>>> +ifdef CONFIG_KCOV
>>> +  ifeq ($(call cc-option, $(CFLAGS_KCOV)),)
>>> +    $(warning Cannot use CONFIG_KCOV: \
>>> +             -fsanitize-coverage=trace-pc is not supported by compiler)
>>> +  endif
>>> +endif
>>> +
>>>  ifeq ($(cc-name),clang)
>>>  KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
>>>  KBUILD_CPPFLAGS += $(call cc-option,-Wno-unknown-warning-option,)
>>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>>> index 258965d..be39ab5 100644
>>> --- a/arch/x86/Kconfig
>>> +++ b/arch/x86/Kconfig
>>> @@ -27,6 +27,7 @@ config X86
>>>         select ARCH_HAS_ELF_RANDOMIZE
>>>         select ARCH_HAS_FAST_MULTIPLIER
>>>         select ARCH_HAS_GCOV_PROFILE_ALL
>>> +       select ARCH_HAS_KCOV                    if X86_64
>>>         select ARCH_HAS_PMEM_API                if X86_64
>>>         select ARCH_HAS_MMIO_FLUSH
>>>         select ARCH_HAS_SG_CHAIN
>>> diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
>>> index 2ee62db..b2eb295 100644
>>> --- a/arch/x86/boot/Makefile
>>> +++ b/arch/x86/boot/Makefile
>>> @@ -10,6 +10,12 @@
>>>  #
>>>
>>>  KASAN_SANITIZE := n
>>> +# Kernel does not boot with kcov instrumentation here.
>>> +# One of the problems observed was insertion of __sanitizer_cov_trace_pc()
>>> +# callback into middle of per-cpu data enabling code. Thus the callback observed
>>> +# inconsistent state and crashed. We are interested mostly in syscall coverage,
>>> +# so boot code is not interesting anyway.
>>> +KCOV_INSTRUMENT := n
>>>
>>>  # If you want to preset the SVGA mode, uncomment the next line and
>>>  # set SVGA_MODE to whatever number you want.
>>> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
>>> index 0a291cd..e625939 100644
>>> --- a/arch/x86/boot/compressed/Makefile
>>> +++ b/arch/x86/boot/compressed/Makefile
>>> @@ -17,6 +17,8 @@
>>>  #      compressed vmlinux.bin.all + u32 size of vmlinux.bin.all
>>>
>>>  KASAN_SANITIZE := n
>>> +# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in.
>>> +KCOV_INSTRUMENT := n
>>>
>>>  targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \
>>>         vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4
>>> diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
>>> index 265c0ed..1b663b8 100644
>>> --- a/arch/x86/entry/vdso/Makefile
>>> +++ b/arch/x86/entry/vdso/Makefile
>>> @@ -4,6 +4,8 @@
>>>
>>>  KBUILD_CFLAGS += $(DISABLE_LTO)
>>>  KASAN_SANITIZE := n
>>> +# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in.
>>> +KCOV_INSTRUMENT := n
>>>
>>>  VDSO64-$(CONFIG_X86_64)                := y
>>>  VDSOX32-$(CONFIG_X86_X32_ABI)  := y
>>> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
>>> index b1b78ff..4648960 100644
>>> --- a/arch/x86/kernel/Makefile
>>> +++ b/arch/x86/kernel/Makefile
>>> @@ -19,6 +19,11 @@ endif
>>>  KASAN_SANITIZE_head$(BITS).o := n
>>>  KASAN_SANITIZE_dumpstack.o := n
>>>  KASAN_SANITIZE_dumpstack_$(BITS).o := n
>>> +# If instrumentation of this dir is enabled, boot hangs during first second.
>>> +# Probably could be more selective here, but note that files related to irqs,
>>> +# boot, dumpstack/stacktrace, etc are either non-interesting or can lead to
>>> +# non-deterministic coverage.
>>> +KCOV_INSTRUMENT := n
>>>
>>>  CFLAGS_irq.o := -I$(src)/../include/asm/trace
>>>
>>> diff --git a/arch/x86/kernel/apic/Makefile b/arch/x86/kernel/apic/Makefile
>>> index 8bb12dd..8f2a3d7 100644
>>> --- a/arch/x86/kernel/apic/Makefile
>>> +++ b/arch/x86/kernel/apic/Makefile
>>> @@ -2,6 +2,10 @@
>>>  # Makefile for local APIC drivers and for the IO-APIC code
>>>  #
>>>
>>> +# Leads to non-deterministic coverage that is not a function of syscall inputs.
>>> +# In particular, smp_apic_timer_interrupt() is called in random places.
>>> +KCOV_INSTRUMENT := n
>>> +
>>>  obj-$(CONFIG_X86_LOCAL_APIC)   += apic.o apic_noop.o ipi.o vector.o
>>>  obj-y                          += hw_nmi.o
>>>
>>> diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
>>> index 5803130..c108683 100644
>>> --- a/arch/x86/kernel/cpu/Makefile
>>> +++ b/arch/x86/kernel/cpu/Makefile
>>> @@ -8,6 +8,10 @@ CFLAGS_REMOVE_common.o = -pg
>>>  CFLAGS_REMOVE_perf_event.o = -pg
>>>  endif
>>>
>>> +# If these files are instrumented, boot hangs during the first second.
>>> +KCOV_INSTRUMENT_common.o := n
>>> +KCOV_INSTRUMENT_perf_event.o := n
>>> +
>>>  # Make sure load_percpu_segment has no stackprotector
>>>  nostackp := $(call cc-option, -fno-stack-protector)
>>>  CFLAGS_common.o                := $(nostackp)
>>> diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
>>> index a501fa2..fefca94 100644
>>> --- a/arch/x86/lib/Makefile
>>> +++ b/arch/x86/lib/Makefile
>>> @@ -2,6 +2,9 @@
>>>  # Makefile for x86 specific library files.
>>>  #
>>>
>>> +# Produces uninteresting flaky coverage.
>>> +KCOV_INSTRUMENT_delay.o := n
>>> +
>>>  inat_tables_script = $(srctree)/arch/x86/tools/gen-insn-attr-x86.awk
>>>  inat_tables_maps = $(srctree)/arch/x86/lib/x86-opcode-map.txt
>>>  quiet_cmd_inat_tables = GEN     $@
>>> diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
>>> index f9d38a4..147def6 100644
>>> --- a/arch/x86/mm/Makefile
>>> +++ b/arch/x86/mm/Makefile
>>> @@ -1,3 +1,6 @@
>>> +# Kernel does not boot with instrumentation of tlb.c.
>>> +KCOV_INSTRUMENT_tlb.o := n
>>> +
>>>  obj-y  :=  init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \
>>>             pat.o pgtable.o physaddr.o gup.o setup_nx.o
>>>
>>> diff --git a/arch/x86/realmode/rm/Makefile b/arch/x86/realmode/rm/Makefile
>>> index 2730d77..2abf667 100644
>>> --- a/arch/x86/realmode/rm/Makefile
>>> +++ b/arch/x86/realmode/rm/Makefile
>>> @@ -7,6 +7,8 @@
>>>  #
>>>  #
>>>  KASAN_SANITIZE := n
>>> +# Prevents link failures: __sanitizer_cov_trace_pc() is not linked in.
>>> +KCOV_INSTRUMENT := n
>>>
>>>  always := realmode.bin realmode.relocs
>>>
>>> diff --git a/include/linux/kcov.h b/include/linux/kcov.h
>>> new file mode 100644
>>> index 0000000..72ff663
>>> --- /dev/null
>>> +++ b/include/linux/kcov.h
>>> @@ -0,0 +1,19 @@
>>> +#ifndef _LINUX_KCOV_H
>>> +#define _LINUX_KCOV_H
>>> +
>>> +#include <uapi/linux/kcov.h>
>>> +
>>> +struct task_struct;
>>> +
>>> +#ifdef CONFIG_KCOV
>>> +
>>> +void kcov_task_init(struct task_struct *t);
>>> +void kcov_task_exit(struct task_struct *t);
>>> +
>>> +#else
>>> +
>>> +static inline void kcov_task_init(struct task_struct *t) {}
>>> +static inline void kcov_task_exit(struct task_struct *t) {}
>>> +
>>> +#endif /* CONFIG_KCOV */
>>> +#endif /* _LINUX_KCOV_H */
>>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>>> index 4bae8ab..299d0180 100644
>>> --- a/include/linux/sched.h
>>> +++ b/include/linux/sched.h
>>> @@ -1806,6 +1806,16 @@ struct task_struct {
>>>         /* bitmask and counter of trace recursion */
>>>         unsigned long trace_recursion;
>>>  #endif /* CONFIG_TRACING */
>>> +#ifdef CONFIG_KCOV
>>> +       /* Coverage collection mode enabled for this task (0 if disabled). */
>>> +       int             kcov_mode;
>>> +       /* Size of the kcov_area. */
>>> +       unsigned long   kcov_size;
>>> +       /* Buffer for coverage collection. */
>>> +       void            *kcov_area;
>>> +       /* kcov descriptor wired with this task or NULL. */
>>> +       void            *kcov;
>>> +#endif
>>>  #ifdef CONFIG_MEMCG
>>>         struct mem_cgroup *memcg_in_oom;
>>>         gfp_t memcg_oom_gfp_mask;
>>> diff --git a/include/uapi/linux/kcov.h b/include/uapi/linux/kcov.h
>>> new file mode 100644
>>> index 0000000..574e22e
>>> --- /dev/null
>>> +++ b/include/uapi/linux/kcov.h
>>> @@ -0,0 +1,10 @@
>>> +#ifndef _LINUX_KCOV_IOCTLS_H
>>> +#define _LINUX_KCOV_IOCTLS_H
>>> +
>>> +#include <linux/types.h>
>>> +
>>> +#define KCOV_INIT_TRACE                        _IOR('c', 1, unsigned long)
>>> +#define KCOV_ENABLE                    _IO('c', 100)
>>> +#define KCOV_DISABLE                   _IO('c', 101)
>>> +
>>> +#endif /* _LINUX_KCOV_IOCTLS_H */
>>> diff --git a/kernel/Makefile b/kernel/Makefile
>>> index 53abf00..db7278b 100644
>>> --- a/kernel/Makefile
>>> +++ b/kernel/Makefile
>>> @@ -19,6 +19,14 @@ CFLAGS_REMOVE_cgroup-debug.o = $(CC_FLAGS_FTRACE)
>>>  CFLAGS_REMOVE_irq_work.o = $(CC_FLAGS_FTRACE)
>>>  endif
>>>
>>> +# Prevents flicker of uninteresting __do_softirq()/__local_bh_disable_ip()
>>> +# in coverage traces.
>>> +KCOV_INSTRUMENT_softirq.o := n
>>> +# These are called from save_stack_trace() on slub debug path,
>>> +# and produce insane amounts of uninteresting coverage.
>>> +KCOV_INSTRUMENT_module.o := n
>>> +KCOV_INSTRUMENT_extable.o := n
>>> +
>>>  # cond_syscall is currently not LTO compatible
>>>  CFLAGS_sys_ni.o = $(DISABLE_LTO)
>>>
>>> @@ -69,6 +77,7 @@ obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
>>>  obj-$(CONFIG_AUDIT_WATCH) += audit_watch.o audit_fsnotify.o
>>>  obj-$(CONFIG_AUDIT_TREE) += audit_tree.o
>>>  obj-$(CONFIG_GCOV_KERNEL) += gcov/
>>> +obj-$(CONFIG_KCOV) += kcov/
>>>  obj-$(CONFIG_KPROBES) += kprobes.o
>>>  obj-$(CONFIG_KGDB) += debug/
>>>  obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o
>>> diff --git a/kernel/exit.c b/kernel/exit.c
>>> index 07110c6..49a1339 100644
>>> --- a/kernel/exit.c
>>> +++ b/kernel/exit.c
>>> @@ -53,6 +53,7 @@
>>>  #include <linux/oom.h>
>>>  #include <linux/writeback.h>
>>>  #include <linux/shm.h>
>>> +#include <linux/kcov.h>
>>>
>>>  #include <asm/uaccess.h>
>>>  #include <asm/unistd.h>
>>> @@ -657,6 +658,7 @@ void do_exit(long code)
>>>         TASKS_RCU(int tasks_rcu_i);
>>>
>>>         profile_task_exit(tsk);
>>> +       kcov_task_exit(tsk);
>>>
>>>         WARN_ON(blk_needs_flush_plug(tsk));
>>>
>>> diff --git a/kernel/fork.c b/kernel/fork.c
>>> index 291b08c..6b28993 100644
>>> --- a/kernel/fork.c
>>> +++ b/kernel/fork.c
>>> @@ -75,6 +75,7 @@
>>>  #include <linux/aio.h>
>>>  #include <linux/compiler.h>
>>>  #include <linux/sysctl.h>
>>> +#include <linux/kcov.h>
>>>
>>>  #include <asm/pgtable.h>
>>>  #include <asm/pgalloc.h>
>>> @@ -384,6 +385,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)
>>>
>>>         account_kernel_stack(ti, 1);
>>>
>>> +       kcov_task_init(tsk);
>>> +
>>>         return tsk;
>>>
>>>  free_ti:
>>> diff --git a/kernel/kcov/Makefile b/kernel/kcov/Makefile
>>> new file mode 100644
>>> index 0000000..88892b7
>>> --- /dev/null
>>> +++ b/kernel/kcov/Makefile
>>> @@ -0,0 +1,5 @@
>>> +KCOV_INSTRUMENT := n
>>> +KASAN_SANITIZE := n
>>> +
>>> +obj-y := kcov.o
>>> +
>>> diff --git a/kernel/kcov/kcov.c b/kernel/kcov/kcov.c
>>> new file mode 100644
>>> index 0000000..05ec361
>>> --- /dev/null
>>> +++ b/kernel/kcov/kcov.c
>>> @@ -0,0 +1,287 @@
>>> +#define pr_fmt(fmt) "kcov: " fmt
>>> +
>>> +#include <linux/compiler.h>
>>> +#include <linux/types.h>
>>> +#include <linux/file.h>
>>> +#include <linux/fs.h>
>>> +#include <linux/mm.h>
>>> +#include <linux/printk.h>
>>> +#include <linux/slab.h>
>>> +#include <linux/spinlock.h>
>>> +#include <linux/vmalloc.h>
>>> +#include <linux/debugfs.h>
>>> +#include <linux/uaccess.h>
>>> +#include <linux/kcov.h>
>>> +
>>> +enum kcov_mode {
>>> +       /* Tracing coverage collection mode.
>>> +        * Covered PCs are collected in a per-task buffer.
>>> +        */
>>> +       kcov_mode_trace = 1,
>>> +};
>>> +
>>> +/* kcov descriptor (one per opened debugfs file). */
>>> +struct kcov {
>>> +       /* Reference counter. We keep one for:
>>> +        *  - opened file descriptor
>>> +        *  - mmapped region (including copies after fork)
>>> +        *  - task with enabled coverage (we can't unwire it from another task)
>>> +        */
>>> +       atomic_t                rc;
>>> +       /* The lock protects state transitions of the descriptor:
>>> +        *  - initial state after open()
>>> +        *  - then there must be a single ioctl(KCOV_INIT_TRACE) call
>>> +        *  - then, mmap() call (several calls are allowed but not useful)
>>> +        *  - then, repeated enable/disable for a task (only one task at a time
>>> +        *    is allowed)
>>> +        */
>>> +       spinlock_t              lock;
>>> +       enum kcov_mode          mode;
>>> +       unsigned long           size;
>>> +       void                    *area;
>>> +       struct task_struct      *t;
>>> +};
>>> +
>>> +/* Entry point from instrumented code.
>>> + * This is called once per basic-block/edge.
>>> + */
>>> +void __sanitizer_cov_trace_pc(void)
>>> +{
>>> +       struct task_struct *t;
>>> +       enum kcov_mode mode;
>>> +
>>> +       t = current;
>>> +       /* We are interested in code coverage as a function of syscall inputs,
>>> +        * so we ignore code executed in interrupts.
>>> +        */
>>> +       if (!t || in_interrupt())
>>> +               return;
>>> +       mode = READ_ONCE(t->kcov_mode);
>>> +       if (mode == kcov_mode_trace) {
>>> +               u32 *area;
>>> +               u32 pos;
>>> +
>>> +               /* There is some code that runs in interrupts but for which
>>> +                * in_interrupt() returns false (e.g. preempt_schedule_irq()).
>>> +                * READ_ONCE()/barrier() effectively provides load-acquire wrt
>>> +                * interrupts, there are paired barrier()/WRITE_ONCE() in
>>> +                * kcov_ioctl_locked().
>>> +                */
>>> +               barrier();
>>> +               area = t->kcov_area;
>>> +               /* The first u32 is number of subsequent PCs. */
>>> +               pos = READ_ONCE(area[0]) + 1;
>>> +               if (likely(pos < t->kcov_size)) {
>>> +                       area[pos] = (u32)_RET_IP_;
>>> +                       WRITE_ONCE(area[0], pos);
>>> +               }
>>> +       }
>>> +}
>>> +EXPORT_SYMBOL(__sanitizer_cov_trace_pc);
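
For readers following the append logic above, a user-space model may help. This is a sketch with a hypothetical name (trace_pc_model); it leaves out READ_ONCE()/WRITE_ONCE(), the interrupt check and the per-task mode, and keeps only the bounds check and the ordering of the two stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append one PC to a kcov-style buffer.
 * area[0] holds the number of recorded PCs; size is the total number of
 * 32-bit words in area. Returns 1 if the PC was stored, 0 if it was dropped
 * because the buffer is full.
 */
static int trace_pc_model(uint32_t *area, size_t size, uint64_t pc)
{
	uint32_t pos;

	assert(size > 0);
	pos = area[0] + 1;
	if (pos < size) {
		/* Store the PC first, then publish the new count, so a
		 * reader never sees a count that covers an unwritten slot.
		 */
		area[pos] = (uint32_t)pc;
		area[0] = pos;
		return 1;
	}
	return 0;
}
```

A buffer of size words therefore holds at most size - 1 PCs, and overflow drops new PCs silently rather than wrapping.
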
>>> +
>>> +static void kcov_put(struct kcov *kcov)
>>> +{
>>> +       if (atomic_dec_and_test(&kcov->rc)) {
>>> +               vfree(kcov->area);
>>> +               kfree(kcov);
>>> +       }
>>> +}
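
The reference counting described in the struct kcov comment can be traced with a small model. This is a sketch using C11 atomics rather than the kernel's atomic_t, and the names (obj, obj_get, obj_put) are hypothetical; the freed flag stands in for the vfree()/kfree() calls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct kcov, reduced to its reference count. */
struct obj {
	atomic_int rc;
	bool freed;
};

/* open(): the file descriptor holds the initial reference. */
static void obj_init(struct obj *o)
{
	atomic_init(&o->rc, 1);
	o->freed = false;
}

/* mmap() and KCOV_ENABLE each take one more reference. */
static void obj_get(struct obj *o)
{
	assert(!o->freed);
	atomic_fetch_add(&o->rc, 1);
}

/* Drop a reference; returns true if this was the last one. */
static bool obj_put(struct obj *o)
{
	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&o->rc, 1) == 1) {
		o->freed = true;
		return true;
	}
	return false;
}
```

In this model the object survives close() while a mapping or an enabled task still holds a reference, which matches the three holders listed in the comment.
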
>>> +
>>> +void kcov_task_init(struct task_struct *t)
>>> +{
>>> +       t->kcov_mode = 0;
>>> +       t->kcov_size = 0;
>>> +       t->kcov_area = NULL;
>>> +       t->kcov = NULL;
>>> +}
>>> +
>>> +void kcov_task_exit(struct task_struct *t)
>>> +{
>>> +       struct kcov *kcov;
>>> +
>>> +       kcov = t->kcov;
>>> +       if (kcov == NULL)
>>> +               return;
>>> +       spin_lock(&kcov->lock);
>>> +       BUG_ON(kcov->t != t);
>>> +       /* Just to not leave dangling references behind. */
>>> +       kcov_task_init(t);
>>> +       kcov->t = NULL;
>>> +       spin_unlock(&kcov->lock);
>>> +       kcov_put(kcov);
>>> +}
>>> +
>>> +static int kcov_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>>> +{
>>> +       struct kcov *kcov;
>>> +       unsigned long off;
>>> +       struct page *page;
>>> +
>>> +       /* Map the preallocated kcov->area. */
>>> +       kcov = vma->vm_file->private_data;
>>> +       off = vmf->pgoff << PAGE_SHIFT;
>>> +       if (off >= kcov->size * sizeof(u32))
>>> +               return VM_FAULT_SIGBUS;
>>> +
>>> +       page = vmalloc_to_page(kcov->area + off);
>>> +       get_page(page);
>>> +       vmf->page = page;
>>> +       return 0;
>>> +}
>>> +
>>> +static void kcov_unmap(struct vm_area_struct *vma)
>>> +{
>>> +       kcov_put(vma->vm_file->private_data);
>>> +}
>>> +
>>> +static void kcov_map_copied(struct vm_area_struct *vma)
>>> +{
>>> +       struct kcov *kcov;
>>> +
>>> +       kcov = vma->vm_file->private_data;
>>> +       atomic_inc(&kcov->rc);
>>> +}
>>> +
>>> +static const struct vm_operations_struct kcov_vm_ops = {
>>> +       .fault = kcov_vm_fault,
>>> +       .close = kcov_unmap,
>>> +       /* Called on fork()/clone() when the mapping is copied. */
>>> +       .open  = kcov_map_copied,
>>> +};
>>> +
>>> +static int kcov_mmap(struct file *filep, struct vm_area_struct *vma)
>>> +{
>>> +       int res = 0;
>>> +       void *area;
>>> +       struct kcov *kcov = vma->vm_file->private_data;
>>> +
>>> +       /* Can't call vmalloc_user() under a spinlock. */
>>> +       area = vmalloc_user(vma->vm_end - vma->vm_start);
>>> +       if (!area)
>>> +               return -ENOMEM;
>>> +
>>> +       spin_lock(&kcov->lock);
>>> +       if (kcov->mode == 0 || vma->vm_pgoff != 0 ||
>>> +           vma->vm_end - vma->vm_start != kcov->size * sizeof(u32)) {
>>> +               res = -EINVAL;
>>> +               goto exit;
>>> +       }
>>> +       if (!kcov->area) {
>>> +               kcov->area = area;
>>> +               area = NULL;
>>> +       }
>>> +       /* The file drops a reference on close, but the file
>>> +        * descriptor can be closed with the mapping still alive so we keep
>>> +        * a reference for those.  This is put in kcov_unmap().
>>> +        */
>>> +       atomic_inc(&kcov->rc);
>>> +       vma->vm_ops = &kcov_vm_ops;
>>> +exit:
>>> +       spin_unlock(&kcov->lock);
>>> +       vfree(area);
>>> +       return res;
>>> +}
>>> +
>>> +static int kcov_open(struct inode *inode, struct file *filep)
>>> +{
>>> +       struct kcov *kcov;
>>> +
>>> +       kcov = kzalloc(sizeof(*kcov), GFP_KERNEL);
>>> +       if (!kcov)
>>> +               return -ENOMEM;
>>> +       atomic_set(&kcov->rc, 1);
>>> +       spin_lock_init(&kcov->lock);
>>> +       filep->private_data = kcov;
>>> +       return nonseekable_open(inode, filep);
>>> +}
>>> +
>>> +static int kcov_close(struct inode *inode, struct file *filep)
>>> +{
>>> +       kcov_put(filep->private_data);
>>> +       return 0;
>>> +}
>>> +
>>> +static int kcov_ioctl_locked(struct kcov *kcov, unsigned int cmd,
>>> +                            unsigned long arg)
>>> +{
>>> +       struct task_struct *t;
>>> +
>>> +       switch (cmd) {
>>> +       case KCOV_INIT_TRACE:
>>> +               /* Enable kcov in trace mode and setup buffer size.
>>> +                * Must happen before anything else.
>>> +                */
>>> +               if (arg < 256 || arg > (128<<20) || arg & (arg - 1))
>>> +                       return -EINVAL;
>>> +               if (kcov->mode != 0)
>>> +                       return -EBUSY;
>>> +               kcov->mode = kcov_mode_trace;
>>> +               kcov->size = arg;
>>> +               return 0;
>>> +       case KCOV_ENABLE:
>>> +               /* Enable coverage for the current task.
>>> +                * At this point the user must have enabled trace mode
>>> +                * and mmapped the file. Coverage collection is disabled only
>>> +                * at task exit or voluntarily by KCOV_DISABLE. After that it can
>>> +                * be enabled for another task.
>>> +                */
>>> +               if (kcov->mode == 0 || kcov->area == NULL)
>>> +                       return -EINVAL;
>>> +               if (kcov->t != NULL)
>>> +                       return -EBUSY;
>>> +               t = current;
>>> +               /* Cache in task struct for performance. */
>>> +               t->kcov_size = kcov->size;
>>> +               t->kcov_area = kcov->area;
>>> +               /* See comment in __sanitizer_cov_trace_pc(). */
>>> +               barrier();
>>> +               WRITE_ONCE(t->kcov_mode, kcov->mode);
>>> +               t->kcov = kcov;
>>> +               kcov->t = t;
>>> +               /* This is put either in kcov_task_exit() or in KCOV_DISABLE. */
>>> +               atomic_inc(&kcov->rc);
>>> +               return 0;
>>> +       case KCOV_DISABLE:
>>> +               /* Disable coverage for the current task. */
>>> +               if (current->kcov != kcov)
>>> +                       return -EINVAL;
>>> +               t = current;
>>> +               BUG_ON(kcov->t != t);
>>> +               kcov_task_init(t);
>>> +               kcov->t = NULL;
>>> +               BUG_ON(atomic_dec_and_test(&kcov->rc));
>>> +               return 0;
>>> +       default:
>>> +               return -EINVAL;
>>> +       }
>>> +}
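
The checks in kcov_ioctl_locked() amount to a small state machine, which can be modelled in user space as below. This is a sketch: struct kcov_model and the model_*() functions are hypothetical, "mapped" stands in for kcov->area != NULL, "busy" stands in for kcov->t != NULL, and the disable check is simplified because the model has no notion of the current task.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct kcov_model {
	int mode;		/* 0 until KCOV_INIT_TRACE succeeds */
	unsigned long size;	/* buffer size in 32-bit words */
	bool mapped;		/* set by the caller once mmap() succeeded */
	bool busy;		/* a task currently has coverage enabled */
};

/* Same size rule as the patch: power of two in [256, 128<<20]. */
static bool kcov_size_ok(unsigned long n)
{
	return n >= 256 && n <= (128ul << 20) && (n & (n - 1)) == 0;
}

static int model_init_trace(struct kcov_model *k, unsigned long n)
{
	assert(k != NULL);
	if (!kcov_size_ok(n))
		return -EINVAL;
	if (k->mode != 0)
		return -EBUSY;
	k->mode = 1;
	k->size = n;
	return 0;
}

static int model_enable(struct kcov_model *k)
{
	assert(k != NULL);
	if (k->mode == 0 || !k->mapped)
		return -EINVAL;
	if (k->busy)
		return -EBUSY;
	k->busy = true;
	return 0;
}

static int model_disable(struct kcov_model *k)
{
	assert(k != NULL);
	if (!k->busy)
		return -EINVAL;
	k->busy = false;
	return 0;
}
```

The model makes the required call order explicit: init, then mmap, then enable/disable pairs, with -EBUSY for a second init or a second concurrent enable.
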
>>> +
>>> +static long kcov_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>>> +{
>>> +       struct kcov *kcov;
>>> +       int res;
>>> +
>>> +       kcov = filep->private_data;
>>> +       spin_lock(&kcov->lock);
>>> +       res = kcov_ioctl_locked(kcov, cmd, arg);
>>> +       spin_unlock(&kcov->lock);
>>> +       return res;
>>> +}
>>> +
>>> +static const struct file_operations kcov_fops = {
>>> +       .open           = kcov_open,
>>> +       .unlocked_ioctl = kcov_ioctl,
>>> +       .mmap           = kcov_mmap,
>>> +       .release        = kcov_close,
>>> +};
>>> +
>>> +static int __init kcov_init(void)
>>> +{
>>> +       if (!debugfs_create_file("kcov", 0666, NULL, NULL, &kcov_fops)) {
>>> +               pr_err("init failed\n");
>>> +               return 1;
>>> +       }
>>> +       return 0;
>>> +}
>>> +
>>> +device_initcall(kcov_init);
>>> diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
>>> index 8e96f6c..f816de9 100644
>>> --- a/kernel/locking/Makefile
>>> +++ b/kernel/locking/Makefile
>>> @@ -1,3 +1,6 @@
>>> +# Any varying coverage in these files is non-deterministic
>>> +# and is generally not a function of system call inputs.
>>> +KCOV_INSTRUMENT := n
>>>
>>>  obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
>>>
>>> diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
>>> index 61a1656..032b2c0 100644
>>> --- a/kernel/rcu/Makefile
>>> +++ b/kernel/rcu/Makefile
>>> @@ -1,3 +1,7 @@
>>> +# Any varying coverage in these files is non-deterministic
>>> +# and is generally not a function of system call inputs.
>>> +KCOV_INSTRUMENT := n
>>> +
>>>  obj-y += update.o sync.o
>>>  obj-$(CONFIG_SRCU) += srcu.o
>>>  obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
>>> diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
>>> index 6768797..f0a9265 100644
>>> --- a/kernel/sched/Makefile
>>> +++ b/kernel/sched/Makefile
>>> @@ -2,6 +2,10 @@ ifdef CONFIG_FUNCTION_TRACER
>>>  CFLAGS_REMOVE_clock.o = $(CC_FLAGS_FTRACE)
>>>  endif
>>>
>>> +# These files are disabled because they produce non-interesting flaky coverage
>>> +# that is not a function of syscall inputs. E.g. involuntary context switches.
>>> +KCOV_INSTRUMENT := n
>>> +
>>>  ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
>>>  # According to Alan Modra <alan@...uxcare.com.au>, the -fno-omit-frame-pointer is
>>>  # needed for x86 only.  Why this used to be enabled for all architectures is beyond
>>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
>>> index c98e93c..cb71e25 100644
>>> --- a/lib/Kconfig.debug
>>> +++ b/lib/Kconfig.debug
>>> @@ -670,6 +670,33 @@ config DEBUG_STACKOVERFLOW
>>>
>>>           If in doubt, say "N".
>>>
>>> +config ARCH_HAS_KCOV
>>> +       bool
>>> +
>>> +if ARCH_HAS_KCOV

Ah! I see it now, I'd missed this if/endif section. I think it
would be more readable to enhance the "config ARCH_HAS_KCOV" to have a
help section that describes what an architecture needs to do to
support KCOV (in this case, "test it at all"), and then instead of the
if/endif wrapping, add ARCH_HAS_KCOV to the "depends on" line:

>>> +
>>> +config KCOV
>>> +       bool "Code coverage for fuzzing"
>>> +       depends on !RANDOMIZE_BASE

e.g.:

    depends on !RANDOMIZE_BASE && ARCH_HAS_KCOV

>>> +       default n
>>
>> Minor nit: "default n" is redundant.
>
> Will address this in a next version.
>
>
>>> +       help
>>> +         KCOV exposes kernel code coverage information in a form suitable
>>> +         for coverage-guided fuzzing (randomized testing).
>>> +
>>> +         RANDOMIZE_BASE is not supported. KCOV exposes PC values that are meant
>>> +         to be stable on different machines and across reboots. RANDOMIZE_BASE
>>> +         breaks this assumption. Potentially it can be supported by subtracting
>>> +         _stext from [_stext, _send), but it is more tricky (and slow) for
>>> +         modules.
>>
>> In the future, it'd be nice if the kASLR conflict were run-time
>> selectable instead of build-time selectable (as done for hibernation).
>
> I think in the future we will just support KASLR one way or another.
> It is required for Android. It will slowdown coverage a bit, but that
> code will be under #ifdef CONFIG_RANDOMIZE_BASE.

Sounds good!
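
For reference, the "subtract _stext" idea from the help text could look
roughly like the sketch below. It is a user-space illustration only:
canonicalize_pc() and its parameters are hypothetical names, not
anything in the patch, and it covers only the core kernel text range,
not modules (which the help text already notes are trickier).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: if pc lies in [text_start, text_end), store its
 * offset from text_start in *off and return 1. The offset does not
 * depend on where the text was loaded. Return 0 for any other pc.
 */
int canonicalize_pc(uintptr_t pc, uintptr_t text_start,
		    uintptr_t text_end, uintptr_t *off)
{
	assert(text_start < text_end);
	if (pc < text_start || pc >= text_end)
		return 0;
	*off = pc - text_start;
	return 1;
}
```

The same instruction then reports the same offset on two boots with
different base addresses, at the cost of a range check and a
subtraction per recorded PC.
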

>
>
>
>>> +
>>> +         KCOV does not have any arch-specific code, but currently it is enabled
>>> +         only for x86_64. KCOV requires testing on other archs, and most likely
>>> +         disabling of instrumentation for some early boot code.
>>
>> I don't see where this is enforced. Should this say "is tested only on
>> x86_64" instead of "enabled"?
>
> It is enforced with ARCH_HAS_KCOV.

Thanks!

-Kees

>
>
> Thanks for the review!
>
>>> +
>>> +         For more details, see Documentation/kcov.txt.
>>> +
>>> +endif
>>> +
>>>  source "lib/Kconfig.kmemcheck"
>>>
>>>  source "lib/Kconfig.kasan"
>>> diff --git a/lib/Makefile b/lib/Makefile
>>> index 7f1de26..bfcc12e 100644
>>> --- a/lib/Makefile
>>> +++ b/lib/Makefile
>>> @@ -7,6 +7,15 @@ ORIG_CFLAGS := $(KBUILD_CFLAGS)
>>>  KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS))
>>>  endif
>>>
>>> +# These files are disabled because they produce lots of non-interesting and/or
>>> +# flaky coverage that is not a function of syscall inputs. For example,
>>> +# rbtree can be global and individual rotations don't correlate with inputs.
>>> +KCOV_INSTRUMENT_string.o := n
>>> +KCOV_INSTRUMENT_rbtree.o := n
>>> +KCOV_INSTRUMENT_list_debug.o := n
>>> +KCOV_INSTRUMENT_debugobjects.o := n
>>> +KCOV_INSTRUMENT_dynamic_debug.o := n
>>> +
>>>  lib-y := ctype.o string.o vsprintf.o cmdline.o \
>>>          rbtree.o radix-tree.o dump_stack.o timerqueue.o\
>>>          idr.o int_sqrt.o extable.o \
>>> diff --git a/mm/Makefile b/mm/Makefile
>>> index 2ed4319..cf751bb 100644
>>> --- a/mm/Makefile
>>> +++ b/mm/Makefile
>>> @@ -5,6 +5,21 @@
>>>  KASAN_SANITIZE_slab_common.o := n
>>>  KASAN_SANITIZE_slub.o := n
>>>
>>> +# These files are disabled because they produce non-interesting and/or
>>> +# flaky coverage that is not a function of syscall inputs. E.g. slab is out of
>>> +# free pages, or a task is migrated between nodes.
>>> +KCOV_INSTRUMENT_slab_common.o := n
>>> +KCOV_INSTRUMENT_slob.o := n
>>> +KCOV_INSTRUMENT_slab.o := n
>>> +KCOV_INSTRUMENT_slub.o := n
>>> +KCOV_INSTRUMENT_page_alloc.o := n
>>> +KCOV_INSTRUMENT_debug-pagealloc.o := n
>>> +KCOV_INSTRUMENT_kmemleak.o := n
>>> +KCOV_INSTRUMENT_kmemcheck.o := n
>>> +KCOV_INSTRUMENT_memcontrol.o := n
>>> +KCOV_INSTRUMENT_mmzone.o := n
>>> +KCOV_INSTRUMENT_vmstat.o := n
>>> +
>>>  mmu-y                  := nommu.o
>>>  mmu-$(CONFIG_MMU)      := gup.o highmem.o memory.o mincore.o \
>>>                            mlock.o mmap.o mprotect.o mremap.o msync.o rmap.o \
>>> diff --git a/mm/kasan/Makefile b/mm/kasan/Makefile
>>> index 6471014..ad97f0b 100644
>>> --- a/mm/kasan/Makefile
>>> +++ b/mm/kasan/Makefile
>>> @@ -1,4 +1,5 @@
>>>  KASAN_SANITIZE := n
>>> +KCOV_INSTRUMENT := n
>>>
>>>  CFLAGS_REMOVE_kasan.o = -pg
>>>  # Function splitter causes unnecessary splits in __asan_load1/__asan_store1
>>> diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
>>> index 79e8661..ebf6f1b 100644
>>> --- a/scripts/Makefile.lib
>>> +++ b/scripts/Makefile.lib
>>> @@ -129,6 +129,12 @@ _c_flags += $(if $(patsubst n%,, \
>>>                 $(CFLAGS_KASAN))
>>>  endif
>>>
>>> +ifeq ($(CONFIG_KCOV),y)
>>> +_c_flags += $(if $(patsubst n%,, \
>>> +       $(KCOV_INSTRUMENT_$(basetarget).o)$(KCOV_INSTRUMENT)y), \
>>> +       $(CFLAGS_KCOV))
>>> +endif
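
As I read the patsubst above, a file gets $(CFLAGS_KCOV) unless the
concatenation of the per-file value, the per-directory value and a
trailing "y" starts with "n". A small C model of that rule, assuming the
variables hold no whitespace (kcov_instrumented() is a hypothetical
name, and unset variables are passed as ""):

```c
#include <assert.h>
#include <stddef.h>

/*
 * per_file: value of KCOV_INSTRUMENT_<file>.o, "" if unset.
 * per_dir:  value of KCOV_INSTRUMENT, "" if unset.
 * Returns 1 when the file would be built with coverage flags.
 */
int kcov_instrumented(const char *per_file, const char *per_dir)
{
	char first;

	assert(per_file != NULL && per_dir != NULL);
	/* First character of per_file + per_dir + "y". */
	if (per_file[0] != '\0')
		first = per_file[0];
	else if (per_dir[0] != '\0')
		first = per_dir[0];
	else
		first = 'y';
	return first != 'n';
}
```

So the per-file setting wins over the directory-wide one, and the
default with neither set is to instrument.
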
>>> +
>>>  # If building the kernel in a separate objtree expand all occurrences
>>>  # of -Idir to -I$(srctree)/dir except for absolute paths (starting with '/').
>>>
>>> --
>>> 2.6.0.rc2.230.g3dd15c0
>>>
>>
>> Very cool! :)
>>
>> -Kees
>>
>> --
>> Kees Cook
>> Chrome OS & Brillo Security



-- 
Kees Cook
Chrome OS & Brillo Security
