Date:   Wed, 13 Apr 2022 21:26:43 +0200
From:   andrey.konovalov@...ux.dev
To:     Marco Elver <elver@...gle.com>,
        Alexander Potapenko <glider@...gle.com>,
        Mark Rutland <mark.rutland@....com>
Cc:     Andrey Konovalov <andreyknvl@...il.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Andrey Ryabinin <ryabinin.a.a@...il.com>,
        kasan-dev@...glegroups.com,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Sami Tolvanen <samitolvanen@...gle.com>,
        linux-arm-kernel@...ts.infradead.org,
        Peter Collingbourne <pcc@...gle.com>,
        Evgenii Stepanov <eugenis@...gle.com>,
        Florian Mayer <fmayer@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Andrey Konovalov <andreyknvl@...gle.com>
Subject: [PATCH v3 0/3] kasan, arm64, scs: collect stack traces from Shadow Call Stack

From: Andrey Konovalov <andreyknvl@...gle.com>

Currently, when saving alloc and free stack traces, KASAN uses the normal
stack trace collection routines, which rely on the unwinder.

Instead of invoking the unwinder, collect the stack trace by copying
frames from the Shadow Call Stack. This reduces boot time by ~30% for
all KASAN modes when Shadow Call Stack is enabled. See below for the
details of how the measurements were performed.

Stack traces are collected from the Shadow Call Stack via a new
stack_trace_save_shadow() interface.

Note that the implementation is best-effort and only works in certain
contexts. See patch #3 for details.
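
As a rough illustration of the approach, a simplified, task-context-only
sketch could look like the following (the names, signature, and error
handling are illustrative assumptions, not the actual code; the real
implementation is in patch #2):

/* Illustrative sketch only: interrupt contexts and error handling omitted. */
#include <linux/compiler.h>	/* READ_ONCE_NOCHECK() */
#include <linux/sched.h>	/* current */
#include <linux/scs.h>		/* task_scs() */
#include <asm/pointer_auth.h>	/* ptrauth_strip_insn_pac() */

static size_t save_shadow_sketch(unsigned long *store, size_t size)
{
	unsigned long *scs_top, *scs_base, *frame;
	size_t len = 0;

	/* On arm64, x18 holds the current top of the Shadow Call Stack. */
	asm volatile("mov %0, x18" : "=r" (scs_top));

	/* Base of the current task's SCS. */
	scs_base = task_scs(current);

	/* Copy saved return addresses, innermost frame first. */
	for (frame = scs_top - 1; frame >= scs_base && len < size; frame--) {
		unsigned long entry = READ_ONCE_NOCHECK(*frame);

		/* Return addresses on the SCS may carry a PAC; strip it. */
		store[len++] = ptrauth_strip_insn_pac(entry);
	}

	return len;
}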

---

Changes
=======

v2->v3:
- Limit hardirq and drop SDEI support for performance and simplicity.
- Move stack_trace_save_shadow() implementation back to mm/kasan:
  it's not mature enough to be used as a system-wide stack trace
  collection replacement.
- Clarify -ENOSYS return value from stack_trace_save_shadow().
- Don't rename nr_entries to size in kasan_save_stack().
- Check return value of stack_trace_save_shadow() instead of checking
  CONFIG_HAVE_SHADOW_STACKTRACE in kasan_save_stack() (sketched below).
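
A rough sketch of the resulting fallback logic for the last item (the
signatures are assumptions, not copied from the patch; KASAN_STACK_DEPTH
comes from mm/kasan/kasan.h):

/* Illustrative only: try the shadow stack first, then fall back. */
#include <linux/kernel.h>	/* ARRAY_SIZE() */
#include <linux/stackdepot.h>	/* depot_stack_handle_t, __stack_depot_save() */
#include <linux/stacktrace.h>	/* stack_trace_save() */

static depot_stack_handle_t save_stack_sketch(gfp_t flags, bool can_alloc)
{
	unsigned long entries[KASAN_STACK_DEPTH];
	long nr_entries;

	/* Assumed to return a negative error (e.g. -ENOSYS) when unusable. */
	nr_entries = stack_trace_save_shadow(entries, ARRAY_SIZE(entries));
	if (nr_entries < 0)
		/* Fall back to the regular unwinder-based collection. */
		nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0);

	return __stack_depot_save(entries, nr_entries, flags, can_alloc);
}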

v1->v2:
- Provide a kernel-wide stack_trace_save_shadow() interface for collecting
  stack traces from shadow stack.
- Use ptrauth_strip_insn_pac() and READ_ONCE_NOCHECK, see the comments.
- Get the SCS pointer from x18, as the per-task value is only meant to save
  the SCS value on CPU switches.
- Collect stack frames from SDEI and IRQ contexts.

Perf
====

To measure performance impact, I used QEMU in full system emulation mode
on an x86-64 host.

As proposed by Mark, I passed no filesystem to QEMU and booted with panic=-1:

qemu-system-aarch64 \
	-machine virt,mte=on -cpu max \
	-m 2G -smp 1 -nographic \
	-kernel ./xbins/Image \
	-append "console=ttyAMA0 earlyprintk=serial panic=-1" \
	-no-shutdown -no-reboot

Just in case, the QEMU version is:

$ qemu-system-aarch64 --version
QEMU emulator version 6.2.94 (v5.2.0-rc3-12124-g81c7ed41a1)
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

Then, I recorded the timestamp of when the "Kernel panic" line was printed
to the kernel log.

The measurements were done on 5 kernel flavors:

master                 (mainline commit a19944809fe99)
master-no-stack-traces (stack trace collection commented out)
master-no-stack-depot  (saving to stack depot commented out)
up-scs-stacks-v3       (collecting stack traces from SCS)
up-scs-stacks-v3-noscs (up-scs-stacks-v3 with __noscs marking)

(The last flavor is included just for the record: it produces an unexpected
 slowdown. The likely reason is that helper functions stop getting inlined.)

All the branches can be found here:

https://github.com/xairy/linux/branches/all

The measurements were performed for Generic and HW_TAGS KASAN modes.

The .configs are here (essentially, defconfig + SCS + KASAN):

Generic KASAN: https://gist.github.com/xairy/d527ad31c0b54898512c92898d62beed
HW_TAGS KASAN: https://gist.github.com/xairy/390e4ef0140de3f4f9a49efe20708d21

The results (boot time in seconds, measured as described above):

Generic KASAN
-------------

master-no-stack-traces: 8.03
master:                 11.55 (+43.8%)
master-no-stack-depot:  11.53 (+43.5%)
up-scs-stacks-v3:       8.31  (+3.4%)
up-scs-stacks-v3-noscs: 9.11  (+13.4%)

HW_TAGS KASAN
-------------

master-no-stack-traces: 3.31
master:                 5.01 (+51%)
master-no-stack-depot:  4.85 (+47%)
up-scs-stacks-v3:       3.49 (+5.4%)
up-scs-stacks-v3-noscs: 4.27 (+29%)

The deviation for all numbers above is ~0.05.

As can be seen, the up-scs-stacks-v3 flavor results in a significantly
faster boot compared to master.

Andrey Konovalov (3):
  arm64, scs: expose irq_shadow_call_stack_ptr
  kasan, arm64: implement stack_trace_save_shadow
  kasan: use stack_trace_save_shadow

 arch/arm64/include/asm/scs.h | 10 +++++-
 arch/arm64/kernel/irq.c      |  4 +--
 arch/arm64/kernel/sdei.c     |  3 --
 mm/kasan/common.c            | 66 +++++++++++++++++++++++++++++++++++-
 4 files changed, 75 insertions(+), 8 deletions(-)

-- 
2.25.1
