Date:   Wed, 12 Dec 2018 17:47:55 +0000
From:   Edward Cree <ecree@...arflare.com>
To:     Nadav Amit <namit@...are.com>, Josh Poimboeuf <jpoimboe@...hat.com>
CC:     <linux-kernel@...r.kernel.org>, <x86@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>
Subject: [RFC/WIP PATCH 0/2] dynamic calls

A fix to the static_calls series (on which this series depends), and a really
 hacky proof-of-concept of runtime-patched branch trees of static_calls to
 avoid indirect calls / retpolines in the hot-path.  Rather than any generally
 applicable machinery, the patch just open-codes it for one call site (the
 pt_prev->func() call in deliver_skb and __netif_receive_skb_one_core()); it
 should however be possible to make a macro that takes a 'name' parameter and
 expands to the whole thing.  Also the _update() function could be shared and
 get something useful from its work_struct, rather than needing a separate
 copy of the function for every indirect call site.
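
To make the idea concrete, here is a minimal userspace sketch of such a branch
 tree.  It is not the code from the patch: the handler type, the handler
 names and dynamic_call_sketch() are all hypothetical, and the direct-call
 targets are fixed at compile time, whereas the series would patch them at
 runtime via static_calls.

```c
#include <stddef.h>

/* Hypothetical handler type standing in for pt_prev->func(). */
typedef int (*handler_fn)(int pkt);

/* Hypothetical handlers; the return values only make the path observable. */
static int handle_ip(int pkt)    { return pkt + 1; }
static int handle_arp(int pkt)   { return pkt + 2; }
static int handle_other(int pkt) { return pkt + 3; }

/*
 * Compare the function pointer against a small set of known targets and
 * call the matching one directly.  Anything not in the tree falls through
 * to the ordinary indirect call (the slow path, which is where a retpoline
 * would be paid in the kernel).
 */
static int dynamic_call_sketch(handler_fn fn, int pkt)
{
	if (fn == handle_ip)
		return handle_ip(pkt);	/* direct call */
	if (fn == handle_arp)
		return handle_arp(pkt);	/* direct call */
	return fn(pkt);			/* fallback: indirect call */
}
```

Calling dynamic_call_sketch(handle_other, pkt) takes the fallback path, which
 is also what every caller would do while the tree is being repatched.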

Performance testing so far has been somewhat inconclusive; I applied this on
 net-next, hacked up my Kconfig to use out-of-line static calls on x86-64, and
 ran some 1-byte UDP stream tests with the DUT receiving.
On a single stream test, I saw packet rate go up by 7%, from 470Kpps to
 504Kpps, with a considerable reduction in variance; however, CPU usage
 increased by a larger factor: (packet rate / RX cpu) is a much lower-variance
 measurement and went down by 13%.  This may, however, be because it often got
 into a state where, while patching the calls (and thus sending all callers
 down the slow path), it continued to gather stats and saw enough calls to
 trigger another update; as there's no code to detect and skip an update that
 doesn't change anything, it got into a tight loop of redoing updates.  I am
 working on this & plan to change it to not collect any stats while an update
 is actually in progress.
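
A rough single-threaded sketch of the two mitigations mentioned above
 (skipping an update that would change nothing, and not counting calls while
 an update is in progress) might look like the following.  The struct, field
 and function names are hypothetical and not taken from the patch, and a real
 implementation would need proper synchronisation rather than a plain bool.

```c
#include <stdbool.h>
#include <string.h>

#define NR_SLOTS 2	/* hypothetical number of direct-call slots in the tree */

struct dyn_call_state {
	const void *current[NR_SLOTS];	/* targets currently patched in */
	unsigned long hits;		/* calls counted since the last update */
	bool updating;			/* true while patching is in progress */
	unsigned int updates_done;	/* illustration only: real patch count */
};

/* Count a call unless an update is in progress.  Returns true if counted. */
static bool dyn_call_note_hit(struct dyn_call_state *s)
{
	if (s->updating)
		return false;
	s->hits++;
	return true;
}

/*
 * Install the wanted targets, but only if they differ from what is already
 * patched in.  Returns true if any patching was actually done.
 */
static bool dyn_call_update(struct dyn_call_state *s,
			    const void *wanted[NR_SLOTS])
{
	bool patched = false;

	s->updating = true;
	if (memcmp(s->current, wanted, sizeof(s->current)) != 0) {
		memcpy(s->current, wanted, sizeof(s->current));
		s->updates_done++;
		patched = true;
	}
	s->hits = 0;		/* start a fresh measurement window */
	s->updating = false;
	return patched;
}
```

With this shape, a second dyn_call_update() with the same targets returns
 false without repatching, so it cannot by itself feed another update cycle.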
On a 4-stream test, the variance I saw was too high to draw any conclusions;
 the packet rate went down about 2½% but this was not statistically
 significant (and the fastest run I saw was with dynamic calls present).

Edward Cree (2):
  static_call: fix out-of-line static call implementation
  net: core: rather hacky PoC implementation of dynamic calls

 include/linux/static_call.h |   6 +-
 net/core/dev.c              | 222 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 221 insertions(+), 7 deletions(-)
