Date:   Sun, 29 Aug 2021 18:29:53 +0900
From:   Masami Hiramatsu <mhiramat@...nel.org>
To:     wuqiang <wuqiang.matt@...edance.com>
Cc:     naveen.n.rao@...ux.ibm.com, anil.s.keshavamurthy@...el.com,
        davem@...emloft.net, mingo@...nel.org, peterz@...radead.org,
        linux-kernel@...r.kernel.org, mattwu@....com
Subject: Re: [PATCH 0/2] *** kretprobe scalability improvement ***

Hi,

On Sun,  8 Aug 2021 02:54:15 +0800
wuqiang <wuqiang.matt@...edance.com> wrote:

> kretprobe uses a freelist to manage return instances, but the freelist,
> a LIFO queue based on a singly linked list, scales badly and thus lowers
> the throughput of kretprobed routines, especially under high parallelism.
> Here's a typical result (XEON 8260: 2 sockets/48 cores/96 threads):
> 
>       1X       2X       4X       6X       8X      12X     16X
> 10880312 18121228 23214783 13155457 11190217 10991228 9623992
>      24X      32X      48X      64X      96X     128X    192X
>  8484455  8376786  6766684  5698349  4113405  4528009 4081401
> 
> This patch implements a scalable, lock-less and NUMA-aware object pool
> and as a result improves kretprobe to achieve near-linear scalability.
> Tests of kretprobe throughput show the biggest gain as 181.5x of the
> original freelist. The extreme tests of raw queue throughput can be up
> to 282.8x of gain. The comparison results are as follows:
> 
>                   1X         2X         4X         8X        16X
> freelist:  237911411  163596418   33048459   15506757   10640043
> objpool:   234799081  294086132  585290693 1164205947 2334923746
>                  24X        32X        48X        64X        96X
> freelist:    9025299    7965531    6800225    5507639    4284752
> objpool:  3508905695 1106760339 1101385147 1221763856 1211654038
> 
> The object pool is a percpu-extended version of the original freelist,
> with a compact memory footprint and balanced performance results for
> 3 test cases: nonblockable retrieval (most kretprobe cases), bulk
> retrieval in a row (multiple-threaded blockable kretprobe), huge
> misses (preallocated objects much less than required).

Sorry, I missed this series.
I'm OK with the code, but please combine these 2 patches into 1 because
they are not bisectable as separate patches.

Thank you,

> 
> wuqiang (2):
>   scalable lock-less object pool implementation
>   kretprobe: manage instances with scalable object pool
> 
>  include/linux/freelist.h | 521 ++++++++++++++++++++++++++++++++++++---
>  include/linux/kprobes.h  |   2 +-
>  kernel/kprobes.c         |  83 ++++---
>  3 files changed, 536 insertions(+), 70 deletions(-)
> 
> -- 
> 2.25.1
> 


-- 
Masami Hiramatsu <mhiramat@...nel.org>
