Message-Id: <20210829182953.1a05f40bdc5c82ff3a997d69@kernel.org>
Date: Sun, 29 Aug 2021 18:29:53 +0900
From: Masami Hiramatsu <mhiramat@...nel.org>
To: wuqiang <wuqiang.matt@...edance.com>
Cc: naveen.n.rao@...ux.ibm.com, anil.s.keshavamurthy@...el.com,
davem@...emloft.net, mingo@...nel.org, peterz@...radead.org,
linux-kernel@...r.kernel.org, mattwu@....com
Subject: Re: [PATCH 0/2] *** kretprobe scalability improvement ***
Hi,
On Sun, 8 Aug 2021 02:54:15 +0800
wuqiang <wuqiang.matt@...edance.com> wrote:
> kretprobe uses a freelist to manage return instances, but the freelist,
> a LIFO queue based on a singly linked list, scales badly and thus lowers
> the throughput of kretprobed routines, especially under high parallelism.
> Here's a typical result (XEON 8260: 2 sockets/48 cores/96 threads):
>
>         1X        2X        4X        6X        8X       12X       16X
>   10880312  18121228  23214783  13155457  11190217  10991228   9623992
>        24X       32X       48X       64X       96X      128X      192X
>    8484455   8376786   6766684   5698349   4113405   4528009   4081401
>
> This patch implements a scalable, lock-less and NUMA-aware object pool
> and as a result improves kretprobe to achieve near-linear scalability.
> Tests of kretprobe throughput show a maximum gain of 181.5x over the
> original freelist. The extreme tests of raw queue throughput show a gain
> of up to 282.8x. The comparison results are as follows:
>
>                    1X          2X          4X          8X         16X
> freelist:   237911411   163596418    33048459    15506757    10640043
> objpool:    234799081   294086132   585290693  1164205947  2334923746
>                   24X         32X         48X         64X         96X
> freelist:     9025299     7965531     6800225     5507639     4284752
> objpool:   3508905695  1106760339  1101385147  1221763856  1211654038
>
> The object pool is a percpu-extended version of the original freelist,
> with a compact memory footprint and balanced performance results for
> 3 test cases: nonblockable retrieval (most kretprobe cases), bulk
> retrieval in a row (multiple-threaded blockable kretprobe), and huge
> misses (far fewer preallocated objects than required).
Sorry, I missed this series.
I'm OK with the code, but please combine these 2 patches into 1, because
they are not bisectable.
Thank you,
>
> wuqiang (2):
>   scalable lock-less object pool implementation
>   kretprobe: manage instances with scalable object pool
>
>  include/linux/freelist.h | 521 ++++++++++++++++++++++++++++++++++++---
>  include/linux/kprobes.h  |   2 +-
>  kernel/kprobes.c         |  83 ++++---
>  3 files changed, 536 insertions(+), 70 deletions(-)
>
> --
> 2.25.1
>
--
Masami Hiramatsu <mhiramat@...nel.org>