Message-Id: <20250530154250.15caab4e3991de779aabe02c@linux-foundation.org>
Date: Fri, 30 May 2025 15:42:50 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Bo Li <libo.gcs85@...edance.com>
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, luto@...nel.org,
kees@...nel.org, david@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, peterz@...radead.org, dietmar.eggemann@....com,
hpa@...or.com, acme@...nel.org, namhyung@...nel.org, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org, irogers@...gle.com,
adrian.hunter@...el.com, kan.liang@...ux.intel.com,
viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz,
lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, vbabka@...e.cz,
rppt@...nel.org, surenb@...gle.com, mhocko@...e.com, rostedt@...dmis.org,
bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com, jannh@...gle.com,
pfalcato@...e.de, riel@...riel.com, harry.yoo@...cle.com,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
duanxiongchun@...edance.com, yinhongbo@...edance.com,
dengliang.1214@...edance.com, xieyongji@...edance.com,
chaiwen.cc@...edance.com, songmuchun@...edance.com, yuanzhu@...edance.com,
chengguozhu@...edance.com, sunjiadong.lff@...edance.com
Subject: Re: [RFC v2 00/35] optimize cost of inter-process communication
On Fri, 30 May 2025 17:27:28 +0800 Bo Li <libo.gcs85@...edance.com> wrote:
> During testing, the client transmitted 1 million 32-byte messages, and we
> computed the per-message average latency. The results are as follows:
>
> *****************
> Without RPAL: Message length: 32 bytes, Total TSC cycles: 19616222534,
> Message count: 1000000, Average latency: 19616 cycles
> With RPAL: Message length: 32 bytes, Total TSC cycles: 1703459326,
> Message count: 1000000, Average latency: 1703 cycles
> *****************
>
> These results confirm that RPAL delivers substantial latency improvements
> over the current epoll implementation—achieving a 17,913-cycle reduction
> (an ~91.3% improvement) for 32-byte messages.
Noted ;)

Quick question:
> arch/x86/Kbuild                      |   2 +
> arch/x86/Kconfig                     |   2 +
> arch/x86/entry/entry_64.S            | 160 ++
> arch/x86/events/amd/core.c           |  14 +
> arch/x86/include/asm/pgtable.h       |  25 +
> arch/x86/include/asm/pgtable_types.h |  11 +
> arch/x86/include/asm/tlbflush.h      |  10 +
> arch/x86/kernel/asm-offsets.c        |   3 +
> arch/x86/kernel/cpu/common.c         |   8 +-
> arch/x86/kernel/fpu/core.c           |   8 +-
> arch/x86/kernel/nmi.c                |  20 +
> arch/x86/kernel/process.c            |  25 +-
> arch/x86/kernel/process_64.c         | 118 +
> arch/x86/mm/fault.c                  | 271 ++
> arch/x86/mm/mmap.c                   |  10 +
> arch/x86/mm/tlb.c                    | 172 ++
> arch/x86/rpal/Kconfig                |  21 +
> arch/x86/rpal/Makefile               |   6 +
> arch/x86/rpal/core.c                 | 477 ++++
> arch/x86/rpal/internal.h             |  69 +
> arch/x86/rpal/mm.c                   | 426 +++
> arch/x86/rpal/pku.c                  | 196 ++
> arch/x86/rpal/proc.c                 | 279 ++
> arch/x86/rpal/service.c              | 776 ++++++
> arch/x86/rpal/thread.c               | 313 +++
The changes are very x86-heavy. Is that a necessary thing? Would
another architecture need to implement a similar amount to enable RPAL?
IOW, how much of the above could be made arch-neutral?