linux-kernel - A new spinlock for multicore (>16) platform


Message-ID: <CAOpXEOwh6r0Fo_5hgXET6gGHTUxTdE1aDkp_K0ga8BwGSDMX+A@mail.gmail.com>
Date: Mon, 15 Jul 2024 01:07:40 +0800
From: Shi-Wu, Lo（Gmail） <shiwulo@...il.com>
To: linux-kernel@...r.kernel.org
Subject: A new spinlock for multicore (>16) platform

Dear Linux Contributors,
I am a Linux enthusiast from Taiwan, and I hope to contribute to the
Linux kernel. We have developed a new spinlock method that has been
validated on AMD 64-core and AMD 32-core processors. Compared to
previous methods, this new method is optimized in the following areas:

Motivation and Approaches:
1. As the number of cores increases, the data transmission paths
between cores need finer-grained optimization.
2. Data transmission between cores is usually wrapped in a lock/unlock
pair.
3. Performance can be improved by using a shortest-path approximation
algorithm.
A detailed introduction to this method can be found in the following paper:
https://www.usenix.org/conference/osdi23/presentation/lo
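
To make item 3 more concrete, here is a minimal sketch of one possible
shortest-path approximation: a greedy nearest-neighbour ordering of
cores. It is for illustration only and is not necessarily the algorithm
used in the paper. The function name nn_route, the MAX_CORES limit, and
the core-to-core cost matrix are all hypothetical; in practice the
costs would have to be measured on the target processor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CORES 64	/* hypothetical upper bound for this sketch */

/*
 * Greedy nearest-neighbour ordering: start at core `start`, then
 * repeatedly hop to the cheapest core that has not been visited yet.
 * `cost` is a hypothetical n x n row-major matrix of core-to-core
 * transfer costs. The visiting order is written to `order`, and the
 * cost of the closed tour (including the hop back to `start`) is
 * returned.
 */
unsigned long nn_route(const unsigned *cost, size_t n, size_t start,
		       unsigned *order)
{
	bool visited[MAX_CORES] = { false };
	unsigned long total = 0;
	size_t cur = start;

	assert(n > 0 && n <= MAX_CORES && start < n);

	visited[cur] = true;
	order[0] = (unsigned)cur;

	for (size_t step = 1; step < n; step++) {
		size_t best = n;	/* n means "no candidate yet" */

		for (size_t cand = 0; cand < n; cand++) {
			if (visited[cand])
				continue;
			if (best == n ||
			    cost[cur * n + cand] < cost[cur * n + best])
				best = cand;
		}
		visited[best] = true;
		order[step] = (unsigned)best;
		total += cost[cur * n + best];
		cur = best;
	}

	/* Close the circle: hop from the last core back to the first. */
	total += cost[cur * n + start];
	return total;
}
```

A greedy ordering like this is cheap to compute but gives no optimality
guarantee; it only shows how a cost matrix can be turned into a core
order of the kind written to /proc/routing_path below.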

Our laboratory is currently developing a system that can apply the
same optimization strategy to all multi-core processors. Below is our
plan.

The New Method and Its Compatibility with qspinlock:
1. The default algorithm in the Linux kernel remains qspinlock.
2. A new file, /proc/routing_path, is created, to which a shortest path
can be written, for example:
echo 1,2,3,4,16,17,18,19,5,6,7,8,11,12,13,14 | sudo tee /proc/routing_path
3. Once a shortest path has been written, the kernel switches to using
the RON algorithm.
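
The kernel side of this interface has not been written yet. As a rough
userspace sketch of what handling such a write could involve, the
function below validates a comma-separated core list like the one in
step 2 and turns it into a table that maps each core to its position in
the path. The name parse_routing_path and the table layout are
assumptions made for illustration, not part of any existing code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse a list such as "1,2,3,4,16" into pos[], where pos[core] is the
 * index of that core in the path and -1 means the core is not listed.
 * A single trailing newline (as produced by echo) is accepted.
 * Returns the number of cores listed, or -1 if the input is malformed,
 * names a core >= ncores, or lists the same core twice.
 */
int parse_routing_path(const char *s, int *pos, size_t ncores)
{
	int count = 0;

	for (size_t i = 0; i < ncores; i++)
		pos[i] = -1;

	while (*s != '\0' && *s != '\n') {
		char *end;
		unsigned long core;

		/* Reject signs, spaces, and empty fields up front. */
		if (*s < '0' || *s > '9')
			return -1;

		errno = 0;
		core = strtoul(s, &end, 10);
		if (errno != 0 || core >= ncores || pos[core] != -1)
			return -1;

		pos[core] = count++;
		s = end;

		if (*s == ',') {
			s++;
			if (*s == '\0' || *s == '\n')
				return -1;	/* trailing comma */
		} else if (*s != '\0' && *s != '\n') {
			return -1;		/* unexpected character */
		}
	}
	return count;
}
```

Rejecting duplicates and out-of-range core numbers at this stage would
let the lock code assume that the path is a valid ordering of distinct
cores.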

Expected Outcomes:
According to our measurements on AMD 32-core and AMD 64-core
processors, Google LevelDB can achieve a 3-4% speed improvement.

Comparison with Previous NUMA-aware algorithms:
NUMA-aware algorithms target systems that may contain multiple
processors, where the communication cost between processors is much
higher than the communication cost between cores within the same
processor. Our method instead focuses on the many cores within a single
processor, making it multicore-aware. In a multicore environment, a
NUMA-aware algorithm is not as effective as a multicore-aware
algorithm. (Please refer to the paper,
https://www.usenix.org/conference/osdi23/presentation/lo)

Assistance Needed:
I would like to understand if the Linux kernel community is interested
in this new spinlock method. As a teacher, I cannot complete all the
work by myself. Is anyone willing to collaborate with me on this
project?

Sorry to bother you:
I apologize for taking up so much of your time with this letter.
Although I am quite old, this is the first time I feel that my
research results are good enough to contribute to the Linux community.
I have read the relevant documentation, and it made me realize that my
time and abilities are insufficient to write the high-quality code
required by the Linux community. Therefore, I ask for your guidance.

All the best to you all

shiwu