linux-kernel - [PATCH 0/3] Sq thread real utilization statistics.

Message-Id: <20230928022228.15770-1-xiaobing.li@samsung.com>
Date:   Thu, 28 Sep 2023 10:22:25 +0800
From:   Xiaobing Li <xiaobing.li@...sung.com>
To:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com, axboe@...nel.dk,
        asml.silence@...il.com
Cc:     linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        io-uring@...r.kernel.org, kun.dou@...sung.com,
        peiwei.li@...sung.com, joshi.k@...sung.com,
        kundan.kumar@...sung.com, wenwen.chen@...sung.com,
        ruyi.zhang@...sung.com, Xiaobing Li <xiaobing.li@...sung.com>
Subject: [PATCH 0/3] Sq thread real utilization statistics.

Summary:

The current kernel's PELT scheduling algorithm is calculated from the
running time of a thread. For some threads, such as the sq thread in
io_uring, this can waste CPU resources.
The sq thread runs in a while(1) loop. Within that loop there may be
long stretches in which no IO is processed but the timeout period has
not yet expired, so the sqpoll thread keeps running and occupies the
CPU. The CPU is wasted during that time.
Our goal is to count the time the sqpoll thread actually spends
processing IO, and so reflect the share of its CPU time used to process
IO. This can help improve the actual utilization of the CPU in the
future.
The modifications to the scheduling module also apply to other threads
with the same needs.
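
To make the distinction between running time and IO-processing time
concrete, here is a minimal userspace sketch of the accounting idea. It
is not the code from this series: the struct, its fields, and both
helper names are hypothetical, and timestamps are passed in by the
caller rather than read from a kernel clock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accounting record; the names are illustrative only and
 * are not taken from the patches. */
struct sq_util {
    uint64_t total_ns; /* all time spent inside the poll loop */
    uint64_t work_ns;  /* the part spent actually processing IO */
};

/* Account one loop iteration that ran from start_ns to end_ns.
 * did_io is nonzero when the iteration processed at least one request. */
static void sq_util_account(struct sq_util *u, uint64_t start_ns,
                            uint64_t end_ns, int did_io)
{
    assert(end_ns >= start_ns);
    uint64_t delta = end_ns - start_ns;

    u->total_ns += delta;
    if (did_io)
        u->work_ns += delta;
}

/* Real utilization as an integer percentage; 0 if nothing was recorded.
 * Assumes work_ns is small enough that work_ns * 100 does not overflow. */
static unsigned int sq_util_percent(const struct sq_util *u)
{
    if (u->total_ns == 0)
        return 0;
    return (unsigned int)(u->work_ns * 100 / u->total_ns);
}
```

With this sketch, a thread that processes IO for 100 ns and then spins
idle for 300 ns has run for 400 ns, but its real utilization is 25%.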

We use fio (version 3.28) to test performance. In the experiments, an
fio process is treated as an application that starts jobs with sq_poll
enabled. The tests are performed on a host with 256 CPUs and 64G memory;
the IO tasks are performed on a PM1743 SSD, and the OS is Ubuntu 22.04
with kernel version 6.4.0.

Some parameters for sequential reading and writing are as follows:
bs=128k, numjobs=1, iodepth=64.
Some parameters for random reading and writing are as follows:
bs=4k, numjobs=16, iodepth=64.

The test results are as follows:
Before modification
         read   write   randread   randwrite
IOPS(K)  53.7   46.1    849        293
BW(MB/s) 7033   6037    3476       1199

After modification
         read   write   randread   randwrite
IOPS(K)  53.7   46.1    847        293
BW(MB/s) 7033   6042    3471       1199

The test results show that the modifications have almost no impact on
performance: every figure changes by less than 0.3%.
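
The size of the differences between the two tables can be checked with
a small helper. This is only an arithmetic sketch; the function name is
hypothetical and the inputs are the figures from the tables above.

```c
#include <assert.h>

/* Relative change of `after` compared with `before`, in percent.
 * A negative result means the value dropped after the modification. */
static double rel_change_pct(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```

For example, rel_change_pct(849, 847) for randread IOPS gives roughly
-0.24, and rel_change_pct(6037, 6042) for write bandwidth gives roughly
+0.08.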

Xiaobing Li (3):
  SCHEDULER: Add an interface for counting real utilization.
  PROC FILESYSTEM: Add real utilization data of sq thread.
  IO_URING: Statistics of the true utilization of sq threads.

 fs/proc/stat.c              | 25 ++++++++++++++++++++++++-
 include/linux/kernel.h      |  7 ++++++-
 include/linux/kernel_stat.h |  3 +++
 include/linux/sched.h       |  1 +
 io_uring/sqpoll.c           | 26 +++++++++++++++++++++++++-
 kernel/sched/cputime.c      | 36 +++++++++++++++++++++++++++++++++++-
 kernel/sched/pelt.c         | 14 ++++++++++++++
 7 files changed, 108 insertions(+), 4 deletions(-)

-- 
2.34.1