Message-Id: <20230928022228.15770-1-xiaobing.li@samsung.com>
Date:   Thu, 28 Sep 2023 10:22:25 +0800
From:   Xiaobing Li <xiaobing.li@...sung.com>
To:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com, axboe@...nel.dk,
        asml.silence@...il.com
Cc:     linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        io-uring@...r.kernel.org, kun.dou@...sung.com,
        peiwei.li@...sung.com, joshi.k@...sung.com,
        kundan.kumar@...sung.com, wenwen.chen@...sung.com,
        ruyi.zhang@...sung.com, Xiaobing Li <xiaobing.li@...sung.com>
Subject: [PATCH 0/3] Sq thread real utilization statistics.

Summary:

The kernel's PELT scheduling algorithm currently computes utilization
from the running time of a thread. However, this can result in wasted
CPU resources for some threads, such as the sq thread in io_uring.
The sq thread runs in a while(1) loop. During this loop there may be
long stretches in which no IO is processed but the timeout period has
not yet expired, so the sqpoll thread keeps running and occupies the
CPU. The CPU time spent this way is wasted.
Our goal is to count the time the sqpoll thread actually spends
processing IO, which reflects the share of its CPU time used for IO.
This figure can help improve actual CPU utilization in the future.
The modifications to the scheduling module are also applicable to other
threads with the same needs.
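
As an illustration of the idea only, the accounting described above can
be sketched in userspace C. This is not the code in the patches; the
names sq_stats, sq_account and sq_real_util_pct are hypothetical, and
the iteration durations are passed in by the caller rather than measured.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counters; the names are not taken from the patches. */
struct sq_stats {
	uint64_t total_ns; /* all time spent in the polling loop */
	uint64_t work_ns;  /* part of that time in which IO was processed */
};

/* Account one loop iteration that lasted iter_ns nanoseconds. */
static void sq_account(struct sq_stats *st, uint64_t iter_ns, bool did_io)
{
	st->total_ns += iter_ns;
	if (did_io)
		st->work_ns += iter_ns;
}

/* Real utilization in percent, rounded down; 0 if nothing was recorded. */
static unsigned int sq_real_util_pct(const struct sq_stats *st)
{
	if (st->total_ns == 0)
		return 0;
	return (unsigned int)(st->work_ns * 100 / st->total_ns);
}
```

With 300 ns of iterations that processed IO and 700 ns of idle polling,
sq_real_util_pct() reports 30, whereas accounting by running time alone
would treat the full 1000 ns as busy.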

We use fio (version 3.28) to test performance. In the experiments, an
fio process is treated as an application that starts jobs with sq_poll
enabled. The tests are run on a host with 256 CPUs and 64G of memory,
the IO is performed on a PM1743 SSD, and the OS is Ubuntu 22.04 with
kernel version 6.4.0.

Some parameters for sequential reading and writing are as follows:
bs=128k, numjobs=1, iodepth=64.
Some parameters for random reading and writing are as follows:
bs=4k, numjobs=16, iodepth=64.

The test results are as follows:
Before modification
         read   write   randread   randwrite
IOPS(K)  53.7   46.1    849        293
BW(MB/S) 7033   6037    3476       1199

After modification
         read   write   randread   randwrite
IOPS(K)  53.7   46.1    847        293
BW(MB/S) 7033   6042    3471       1199

The test results show that the modifications have almost no impact on
performance.
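
To put a number on "almost no impact", the relative change between the
two tables can be computed directly. A minimal sketch follows; the
helper name rel_change_pct is hypothetical and not part of the patches.

```c
/*
 * Relative change from 'before' to 'after', in percent.
 * 'before' must be nonzero.
 */
static double rel_change_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

For example, rel_change_pct(849, 847) for randread IOPS gives about
-0.24%, and rel_change_pct(6037, 6042) for write bandwidth gives about
+0.08%.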

Xiaobing Li (3):
  SCHEDULER: Add an interface for counting real utilization.
  PROC FILESYSTEM: Add real utilization data of sq thread.
  IO_URING: Statistics of the true utilization of sq threads.

 fs/proc/stat.c              | 25 ++++++++++++++++++++++++-
 include/linux/kernel.h      |  7 ++++++-
 include/linux/kernel_stat.h |  3 +++
 include/linux/sched.h       |  1 +
 io_uring/sqpoll.c           | 26 +++++++++++++++++++++++++-
 kernel/sched/cputime.c      | 36 +++++++++++++++++++++++++++++++++++-
 kernel/sched/pelt.c         | 14 ++++++++++++++
 7 files changed, 108 insertions(+), 4 deletions(-)

-- 
2.34.1
