linux-kernel - Re: [PATCH v6] io_uring: Statistics of the true utilization of sq threads.


Date: Fri, 5 Jan 2024 04:02:38 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: Xiaobing Li <xiaobing.li@...sung.com>, axboe@...nel.dk
Cc: linux-kernel@...r.kernel.org, io-uring@...r.kernel.org,
 kun.dou@...sung.com, peiwei.li@...sung.com, joshi.k@...sung.com,
 kundan.kumar@...sung.com, wenwen.chen@...sung.com, ruyi.zhang@...sung.com
Subject: Re: [PATCH v6] io_uring: Statistics of the true utilization of sq
 threads.

On 1/3/24 05:49, Xiaobing Li wrote:
> On 12/30/23 9:27 AM, Pavel Begunkov wrote:
>> Why it uses jiffies instead of some task run time?
>> Consequently, why it's fine to account irq time and other
>> preemption? (hint, it's not)
>>
>> Why it can't be done with userspace and/or bpf? Why
>> can't it be estimated by checking and tracking
>> IORING_SQ_NEED_WAKEUP in userspace?
>>
>> What's the use case in particular? Considering that
>> one of the previous revisions was uapi-less, something
>> is really fishy here. Again, it's a procfs file nobody
>> but a few would want to parse to use the feature.
>>
>> Why it just keeps aggregating stats for the whole
>> life time of the ring? If the workload changes,
>> that would either totally screw the stats or would make
>> it too inert to be useful. That's especially relevant
>> for long running (days) processes. There should be a
>> way to reset it so it starts counting anew.
> 
> Hi, Jens and Pavel,
> I carefully read the questions you raised.
> First of all, as to why I use jiffies to measure time, it
> is because I have done some performance tests and found that
> using jiffies has a relatively smaller loss of performance
> than using task run time. Of course, using task run time is

What does taking a measurement of task runtime look like? I'd expect
it to be a simple read of a variable inside task_struct, maybe with
READ_ONCE, in which case the overhead shouldn't be realistically
measurable. Does it need locking?

> indeed more accurate.  But in fact, our requirements for
> accuracy are not particularly high, so after comprehensive

I'm looking at it as a generic feature for everyone, and the
accuracy depends on circumstances. High-load networking spends
a good share of CPU time in softirq, and how much preemption
there is depends on config, scheduling, pinning, etc.
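
To illustrate why a wall-clock style measure and a run-time measure
can diverge, here is a userspace analogy only, not kernel code. It
compares CLOCK_MONOTONIC with CLOCK_THREAD_CPUTIME_ID around a sleep.
The names clock_ns(), struct interval and measure_sleep() are made up
for this sketch; time the thread spends off the CPU shows up in the
first number but not in the second.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read a POSIX clock and return its value in nanoseconds. */
static uint64_t clock_ns(clockid_t id)
{
    struct timespec ts;
    int ret = clock_gettime(id, &ts);

    assert(ret == 0);
    (void)ret;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical result type: elapsed wall time vs. CPU time of this thread. */
struct interval {
    uint64_t wall_ns;
    uint64_t cpu_ns;
};

/* Sleep for sleep_ms milliseconds and report both elapsed times. */
static struct interval measure_sleep(long sleep_ms)
{
    struct timespec req = {
        .tv_sec = sleep_ms / 1000,
        .tv_nsec = (sleep_ms % 1000) * 1000000L,
    };
    struct interval iv;
    uint64_t wall0 = clock_ns(CLOCK_MONOTONIC);
    uint64_t cpu0 = clock_ns(CLOCK_THREAD_CPUTIME_ID);

    /* Restart the sleep with the remaining time if a signal interrupts it. */
    while (nanosleep(&req, &req) != 0)
        ;

    iv.wall_ns = clock_ns(CLOCK_MONOTONIC) - wall0;
    iv.cpu_ns = clock_ns(CLOCK_THREAD_CPUTIME_ID) - cpu0;
    return iv;
}
```

Calling measure_sleep(50) should report at least 50 ms of wall time
and only a small amount of CPU time, which is the kind of gap a
jiffies-based counter cannot see.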

> consideration, we finally chose to use jiffies.
> Of course, if you think that a little more performance loss
> here has no impact, I can use task run time instead, but in
> this case, does the way of calculating sqpoll thread timeout
> also need to be changed, because it is also calculated through
> jiffies.

That's a good point. It doesn't have to change unless you're
directly inferring the idle time parameter from those two
time values rather than using the ratio. E.g. a simple
bisection of the idle time based on the utilisation metric
shouldn't change. But that definitely raises the question of
what exactly the idle_time parameter should mean, and what is
more convenient for algorithms.
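
A minimal sketch of the bisection I have in mind, under loud
assumptions: utilisation is a unitless ratio in [0, 1], it does not
increase as the idle time grows, and the lower bound already meets
the target. bisect_idle_ms(), util_fn and toy_utilisation() are
hypothetical names; the toy model stands in for a real measurement.
Because only the ratio is compared against the target, it does not
matter which clock produced the two time values behind it.

```c
#include <assert.h>

/* Hypothetical callback: measured utilisation (0..1) for a given idle time. */
typedef double (*util_fn)(unsigned idle_ms, void *ctx);

/*
 * Return the largest idle time in [lo, hi] whose utilisation is still
 * >= target. Assumes measure() is non-increasing in idle_ms and that
 * measure(lo) >= target.
 */
static unsigned bisect_idle_ms(util_fn measure, void *ctx,
                               unsigned lo, unsigned hi, double target)
{
    assert(lo <= hi);
    while (lo < hi) {
        /* Round up so the loop always makes progress when lo = mid. */
        unsigned mid = lo + (hi - lo + 1) / 2;

        if (measure(mid, ctx) >= target)
            lo = mid;       /* still busy enough, try a longer idle time */
        else
            hi = mid - 1;   /* too much idle spinning, shorten it */
    }
    return lo;
}

/* Toy model only: busy time is fixed and idle spinning is pure overhead. */
static double toy_utilisation(unsigned idle_ms, void *ctx)
{
    double busy_ms = *(double *)ctx;

    return busy_ms / (busy_ms + (double)idle_ms);
}
```

With a busy time of 10 ms in the toy model and a target of 0.5, the
search over [1, 1000] settles on an idle time of 10 ms.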

> Then there’s how to use this metric.
> We are studying some optimization methods for io-uring, including
> performance and CPU utilization, but we found that there is
> currently no tool that can observe the CPU ratio of sqthread's
> actual processing IO part, so we want to merge this method that
> can observe this value so that we can more easily observe the
> optimization effects.

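For comparison, the userspace estimate I asked about earlier could
look roughly like the sketch below. It assumes the application
periodically samples the SQ ring flags word and counts how often
IORING_SQ_NEED_WAKEUP is set. struct sq_sampler and the sq_*()
helpers are hypothetical; the flag value is defined locally so the
snippet builds without the uapi header. Note that it estimates the
fraction of time the thread is awake, which includes idle spinning,
so it is only an upper bound on time spent actually processing IO.

```c
#include <assert.h>
#include <stddef.h>

/* Same value as in the io_uring uapi header; defined here to stay standalone. */
#ifndef IORING_SQ_NEED_WAKEUP
#define IORING_SQ_NEED_WAKEUP (1U << 0)
#endif

/* Hypothetical sampling state kept by the application. */
struct sq_sampler {
    unsigned long samples;  /* number of times the flags were sampled */
    unsigned long asleep;   /* samples that saw NEED_WAKEUP set */
};

/* Take one sample of the SQ ring flags word supplied by the caller. */
static void sq_sample(struct sq_sampler *s, const unsigned *sq_flags)
{
    /* Volatile read so the compiler reloads the shared word every time. */
    unsigned flags = *(const volatile unsigned *)sq_flags;

    s->samples++;
    if (flags & IORING_SQ_NEED_WAKEUP)
        s->asleep++;
}

/* Fraction of samples in which the SQ thread was not asleep. */
static double sq_awake_fraction(const struct sq_sampler *s)
{
    assert(s != NULL);
    if (s->samples == 0)
        return 0.0;
    return 1.0 - (double)s->asleep / (double)s->samples;
}

/* Start a new window, e.g. after the workload changes. */
static void sq_reset(struct sq_sampler *s)
{
    s->samples = 0;
    s->asleep = 0;
}
```

The reset helper is there because a window that can be restarted
avoids the problem of stats aggregated over the whole lifetime of
the ring.
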
-- 
Pavel Begunkov