Message-ID: <c9505525-54d9-4610-a47a-5f8d2d3f8de6@gmail.com>
Date: Fri, 5 Jan 2024 04:02:38 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: Xiaobing Li <xiaobing.li@...sung.com>, axboe@...nel.dk
Cc: linux-kernel@...r.kernel.org, io-uring@...r.kernel.org,
kun.dou@...sung.com, peiwei.li@...sung.com, joshi.k@...sung.com,
kundan.kumar@...sung.com, wenwen.chen@...sung.com, ruyi.zhang@...sung.com
Subject: Re: [PATCH v6] io_uring: Statistics of the true utilization of sq threads.
On 1/3/24 05:49, Xiaobing Li wrote:
> On 12/30/23 9:27 AM, Pavel Begunkov wrote:
>> Why it uses jiffies instead of some task run time?
>> Consequently, why it's fine to account irq time and other
>> preemption? (hint, it's not)
>>
>> Why it can't be done with userspace and/or bpf? Why
>> can't it be estimated by checking and tracking
>> IORING_SQ_NEED_WAKEUP in userspace?
>>
>> What's the use case in particular? Considering that
>> one of the previous revisions was uapi-less, something
>> is really fishy here. Again, it's a procfs file nobody
>> but a few would want to parse to use the feature.
>>
>> Why it just keeps aggregating stats for the whole
>> life time of the ring? If the workload changes,
>> that would either totally screw the stats or would make
>> it too inert to be useful. That's especially relevant
>> for long running (days) processes. There should be a
>> way to reset it so it starts counting anew.
>
> Hi, Jens and Pavel,
> I carefully read the questions you raised.
> First of all, as to why I use jiffies for the time statistics, it
> is because I have done some performance tests and found that
> using jiffies has a relatively smaller loss of performance
> than using task run time. Of course, using task run time is
What does taking a task runtime measurement look like? I'd expect it
to be a simple read of a variable inside task_struct, maybe with
READ_ONCE, in which case the overhead shouldn't be realistically
measurable. Does it need locking?
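
For illustration only, here is a userspace analogue of the distinction
(not the kernel code path under discussion): it compares elapsed wall
time against the CPU time charged to the calling thread. The helper
names (clock_ns, sample_begin, sample_elapsed) are made up for this
sketch; clock_gettime() and the two clock ids are standard POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read the given clock and return its value in nanoseconds. */
static uint64_t clock_ns(clockid_t id)
{
	struct timespec ts;
	int ret = clock_gettime(id, &ts);

	assert(ret == 0);
	(void)ret;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* A pair of readings: wall clock and per-thread CPU time. */
struct busy_sample {
	uint64_t wall_ns;
	uint64_t cpu_ns;
};

/* Take a starting snapshot of both clocks. */
static struct busy_sample sample_begin(void)
{
	struct busy_sample s;

	s.wall_ns = clock_ns(CLOCK_MONOTONIC);
	s.cpu_ns = clock_ns(CLOCK_THREAD_CPUTIME_ID);
	return s;
}

/* Return how much wall time and thread CPU time passed since 'start'. */
static struct busy_sample sample_elapsed(struct busy_sample start)
{
	struct busy_sample d;

	d.wall_ns = clock_ns(CLOCK_MONOTONIC) - start.wall_ns;
	d.cpu_ns = clock_ns(CLOCK_THREAD_CPUTIME_ID) - start.cpu_ns;
	return d;
}
```

Time spent sleeping or preempted shows up in wall_ns but not in
cpu_ns, which is the gap a wall-clock (jiffies-style) measurement
cannot see.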
> indeed more accurate. But in fact, our requirements for
> accuracy are not particularly high, so after comprehensive
I'm looking at it as a generic feature for everyone, and the
accuracy depends on circumstances. High-load networking spends
a good share of CPU in softirq, and preemption depends on
config, scheduling, pinning, etc.
> consideration, we finally chose to use jiffies.
> Of course, if you think that a little more performance loss
> here has no impact, I can use task run time instead, but in
> this case, does the way of calculating sqpoll thread timeout
> also need to be changed, because it is also calculated through
> jiffies.
That's a good point. It doesn't have to change unless you're
directly inferring the idle time parameter from those two
time values rather than using the ratio. E.g. a simple
bisection of the idle time based on the utilisation metric
shouldn't change. But that definitely raises the question
of what exactly the idle_time parameter should mean, and
which definition is more convenient for algorithms.
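
A minimal sketch of such a bisection, to show why only the ratio
matters. All names here (idle_bisect, idle_bisect_step, target_pct) are
hypothetical, and the sketch assumes that a longer idle time lowers the
measured utilisation, which may not hold for every workload.

```c
#include <assert.h>
#include <stdint.h>

/* Current search bounds for the idle time, in milliseconds. */
struct idle_bisect {
	uint32_t lo_ms;
	uint32_t hi_ms;
};

/* Midpoint of the current bounds: the idle time being tried. */
static uint32_t idle_bisect_mid(const struct idle_bisect *b)
{
	return b->lo_ms + (b->hi_ms - b->lo_ms) / 2;
}

/*
 * One bisection step. 'busy' and 'total' may be in any unit (jiffies,
 * nanoseconds of task runtime, ...) as long as both use the same one,
 * because only their ratio is used. Returns the next idle time to try.
 */
static uint32_t idle_bisect_step(struct idle_bisect *b, uint64_t busy,
				 uint64_t total, uint32_t target_pct)
{
	uint32_t cur = idle_bisect_mid(b);
	uint64_t util_pct;

	assert(b->lo_ms <= b->hi_ms);
	assert(total > 0 && busy <= total);

	util_pct = busy * 100 / total;
	if (util_pct < target_pct)
		b->hi_ms = cur;	/* mostly idle spinning: try a shorter idle time */
	else
		b->lo_ms = cur;	/* busy enough: a longer idle time is affordable */

	return idle_bisect_mid(b);
}
```

Switching the measurement from jiffies to task run time changes the
units of 'busy' and 'total' but leaves the step unchanged.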
> Then there’s how to use this metric.
> We are studying some optimization methods for io-uring, including
> performance and CPU utilization, but we found that there is
> currently no tool that can observe the CPU ratio of sqthread's
> actual processing IO part, so we want to merge this method that
> can observe this value so that we can more easily observe the
> optimization effects.
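
The userspace estimate raised earlier in the thread (checking and
tracking IORING_SQ_NEED_WAKEUP) could be sketched roughly as below.
This is an assumption-laden illustration, not a tested tool: the
wakeup_stats names are hypothetical, the flag value is defined locally
to mirror IORING_SQ_NEED_WAKEUP from <linux/io_uring.h>, and sq_flags
is assumed to point at the SQ ring flags word (the sq_off.flags offset
in the mmapped ring).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Local mirror of IORING_SQ_NEED_WAKEUP (1U << 0 in the uapi header). */
#define SQ_NEED_WAKEUP	(1U << 0)

/* Counters for periodic samples of the SQ ring flags. */
struct wakeup_stats {
	uint64_t samples;	/* total samples taken */
	uint64_t asleep;	/* samples that saw NEED_WAKEUP set */
};

/* Take one sample; meant to be called at a fixed interval. */
static void wakeup_stats_sample(struct wakeup_stats *st,
				_Atomic uint32_t *sq_flags)
{
	uint32_t flags = atomic_load_explicit(sq_flags, memory_order_acquire);

	st->samples++;
	if (flags & SQ_NEED_WAKEUP)
		st->asleep++;
}

/* Percentage of samples in which the SQ thread was not asleep. */
static unsigned int wakeup_stats_awake_pct(const struct wakeup_stats *st)
{
	assert(st->samples > 0);
	return (unsigned int)((st->samples - st->asleep) * 100 / st->samples);
}

/* Start counting anew, e.g. when the workload changes. */
static void wakeup_stats_reset(struct wakeup_stats *st)
{
	st->samples = 0;
	st->asleep = 0;
}
```

The awake percentage also counts the time the thread spends spinning
idle before it goes to sleep, so it is at best an upper bound on the
share of time spent actually processing IO.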
--
Pavel Begunkov