Message-ID: <dc861c95-3150-03c7-4ecb-d86c53f7d8b3@linux.alibaba.com>
Date: Tue, 1 Mar 2022 11:53:39 +0800
From: Hao Xu <haoxu@...ux.alibaba.com>
To: Olivier Langlois <olivier@...llion01.com>,
Jens Axboe <axboe@...nel.dk>
Cc: Pavel Begunkov <asml.silence@...il.com>,
io-uring <io-uring@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1] io_uring: Add support for napi_busy_poll
On 2022/3/1 5:20 AM, Olivier Langlois wrote:
> On Tue, 2022-03-01 at 02:34 +0800, Hao Xu wrote:
>>
>> On 2/25/22 23:32, Olivier Langlois wrote:
>>> On Fri, 2022-02-25 at 00:32 -0500, Olivier Langlois wrote:
>>>>>> +#ifdef CONFIG_NET_RX_BUSY_POLL
>>>>>> +static void io_adjust_busy_loop_timeout(struct timespec64 *ts,
>>>>>> +                                        struct io_wait_queue *iowq)
>>>>>> +{
>>>>>> +       unsigned busy_poll_to = READ_ONCE(sysctl_net_busy_poll);
>>>>>> +       struct timespec64 pollto = ns_to_timespec64(1000 * (s64)busy_poll_to);
>>>>>> +
>>>>>> +       if (timespec64_compare(ts, &pollto) > 0) {
>>>>>> +               *ts = timespec64_sub(*ts, pollto);
>>>>>> +               iowq->busy_poll_to = busy_poll_to;
>>>>>> +       } else {
>>>>>> +               iowq->busy_poll_to = timespec64_to_ns(ts) / 1000;
>>>>> How about timespec64_to_ns(ts) >> 10, since we don't need an accurate
>>>>> number?
>>>> Fantastic suggestion! The kernel test robot did also detect an issue
>>>> with that statement. I did discover do_div() in the meantime but what
>>>> you suggest is better, IMHO...
>>> After having seen Jens' patch (io_uring: don't convert to jiffies for
>>> waiting on timeouts), I think that I'll stick with do_div().
>>>
>>> I have a hard time considering removing timing accuracy when effort is
>>> being made to make the same function more accurate...
>>
>>
>> I think they are different things. Jens' patch resolves the problem
>> that jiffies cannot represent times shorter than 1ms (when HZ is 1000).
>> For example, if a user asks for 10us and it turns out to be 1ms, that is
>> a big difference. But dividing by 1000 versus 1024 does not make much
>> difference in this case.
>>
>>>
> idk... For every 100uSec slice, dividing by 1024 will introduce a
> ~2.3uSec error. I didn't dig into the question enough to figure out
> whether the error was smaller than the accuracy of the clock used.
>
> but even if the error is small, why let it slip in when a 100%
> accurate value is possible?
>
> Besides making the painfully picky do_div() macro happy on some
> platforms, I fail to understand the problem with doing a division to
> get an accurate value.
>
> let me reverse the question. Even if the bit shifting is a bit faster
> than doing the division, would the code be called often enough to make
> a significant difference?
It's just my personal preference: when a faster way is acceptable, I
choose that one. For this one, do_div() should be OK since that code is
not hot in most cases. But it all depends on your test results.
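
To make the trade-off concrete, here is a minimal userspace sketch of the
two conversions being compared. The helper names are hypothetical and not
part of the patch; it only illustrates that shifting by 10 divides by 1024
rather than 1000, so the result comes out about 2.3% low (1 - 1000/1024).

```c
#include <stdint.h>

/* Hypothetical helper: exact nanoseconds -> microseconds conversion. */
static uint64_t ns_to_us_exact(uint64_t ns)
{
	return ns / 1000;
}

/*
 * Hypothetical helper: approximate conversion using a shift.
 * ns >> 10 divides by 1024, not 1000, so it under-reports by ~2.3%.
 */
static uint64_t ns_to_us_shift(uint64_t ns)
{
	return ns >> 10;
}

/* Difference between the exact and the shifted result, in microseconds. */
static uint64_t ns_to_us_shift_error(uint64_t ns)
{
	return ns_to_us_exact(ns) - ns_to_us_shift(ns);
}
```

For a 100 usec remainder (100000 ns), the exact helper gives 100 and the
shifted one gives 97 after integer truncation. Note that this sketch uses a
plain 64-bit division, which is fine in userspace; in the kernel that
division would go through do_div() or a similar helper, as discussed above.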
Regards,
Hao