Message-ID: <dc861c95-3150-03c7-4ecb-d86c53f7d8b3@linux.alibaba.com>
Date:   Tue, 1 Mar 2022 11:53:39 +0800
From:   Hao Xu <haoxu@...ux.alibaba.com>
To:     Olivier Langlois <olivier@...llion01.com>,
        Jens Axboe <axboe@...nel.dk>
Cc:     Pavel Begunkov <asml.silence@...il.com>,
        io-uring <io-uring@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1] io_uring: Add support for napi_busy_poll

On 2022/3/1 at 5:20 AM, Olivier Langlois wrote:
> On Tue, 2022-03-01 at 02:34 +0800, Hao Xu wrote:
>>
>> On 2/25/22 23:32, Olivier Langlois wrote:
>>> On Fri, 2022-02-25 at 00:32 -0500, Olivier Langlois wrote:
>>>>>> +#ifdef CONFIG_NET_RX_BUSY_POLL
>>>>>> +static void io_adjust_busy_loop_timeout(struct timespec64 *ts,
>>>>>> +                                       struct io_wait_queue *iowq)
>>>>>> +{
>>>>>> +       unsigned busy_poll_to = READ_ONCE(sysctl_net_busy_poll);
>>>>>> +       struct timespec64 pollto = ns_to_timespec64(1000 * (s64)busy_poll_to);
>>>>>> +
>>>>>> +       if (timespec64_compare(ts, &pollto) > 0) {
>>>>>> +               *ts = timespec64_sub(*ts, pollto);
>>>>>> +               iowq->busy_poll_to = busy_poll_to;
>>>>>> +       } else {
>>>>>> +               iowq->busy_poll_to = timespec64_to_ns(ts) / 1000;
>>>>> How about timespec64_to_ns(ts) >> 10, since we don't need an
>>>>> accurate number?
>>>> Fantastic suggestion! The kernel test robot also detected an issue
>>>> with that statement. I discovered do_div() in the meantime, but what
>>>> you suggest is better, IMHO...
>>> After having seen Jens' patch (io_uring: don't convert to jiffies for
>>> waiting on timeouts), I think that I'll stick with do_div().
>>>
>>> I have a hard time considering removing timing accuracy when effort
>>> is made to make the same function more accurate...
>>
>>
>> I think they are different things. Jens' patch resolves the problem
>> that jiffies cannot represent times shorter than 1ms (when HZ is
>> 1000). For example, if a user asks for 10us and it turns out to be
>> 1ms, that is a big difference. But dividing by 1000 or by 1024 does
>> not make much difference in this case.
>>
>>>
> idk... For every 100uSec slice, dividing by 1024 will introduce a
> ~2.4uSec error. I didn't dig into the question enough to figure out
> if the error is smaller than the accuracy of the clock being used.
>
> But even if the error is small, why let it slip in when a 100%
> accurate value is possible?
>
> Besides making the painfully picky do_div() macro happy on some
> platforms, I fail to understand the problem with doing a division to
> get an accurate value.
>
> Let me reverse the question. Even if the bit shifting is a bit faster
> than doing the division, would the code be called often enough to make
> a significant difference?
It's just my personal preference: when a faster way is acceptable, I
just choose that one. For this one, do_div() should be OK since that
code is not hot in most cases. But it all depends on your test results.

Regards,
Hao