Message-ID: <0f63b072-840c-db5d-13cd-7faa554975d3@gmail.com>
Date:   Mon, 24 Jul 2023 20:22:28 +0100
From:   Pavel Begunkov <asml.silence@...il.com>
To:     Jens Axboe <axboe@...nel.dk>, Greg KH <gregkh@...uxfoundation.org>,
        Phil Elwell <phil@...pberrypi.com>
Cc:     andres@...razel.de, david@...morbit.com, hch@....de,
        io-uring@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
        linux-xfs@...r.kernel.org, stable <stable@...r.kernel.org>
Subject: Re: [PATCH] io_uring: Use io_schedule* in cqring wait

On 7/24/23 16:58, Jens Axboe wrote:
> On 7/24/23 9:50 AM, Jens Axboe wrote:
>> On 7/24/23 9:48 AM, Greg KH wrote:
>>> On Mon, Jul 24, 2023 at 04:35:43PM +0100, Phil Elwell wrote:
>>>> Hi Andres,
>>>>
>>>> With this commit applied to the 6.1 and later kernels (others not
>>>> tested) the iowait time ("wa" field in top) in an ARM64 build running
>>>> on a 4 core CPU (a Raspberry Pi 4 B) increases to 25%, as if one core
>>>> is permanently blocked on I/O. The change can be observed after
>>>> installing mariadb-server (no configuration or use is required). After
>>>> reverting just this commit, "wa" drops to zero again.
>>>
>>> This has been discussed already:
>>> 	https://lore.kernel.org/r/12251678.O9o76ZdvQC@natalenko.name
>>>
>>> It's not a bug, mariadb does have pending I/O, so the report is correct,
>>> but the CPU isn't blocked at all.
>>
>> Indeed - only thing I can think of is perhaps mariadb is having a
>> separate thread waiting on the ring in perpetuity, regardless of whether
>> or not it currently has IO.
>>
>> But yes, this is very much ado about nothing...
> 
> Current -git and having mariadb idle:
> 
> Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> Average:     all    0.00    0.00    0.04   12.47    0.04    0.00    0.00    0.00    0.00   87.44
> Average:       0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> Average:       1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> Average:       2    0.00    0.00    0.00    0.00    0.33    0.00    0.00    0.00    0.00   99.67
> Average:       3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> Average:       4    0.00    0.00    0.33    0.00    0.00    0.00    0.00    0.00    0.00   99.67
> Average:       5    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> Average:       6    0.00    0.00    0.00  100.00    0.00    0.00    0.00    0.00    0.00    0.00
> Average:       7    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 
> which is showing 100% iowait on one cpu, as mariadb has a thread waiting
> on IO. That is obviously a valid use case, if you split submission and
> completion into separate threads. Then you have the latter just always
> waiting on something to process.
> 
> With the suggested patch, we do eliminate that case and the iowait on
> that task is gone. Here's current -git with the patch and mariadb also
> running:
> 
> 09:53:49 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 09:53:50 AM  all    0.00    0.00    0.00    0.00    0.00    0.75    0.00    0.00    0.00   99.25
> 09:53:50 AM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 09:53:50 AM    1    0.00    0.00    0.00    0.00    0.00    1.00    0.00    0.00    0.00   99.00
> 09:53:50 AM    2    0.00    0.00    0.00    0.00    0.00    1.00    0.00    0.00    0.00   99.00
> 09:53:50 AM    3    0.00    0.00    0.00    0.00    0.00    1.00    0.00    0.00    0.00   99.00
> 09:53:50 AM    4    0.00    0.00    0.00    0.00    0.00    0.99    0.00    0.00    0.00   99.01
> 09:53:50 AM    5    0.00    0.00    0.00    0.00    0.00    1.00    0.00    0.00    0.00   99.00
> 09:53:50 AM    6    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 09:53:50 AM    7    0.00    0.00    0.00    0.00    0.00    1.00    0.00    0.00    0.00   99.00
> 
> 
> Even though I don't think this is an actual problem, it is a bit
> confusing that you get 100% iowait while waiting without having IO
> pending. So I do think the suggested patch is probably worthwhile
> pursuing. I'll post it and hopefully have Andres test it too, if he's
> available.

Emmm, what's the definition of the "IO" state? Unless we can say exactly what
it is, there will be no end to adjustments, because I can easily argue that
CQ waiting by itself is IO.
Do we consider sleep(N) to be "IO"? I don't think the kernel uses
io_schedule*() around that, so it'd be different from io_uring waiting for
a timeout request. What about epoll waiting, etc.?
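
For reference, the numbers quoted above are consistent with a single task
sitting in an iowait sleep: one CPU at 100% iowait out of 8 averages to
about 12.5% in the "all" row, and one out of 4 gives the 25% "wa" Phil
reported. Below is a minimal Python sketch of how such percentages can be
derived from two /proc/stat snapshots. The helper names and the synthetic
snapshots are made up for illustration, and this is not necessarily how
top or mpstat compute their figures.

```python
# Sketch only: derive per-CPU iowait share from two /proc/stat snapshots.
# Function names and the sample data below are hypothetical.

IOWAIT = 4  # counter order: user, nice, system, idle, iowait, irq, softirq, steal, ...


def parse_proc_stat(text):
    """Return {label: [counters]} for every line starting with 'cpu'."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("cpu"):
            stats[fields[0]] = [int(v) for v in fields[1:]]
    return stats


def iowait_percent(before, after):
    """Percentage of elapsed ticks accounted as iowait, per CPU label."""
    result = {}
    for cpu, new in after.items():
        deltas = [n - o for n, o in zip(new, before[cpu])]
        # Sum the first 8 counters only; guest time is already part of user/nice.
        total = sum(deltas[:8])
        result[cpu] = 100.0 * deltas[IOWAIT] / total if total else 0.0
    return result


# Synthetic 4-CPU example: cpu3 spends the whole interval in iowait.
BEFORE = "\n".join(
    ["cpu  0 0 0 0 0 0 0 0 0 0"]
    + [f"cpu{i} 0 0 0 0 0 0 0 0 0 0" for i in range(4)]
)
AFTER = "\n".join(
    [
        "cpu  0 0 0 300 100 0 0 0 0 0",
        "cpu0 0 0 0 100 0 0 0 0 0 0",
        "cpu1 0 0 0 100 0 0 0 0 0 0",
        "cpu2 0 0 0 100 0 0 0 0 0 0",
        "cpu3 0 0 0 0 100 0 0 0 0 0",
    ]
)

shares = iowait_percent(parse_proc_stat(BEFORE), parse_proc_stat(AFTER))
print(shares["cpu"], shares["cpu3"])
```

On a real system the two snapshots would be the contents of /proc/stat
read a second or so apart. The point is only that the aggregate figure
says nothing about whether any CPU is actually blocked.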

-- 
Pavel Begunkov
