linux-kernel - Re: stalling IO regression since linux 5.12, through 5.18


Message-ID: <6eece869-5cab-57b6-6f8f-98eaf65a742f@applied-asynchrony.com>
Date:   Wed, 17 Aug 2022 14:31:09 +0200
From:   Holger Hoffstätte <holger@...lied-asynchrony.com>
To:     Chris Murphy <lists@...orremedies.com>,
        Nikolay Borisov <nborisov@...e.com>,
        Jens Axboe <axboe@...nel.dk>, Jan Kara <jack@...e.cz>,
        Paolo Valente <paolo.valente@...aro.org>
Cc:     Linux-RAID <linux-raid@...r.kernel.org>,
        linux-block <linux-block@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Josef Bacik <josef@...icpanda.com>
Subject: Re: stalling IO regression since linux 5.12, through 5.18

On 2022-08-17 13:57, Chris Murphy wrote:
> 
> 
> On Wed, Aug 17, 2022, at 5:52 AM, Holger Hoffstätte wrote:
>> On 2022-08-16 17:34, Chris Murphy wrote:
>>>
>>> On Tue, Aug 16, 2022, at 11:25 AM, Nikolay Borisov wrote:
>>>> How about changing the scheduler either mq-deadline or noop, just
>>>> to see if this is also reproducible with a different scheduler. I
>>>> guess noop would imply the blk cgroup controller is going to be
>>>> disabled
>>>
>>> I already reported on that: always happens with bfq within an hour or
>>> less. Doesn't happen with mq-deadline for ~25+ hours. Does happen
>>> with bfq with the above patches removed. Does happen with
>>> cgroup.disabled=io set.
>>>
>>> Sounds to me like it's something bfq depends on and is somehow
>>> becoming perturbed in a way that mq-deadline does not, and has
>>> changed between 5.11 and 5.12. I have no idea what's under bfq that
>>> matches this description.
>>
>> Chris, just a shot in the dark but can you try the patch from
>>
>> https://lore.kernel.org/linux-block/20220803121504.212071-1-yukuai1@huaweicloud.com/
>>
>> on top of something more recent than 5.12? Ideally 5.19 where it applies
>> cleanly.
> 
> The problem doesn't reliably reproduce on 5.19. A patch for 5.12..5.18 would be much more testable.

If you look at the changes to sbitmap at:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/lib/sbitmap.c

you'll find that they are relatively recent, so Yu Kuai's patch will probably also apply
to 5.18, though I haven't checked. Also look at the most recent commit, which mentions
"Checking free bits when setting the target bits. Otherwise, it may reuse the busying bits."

Reusing the busy bits sounds "not great" either and (AFAIU) may also be a cause for
lost wakeups, but I'm sure Jan and Ming know all that better than me.

Jan's suggestion to disable BFQ cgroup support in particular is probably the easiest
thing to try first. What you're observing may not have a single root cause, and even if
it does, it might not be where we suspect.
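
Since the results in this thread depend on which scheduler was active for each run
(bfq vs. mq-deadline), it may help to record that alongside every test. The following
is only a minimal sketch, not something from the thread: the helper names are made up
for illustration and the device name "sda" is an assumption. It relies on the sysfs file
/sys/block/<dev>/queue/scheduler listing the available schedulers on one line with the
active one in square brackets.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split a sysfs scheduler line into (active, available).

    Example input: "mq-deadline kyber [bfq] none"
    The active scheduler is the token wrapped in square brackets.
    """
    active = None
    available = []
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def active_scheduler(device):
    """Read the active scheduler for a block device such as "sda".

    The device name is supplied by the caller; "sda" is only an example.
    """
    path = Path("/sys/block") / device / "queue" / "scheduler"
    active, _ = parse_scheduler_line(path.read_text())
    return active
```

Calling active_scheduler("sda") before and after each run would make it easy to confirm
that a stall really happened under the scheduler you think it did.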

-h