Message-ID: <07ef7b78-66d4-d3de-4e25-8a889b902e14@stancevic.com>
Date:   Tue, 5 Sep 2023 08:54:25 -0500
From:   Dragan Stancevic <dragan@...ncevic.com>
To:     Yu Kuai <yukuai1@...weicloud.com>, song@...nel.org
Cc:     buczek@...gen.mpg.de, guoqing.jiang@...ux.dev,
        it+raid@...gen.mpg.de, linux-kernel@...r.kernel.org,
        linux-raid@...r.kernel.org, msmith626@...il.com,
        "yangerkun@...wei.com" <yangerkun@...wei.com>
Subject: Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle"
 transition

On 9/4/23 22:50, Yu Kuai wrote:
> Hi,
> 
> On 2023/08/30 9:36, Yu Kuai wrote:
>> Hi,
>>
>> On 2023/08/29 4:32, Dragan Stancevic wrote:
>>
>>> Just a followup on 6.1 testing. I tried reproducing this problem for 
>>> 5 days with 6.1.42 kernel without your patches and I was not able to 
>>> reproduce it.
> 
> Oops, I forgot that you need to backport this patch first to reproduce
> this problem:
> 
> https://lore.kernel.org/all/20230529132037.2124527-2-yukuai1@huaweicloud.com/
> 
> The patch fixes the deadlock as well, but it introduces some regressions.

Ha, jinx :) I was about to email you that, with the testing over the 
weekend, I isolated that change as the one that made it more difficult 
to reproduce in 6.1, and that the original change must be reverted :)



> 
> Thanks,
> Kuai
> 
>>>
>>> It seems that 6.1 has some other code that prevents this from happening.
>>>
>>
>> I see that there are lots of patches for raid456 between 5.10 and 6.1.
>> However, I remember that I was able to reproduce the deadlock after 6.1,
>> and it's true that it's not easy to reproduce, see below:
>>
>> https://lore.kernel.org/linux-raid/e9067438-d713-f5f3-0d3d-9e6b0e9efa0e@huaweicloud.com/
>>
>> My guess is that the problem is harder to reproduce on 6.1 than on 5.10
>> due to some changes inside raid456.
>>
>> By the way, raid10 had a similar deadlock, and it can be fixed the same
>> way, so it makes sense to backport these patches.
>>
>> https://lore.kernel.org/r/20230529132037.2124527-5-yukuai1@huaweicloud.com
>>
>> Thanks,
>> Kuai
>>
>>
>>> On 5.10 I can reproduce it within minutes to an hour.
>>>
>>
>> .
>>
> 

-- 
Peace can only come as a natural consequence
of universal enlightenment -Dr. Nikola Tesla
