Date: Wed, 22 May 2024 12:13:12 +0200
From: Marcin Wanat <private@...cinwanat.pl>
To: Zhaoyang Huang <huangzhaoyang@...il.com>
Cc: Dave Chinner <david@...morbit.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 "zhaoyang.huang" <zhaoyang.huang@...soc.com>, Alex Shi <alexs@...nel.org>,
 "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
 Hugh Dickins <hughd@...gle.com>, Baolin Wang
 <baolin.wang@...ux.alibaba.com>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, steve.kang@...soc.com
Subject: Re: [PATCH 1/1] mm: protect xa split stuff under lruvec->lru_lock
 during migration

On 22.05.2024 07:37, Zhaoyang Huang wrote:
> On Tue, May 21, 2024 at 11:47 PM Marcin Wanat <private@...cinwanat.pl> wrote:
>>
>> On 21.05.2024 03:00, Zhaoyang Huang wrote:
>>> On Tue, May 21, 2024 at 8:58 AM Zhaoyang Huang <huangzhaoyang@...il.com> wrote:
>>>>
>>>> On Tue, May 21, 2024 at 3:42 AM Marcin Wanat <private@...cinwanat.pl> wrote:
>>>>>
>>>>> On 15.04.2024 03:50, Zhaoyang Huang wrote:
>>>>> I have around 50 hosts handling high I/O (each with 20Gbps+ uplinks
>>>>> and multiple NVMe drives), running RockyLinux 8/9. The stock RHEL
>>>>> kernel 8/9 is NOT affected, and the long-term kernel 5.15.X is NOT affected.
>>>>> However, with long-term kernels 6.1.XX and 6.6.XX,
>>>>> (tested at least 10 different versions), this lockup always appears
>>>>> after 2-30 days, similar to the report in the original thread.
>>>>> The more load (for example, copying a lot of local files while
>>>>> serving 20Gbps traffic), the higher the chance that the bug will appear.
>>>>>
>>>>> I haven't been able to reproduce this during synthetic tests,
>>>>> but it always occurs in production on 6.1.X and 6.6.X within 2-30 days.
>>>>> If anyone can provide a patch, I can test it on multiple machines
>>>>> over the next few days.
>>>> Could you please try this one which could be applied on 6.6 directly. Thank you!
>>> URL: https://lore.kernel.org/linux-mm/20240412064353.133497-1-zhaoyang.huang@unisoc.com/
>>>
>>
>> Unfortunately, I am unable to cleanly apply this patch against the
>> latest 6.6.31.
> Please try below one which works on my v6.6 based android. Thank you
> for your test in advance :D
> 
> mm/huge_memory.c | 22 ++++++++++++++--------
>   1 file changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c

I have compiled 6.6.31 with this patch and will test it on multiple
machines over the next 30 days. I will post an update after 30 days
if everything is fine, or sooner if any of the hosts hits the same
soft lockup again.
