Message-Id: <EE83D424-A546-410D-B5ED-6E9631746ACF@gmail.com>
Date: Thu, 29 Aug 2024 17:50:50 +0200
From: Piotr Oniszczuk <piotr.oniszczuk@...il.com>
To: Yosry Ahmed <yosryahmed@...gle.com>
Cc: Pedro Falcato <pedro.falcato@...il.com>,
 Nhat Pham <nphamcs@...il.com>,
 Matthew Wilcox <willy@...radead.org>,
 Linux regressions mailing list <regressions@...ts.linux.dev>,
 LKML <linux-kernel@...r.kernel.org>,
 Johannes Weiner <hannes@...xchg.org>,
 Linux-MM <linux-mm@...ck.org>
Subject: Re: [regression] oops on heavy compilations ("kernel BUG at
 mm/zswap.c:1005!" and "Oops: invalid opcode: 0000")



> On 27.08.2024, at 20:48, Yosry Ahmed <yosryahmed@...gle.com> wrote:
> 
> On Sun, Aug 25, 2024 at 9:24 AM Piotr Oniszczuk
> <piotr.oniszczuk@...il.com> wrote:
>> 
>> 
>> 
>>> On 25.08.2024, at 17:05, Pedro Falcato <pedro.falcato@...il.com> wrote:
>>> 
>>> Also, could you try a memtest86 on your machine, to shake out potential hardware problems?
>> 
>> 
>> I found a less time-consuming way to trigger the issue: a 12c/24t cross-compile of llvm with "only 16G" of RAM, as this triggers a lot of heavy swapping (peak swap usage reaches 8-9G of the 16G swap partition).
>> 
>> With this setup, on 6.9.12, the system becomes unavailable (due to a CPU soft lockup) within just 1..3h
>> (usually in the first or second compile iteration; I wrote a simple script that compiles in a loop and counts iterations)
> 
> Are we sure that the soft lockup problem is related to the originally
> reported problem? It seems like in v6.10 you hit a BUG in zswap
> (corruption?), and in v6.9 you hit a soft lockup with a zswap lock
> showing up in the splat. Not sure how they are relevant.

If so, then I interpret this as either:

a\ 2 different bugs

or

b\ the 6.10 issue is a result of the 6.9 bug

In that case I think we could:

1. fix 6.9 first (= get it stable for, let's say, 30h of continuous compilation)
2. apply the fix to 6.10, then test stability on 6.10

> 
> Is the soft lockup reproducible in v6.10 as well?
> 
> Since you have a narrow window (6.8.2 to 6.9) and a reproducer for the
> soft lockup problem, can you try bisecting?
> 
> Thanks!



Could you please help me reduce the amount of work here?

1. By narrowing the number of bisect iterations?
On my side each iteration is:
- build the arch package
- install it on the builder
- compile until the first hang (probably 2..3h for a bad kernel) or for 20h (for a good one)
This means days, and I'm a bit short on time as all of this is my hobby (so it competes with the rest of my life...)

or

2. Ideally, I would get a list of 6.9 commits that are candidates for reverting (starting with the most probable failing commit);
I'll revert them and test.
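
As a rough illustration of why option 1 means days: bisection needs about log2(N) test kernels for N candidate commits. The sketch below only does that arithmetic. The commit counts and the 1h build/install overhead are placeholder assumptions, not measured values; only the 2..3h (bad) and 20h (good) test times come from the description above.

```python
def bisect_steps(n_commits: int) -> int:
    """Approximate number of kernels to test: ceil(log2(n_commits))."""
    if n_commits <= 1:
        return 0
    return (n_commits - 1).bit_length()


def bisect_hours(n_commits: int,
                 hours_bad: float = 3.0,    # time until first hang on a bad kernel
                 hours_good: float = 20.0,  # soak time before calling a kernel good
                 overhead: float = 1.0):    # assumed build + install time per step
    """Return (best_case, worst_case) total hours for the whole bisect."""
    steps = bisect_steps(n_commits)
    best = steps * (hours_bad + overhead)    # every step hangs quickly
    worst = steps * (hours_good + overhead)  # every step runs the full soak
    return best, worst


if __name__ == "__main__":
    # Hypothetical range sizes, chosen only to show how the cost scales.
    for n in (50, 1000, 15000):
        best, worst = bisect_hours(n)
        print(f"{n} commits: {bisect_steps(n)} steps, {best:.0f}h to {worst:.0f}h")
```

Each halving of the candidate range saves only one step, so narrowing helps most when it cuts the range by a large factor. One way to do that, assuming the regression lives under mm/, is to give git bisect a pathspec: `git bisect start <bad> <good> -- mm/`.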

I'd really appreciate help here.

