Message-ID: <alpine.DEB.2.21.2511121046350.25436@angie.orcam.me.uk>
Date: Wed, 12 Nov 2025 12:16:28 +0000 (GMT)
From: "Maciej W. Rozycki" <macro@...am.me.uk>
To: Thomas Bogendoerfer <tsbogend@...ha.franken.de>
cc: Nick Bowler <nbowler@...conx.ca>, Jiaxun Yang <jiaxun.yang@...goat.com>, 
    linux-mips@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] MIPS: mm: Prevent a TLB shutdown on initial
 uniquification

On Wed, 12 Nov 2025, Thomas Bogendoerfer wrote:

> >  Can you try the diagnostic patch below, which is what I used to verify 
> > this change, and report the entries produced?  Otherwise I wonder whether 
> > I haven't missed a barrier somewhere.
> 
> Update on the issue: Your patch is good and the segmentation faults
> I'm seeing have IMHO a different cause. Instead of removing the call
> to r4k_tlb_uniquify() I've replaced the jal in the binary with a nop.
> And the issue is still there with this patched kernel. I've seen
> something similar on R12k Octanes, which comes and goes, probably
> depending on code layout. So far I wasn't able to nail this down :-(

 Oh dear!  Something to do with the cache?  Or code alignment perhaps?

 It reminds me of this stuff: 
<https://lore.kernel.org/r/Pine.GSO.3.96.1010625125007.20469D-100000@delta.ds2.pg.gda.pl/>.  

 Building a particular version of binutils froze the machine solid ~11h 
into the build -- a power cycle was required (there's no hardware reset 
button).  At least it was fully reproducible, always at the same place 
in a `configure' script, and changing the shell script in a trivial way, 
such as adding a new-line character, ahead of the place of the lock-up 
made the freeze go away.

 I used the machine's 8-position diagnostic LED display to debug this, by
making it show the syscall and hardware interrupt numbers as the exception 
handlers were entered, so as to narrow the origin down (only to realise 
later on I could have used a 1MiB NVRAM module the system has to store 
more data across a power cycle and retrieve it afterwards, a persistent 
kernel log of sorts).  IIRC it triggered in the exit(2) path.

 The most painful part was the need to wait said ~11h for the next piece 
of data in debugging this.

 NB the machine in question is still alive in my lab.  Throwing memory SBE 
ECC errors again recently, but coping regardless, so more memory connector 
cleaning required upon next visit.

> Do you want to send a v2 of the patch ? I'm fine with the current version
> for applying...

 I'll send v2 with an update for the Wired register as we discussed.  It 
may take a day or two.

  Maciej
