linux-kernel - [bisected] ext4 corruption on parisc since 6.12

Message-ID: <84d7b3e1053b2a8397bcc7fc8eee8106@matoro.tk>
Date: Sun, 01 Dec 2024 19:26:52 -0500
From: matoro <matoro_mailinglist_kernel@...oro.tk>
To: Linux Parisc <linux-parisc@...r.kernel.org>, John David Anglin
 <dave@...isc-linux.org>, John David Anglin <dave.anglin@...l.net>,
 deller@...nel.org, Deller <deller@....de>, linmag7@...il.com, Sam James
 <sam@...too.org>, Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: [bisected] ext4 corruption on parisc since 6.12

Hi Helge, when booting 6.12, another user (CC'd) and I both observed our 
ext4 filesystems to be immediately corrupted in the same manner.

Every file that is read or written will have its access/modify times set to 
2446-05-10 18:38:55.0000, which is the maximum ext4 timestamp.  The 32-bit 
userspace doesn't seem to be able to handle this at all, as every further 
stat() call will error with "Value too large for defined data type".  
Unfortunately, simply rolling back to kernel 6.11 is insufficient to recover, 
as the filesystem corruption is persistent, and the errors come from 
userspace attempting to read the modified files.  I was able to recover with 
a command like:

  find / -newermt 2446-01-01 -o -newerct 2446-01-01 -o -newerat 2446-01-01 | xargs touch -h
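
For reference, here is a minimal Python sketch of why that date is the ext4 
maximum and why a 32-bit time_t cannot hold it.  It assumes the extended 
on-disk timestamp layout (a signed 32-bit seconds field plus 2 extra epoch 
bits); the function names are made up for illustration and are not kernel 
or library APIs.  The result is in UTC (22:38:55); the 18:38:55 quoted above 
would match a UTC-4 local display, which is my assumption.

```python
from datetime import datetime, timedelta, timezone

# Assumption: ext4 extended timestamps keep seconds as a signed 32-bit
# value and widen the range with 2 extra "epoch" bits.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1
EPOCH_BITS = 2


def ext4_max_seconds() -> int:
    """Largest seconds-since-epoch value under the assumed layout."""
    return INT32_MAX + ((2**EPOCH_BITS - 1) << 32)


def fits_in_time32(seconds: int) -> bool:
    """True if the value is representable in a signed 32-bit time_t."""
    return INT32_MIN <= seconds <= INT32_MAX


def to_utc(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch to an aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=seconds)


max_s = ext4_max_seconds()
print(max_s)                      # 15032385535
print(to_utc(max_s).isoformat())  # 2446-05-10T22:38:55+00:00
print(fits_in_time32(max_s))      # False
```

A stat() variant that returns a 32-bit time_t cannot represent this value, 
which is consistent with the EOVERFLOW text ("Value too large for defined 
data type") seen here.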

Luckily, lindholm was able to bisect this and identified the culprit as 
commit b5ff52be891347f8847872c49d7a5c2fa29400a7 ("parisc: Convert to generic 
clockevents").  Some other comments from the discussion:

17:20:37 <awilfox> would be curious if keeping that patch + CONFIG_SMP=n 
fixes it
17:20:44 <awilfox> this doesn't look necessarily correct on MP machines
17:23:56 <awilfox> time_keeper_id is now unused; the old code specifically 
marked the clocksource as unstable on MP machines despite having per_cpu 
before
17:24:11 <awilfox> and now it seems to imply CLOCK_EVT_FEAT_PERCPU is enough 
to work around it
17:24:13 <awilfox> maybe it isn't

Thanks!