Message-ID: <ce9055d7-7301-0abe-3609-3a4e2e7b1e5e@gmail.com>
Date: Sat, 14 Dec 2024 22:58:24 +0300
From: Nikolai Zhubr <zhubr.2@...il.com>
To: Theodore Ts'o <tytso@....edu>
Cc: linux-ext4@...r.kernel.org, stable@...r.kernel.org, linux-kernel@...r.kernel.org, jack@...e.cz
Subject: Re: ext4 damage suspected in between 5.15.167 - 5.15.170

Hi Ted,

On 12/13/24 19:12, Theodore Ts'o wrote:
> stable@...nel.org" to the commit description. However, they are not
> obligated to do that, so there is an auxillary system which uses AI to
> intuit which patches might be a bug fix. There is also automated
> systems that try to automatically figure out which patches might be

Oh, so meanwhile it got even worse than I used to imagine :-)
Thanks for pointing out.

> Note that some hardware errors can be caused by one-off errors, such
> as cosmic rays causing a bit-flip in memory DIMM. If that happens,
> RAID won't save you, since the error was introduced before an updated

Certainly cosmic rays is a possibility, but based on previous episodes
I'd still rather bet on a more usual "subtle interaction" problem,
either exact same or some similar to [1].
I even tried to run an existing test for this particular case as
described in [2] but it is not too user-friendly and somehow exits
abnormally without actually doing any interesting work. I'll get back
to it later when I have some time.

[1] https://lore.kernel.org/stable/20231205122122.dfhhoaswsfscuhc3@quack3/
[2] https://lwn.net/Articles/954364/

> The location of block allocation bitmaps never gets changed, so this
> sort of thing only happens due to hardware-induced corruption.

Well, unless e.g. some modified sectors start being flushed to random
wrong offsets, like in [1] above, or something similar.

> Looking at the dumpe2fs output, it looks like it was created
> relatively recently (July 2024) but it doesn't have the metadata
> checksum feature enabled, which has been enabled for quite a long

Yes. That was intentional - for better compatibility with even more
ancient stuff. Maybe time has come to reconsider the approach though.

> You got lucky because it block allocation bitmap location was
> corrupted to an obviously invalid value. But if it had been a

Absolutely. I was really amazed when I realized that :-)
It saved me days or even weeks of unnecessary verification work.

> Otherwise, I strongly encourage you to learn, and to take
> responsibility for the health of your own system. And ideally, you
> can also use that knowledge to help other users out, which is the only
> way the free-as-in-beer ecosystem can flurish; by having everybody

True. Generally I try to follow that, as much as appears possible.
It is sad a direct communication end-user-to-developer for solving
issues is becoming increasingly problematic here.

Anyway, thank you for friendly speech, useful hints and good references!

Regards,

Nick

> helping each other. Who knows, maybe you could even get a job doing
> it for a living. :-) :-) :-)
>
> Cheers,
>