Message-ID: <20201206173746.GN13361@riva.ucam.org>
Date: Sun, 6 Dec 2020 17:37:46 +0000
From: Colin Watson <cjwatson@...ntu.com>
To: "Theodore Y. Ts'o" <tytso@....edu>
Cc: Paul Menzel <pmenzel@...gen.mpg.de>,
Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org, Dimitri John Ledkov <xnox@...ntu.com>
Subject: Re: ext4: Funny characters appended to file names
On Sun, Dec 06, 2020 at 10:15:27AM -0500, Theodore Y. Ts'o wrote:
> On Sun, Dec 06, 2020 at 02:44:16PM +0000, Colin Watson wrote:
> > Now that I look at it more closely, some of the changes to
> > clean_grub_dir_real look suspicious:
> >
> > + char *srcf = grub_util_path_concat (2, di, de->d_name);
> > +
> > + if (mode == CREATE_BACKUP)
> > +   {
> > +     char *dstf = grub_util_path_concat_ext (2, di, de->d_name, "-");
> > +     if (grub_util_rename (srcf, dstf) < 0)
> > +       grub_util_error (_("cannot backup `%s': %s"), srcf,
> > +                        grub_util_fd_strerror ());
> > +     free (dstf);
> > +   }
>
> ... however, if I'm understanding the code correctly, this is the
> codepath used to create the backup file (e.g., the previous version of
> boot.img). So shouldn't there be a "boot.img" file in
> /boot/grub/i386-pc which would be the newly installed version of that
> file, and so the system would actually be booting correctly?
Not quite. What's described here as the "backup/restore" thing is used as
follows:
  * rename old modules aside as a backup
  * do the rest of the installation (writing to the MBR or similar, as
    well as copying in new modules)
  * if installation succeeds, remove the backup files
  * if installation fails, then:
    * remove the newly-created modules
    * move the backup files back into place
But if the restored file names are computed wrongly, then this leaves
the system in a bad state as Paul described.
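As a rough sketch of that sequence for a single file (this is not GRUB's
actual code: the names backup_name, create_backup, clean_backup and
restore_backup are made up for illustration, and plain rename()/remove()
stand in for the grub_util_* wrappers), the key point is that the backup
and restore steps must derive the "-" name in exactly the same way:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated "<path>-" backup name.
   Every step below goes through this one function, so the backup and
   restore names cannot drift apart.  */
char *
backup_name (const char *path)
{
  size_t len;
  char *out;

  assert (path != NULL);
  len = strlen (path);
  out = malloc (len + 2);	/* room for '-' and the terminating NUL */
  if (!out)
    return NULL;
  memcpy (out, path, len);
  out[len] = '-';
  out[len + 1] = '\0';
  return out;
}

/* Step 1: rename the old file aside.  Returns 0 on success, -1 on error.  */
int
create_backup (const char *path)
{
  char *bak = backup_name (path);
  int ret;

  if (!bak)
    return -1;
  ret = rename (path, bak);
  free (bak);
  return ret;
}

/* Installation succeeded: remove the backup file.  */
int
clean_backup (const char *path)
{
  char *bak = backup_name (path);
  int ret;

  if (!bak)
    return -1;
  ret = remove (bak);
  free (bak);
  return ret;
}

/* Installation failed: remove the newly-created file, then move the
   backup back into place.  */
int
restore_backup (const char *path)
{
  char *bak = backup_name (path);
  int ret;

  if (!bak)
    return -1;
  remove (path);		/* the new file may not exist; ignore errors */
  ret = rename (bak, path);
  free (bak);
  return ret;
}
```

A caller would run create_backup on each module before installing, and
then either clean_backup or restore_backup on each one depending on
whether the installation succeeded.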
I don't know why Dimitri chose to explicitly remove the new files first
rather than just renaming over the top and then removing any leftovers
at the end; that seems unnecessarily risky. Though this is code that's
apparently supposed to work on Windows as well, and the MoveFile
function that's used to implement grub_util_rename there requires the
destination file not to exist (sigh), so maybe it had something to do
with that.
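For illustration only, "renaming over the top" portably might look
something like the following. This is an assumption on my part rather
than what grub_util_rename does: rename_replacing is a made-up name, and
the Windows branch uses MoveFileExA with MOVEFILE_REPLACE_EXISTING, which
unlike plain MoveFile is allowed to overwrite an existing destination
file.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical wrapper: rename SRC to DST, replacing DST if it already
   exists.  Returns 0 on success, -1 on failure.  */
int
rename_replacing (const char *src, const char *dst)
{
  assert (src != NULL && dst != NULL);
#ifdef _WIN32
  /* Plain MoveFile fails if DST exists; MoveFileEx with this flag
     replaces it instead.  */
  return MoveFileExA (src, dst, MOVEFILE_REPLACE_EXISTING) ? 0 : -1;
#else
  /* POSIX rename() replaces an existing destination file.  */
  return rename (src, dst);
#endif
}
```

With a wrapper like this, the restore step could rename each backup
straight over the new file and only remove leftovers afterwards.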
> Essentially, there are three possibilities:
>
> 1) A hardware corruption which corrupted the directory.
>
> 2) A kernel bug which corrupted the directory.
>
> 3) The file system isn't actually corrupted, but the filename with the
> random garbage in the filename was created because a userspace
> application so requested it.
>
> The fact that all of the filenames have a similar pattern of
> corruption to them would tend to rule out #1. And the fact that
> e2fsck didn't notice any other corruptions would tend to argue against
> #1 and #2. So #3 does seem to be the most likely.
Yep.
--
Colin Watson (he/him) [cjwatson@...ntu.com]