Message-ID: <CAC8wJ3E+s7yiG+WUor6VPE1urUSoJ2fCqUZ09z2o+Xy047N3EA@mail.gmail.com>
Date:   Mon, 4 Jun 2018 16:20:45 +0200
From:   Maarten van Malland <maartenvanmalland@...il.com>
To:     "Theodore Y. Ts'o" <tytso@....edu>
Cc:     linux-ext4@...r.kernel.org
Subject: Re: Problem with external journal and LVM snapshots

Thanks a lot for your very elaborate and informative reply :). You
were spot-on with everything actually (including the initrd script
that I made) and the extra information helped me to understand what
really is going on here. I've tried your suggestion with the debugfs
command instead of using tune2fs and that worked beautifully. I hope
that this info will help others as well, as I couldn't find anything
on Google about this. Thanks again.
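
For anyone finding this thread through a search, here is a minimal sketch of
how the snapshot steps could look with the debugfs workaround (quoted below)
swapped in for the first tune2fs call. It is not the script from this thread.
The device names and lvcreate options are taken from the messages below; the
DRY_RUN switch and the function names are hypothetical, added only so the
commands can be reviewed before anything is run for real.

```shell
#!/bin/sh
# Sketch only: recreate the snapshot, detach it from the external journal by
# editing just the snapshot's superblock, then add an internal journal.
# ORIGIN/SNAP default to the names used in the thread; DRY_RUN is hypothetical.

ORIGIN=${ORIGIN:-/dev/bcache/root}
SNAP=${SNAP:-/dev/bcache/root-snap}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands, 0 = execute them

# The debugfs commands from the quoted workaround. They do not touch the
# journal device, so its list of file system UUIDs is left alone.
detach_journal_cmds() {
    cat <<'EOF'
features ^has_journal
set_super_value journal_uuid null
set_super_value journal_dev 0
quit
EOF
}

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

refresh_snapshot() {
    run lvremove "$SNAP"
    run lvcreate -c 512 -I 512 -n "${SNAP##*/}" -L 250G -s "$ORIGIN"
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ debugfs -w $SNAP <<EOF"
        detach_journal_cmds
        echo "EOF"
    else
        detach_journal_cmds | debugfs -w "$SNAP"
    fi
    run tune2fs -O has_journal "$SNAP"   # create the new internal journal
}

refresh_snapshot
```

With the default DRY_RUN=1 the script only prints what it would do. Running it
with DRY_RUN=0 (as root) executes the same sequence.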

On Mon, Jun 4, 2018 at 12:14 AM, Theodore Y. Ts'o <tytso@....edu> wrote:
> On Fri, Jun 01, 2018 at 12:47:05PM +0200, Maarten van Malland wrote:
>> I have a not so common setup that IMHO triggers a bug in the Ext4 journal code. I have the following setup:
>>
>> - A mdadm RAID10 device with Bcache backing and LVM on top. This should actually not matter at all, but perhaps still worth mentioning.
>> - The Ext4 volume resides on a LVM VG, with an external journal on a NVMe drive.
>> - I use LVM snapshotting for that volume
>>
>> Now, when I make the snapshot I do the following:
>>
>> lvremove /dev/bcache/root-snap
>> lvcreate -c 512 -I 512 -n root-snap -L 250G -s /dev/bcache/root
>> tune2fs -O ^has_journal /dev/bcache/root-snap   # to get rid of the external journal
>> tune2fs -O has_journal /dev/bcache/root-snap    # to create a new internal journal
>>
>> When finished, I can mount /dev/bcache/root-snap just fine, with the
>> internal journal working. However, when I reboot it's a different
>> issue. For whatever reason the kernel still sees both
>> /dev/bcache/root and /dev/bcache/root-snap with an external journal!
>
> I suspect that's not what is going on.  The problem is that external
> journals predate snapshot support, and external journals aren't very
> well supported in the first place, because so few people use them.
>
> The other thing to understand about external journals is that both the
> external journal and the file system each have a UUID.  The file
> system superblock, in addition to its own UUID, records the UUID of the
> external journal which it is using.  And the external journal, in
> addition to its own UUID, has a list of UUIDs for the file systems that
> are using it.  (There is partial support for allowing multiple file
> systems to use the same journal, which was never completed.)
>
> So when you created the snapshot:
>
>   lvremove /dev/bcache/root-snap
>   lvcreate -c 512 -I 512 -n root-snap -L 250G -s /dev/bcache/root
>
> This created a new block device which had the same file system UUID as
> the original file system.  When you then attempted to remove the
> external journal:
>
>   tune2fs -O ^has_journal /dev/bcache/root-snap
>
> ... this cleared the external journal's UUID from
> /dev/bcache/root-snap.  However, this *also* removed the UUID shared by
> /dev/bcache/root and /dev/bcache/root-snap from the external journal.
>
> This was fine while /dev/bcache/root remained mounted.  But then when
> you next tried to mount /dev/bcache/root again, the mount would have
> failed, because while /dev/bcache/root has a pointer (via a UUID) to
> the external journal, the external journal no longer has a
> back-pointer (via UUID) to /dev/bcache/root.
>
> You didn't say what the script in initrd was that fixed it, but I'm
> guessing it was something like:
>
>    tune2fs -O ^has_journal /dev/bcache/root
>
> Which would have resulted in the warning message:
>
> tune2fs 1.44.2 (14-May-2018)
> Filesystem's UUID not found on journal device.  <======
> Journal removed
>
> Followed by something like:
>
>    tune2fs -J device=/dev/bcache/journal /dev/bcache/root
>
>
> The fundamental problem is that there is a deep assumption that file
> system UUIDs are unique.  This is needed for mounting-by-uuid to
> work, for example.  Creating snapshots which aren't ephemeral breaks
> this assumption, so it's not just external journals which have this
> problem.  If you have "UUID=xxxx" in your /etc/fstab, it's going to
> cause confusion as well.
>
> So the quick workaround for your problem is to use this instead of
> "tune2fs -O ^has_journal /dev/bcache/root-snap":
>
> debugfs -w /dev/bcache/root-snap << EOF
> features ^has_journal
> set_super_value journal_uuid null
> set_super_value journal_dev 0
> quit
> EOF
>
> Regards,
>
>                                         - Ted
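
The two-way UUID linkage described in the quoted explanation can be checked
from saved `dumpe2fs -h` output. The sketch below assumes the labels
"Filesystem UUID:" and "Journal UUID:" as printed by recent e2fsprogs, and
treats any occurrence of the file system's UUID in the journal device's dump
as the back-pointer. The function names and file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: verify that a file system and its external journal point at
# each other, using text saved from "dumpe2fs -h". Label names are assumed.

# Print the value of the first "Label: value" line read from stdin.
sb_field() {
    sed -n "s/^$1:[[:space:]]*//p" | head -n 1
}

# usage: check_links FS_DUMP_FILE JOURNAL_DUMP_FILE
check_links() {
    fs_dump=$1
    journal_dump=$2
    fs_uuid=$(sb_field 'Filesystem UUID' < "$fs_dump")
    want=$(sb_field 'Journal UUID' < "$fs_dump")          # fs -> journal
    have=$(sb_field 'Filesystem UUID' < "$journal_dump")  # journal's own UUID

    if [ -z "$fs_uuid" ]; then
        echo "no file system UUID found"
        return 1
    fi
    if [ -z "$want" ]; then
        echo "no external journal recorded"
        return 0
    fi
    if [ "$want" != "$have" ]; then
        echo "forward pointer mismatch"
        return 1
    fi
    # journal -> fs: the file system's UUID must be listed on the journal.
    if ! grep -qF -- "$fs_uuid" "$journal_dump"; then
        echo "back-pointer missing"
        return 1
    fi
    echo "links consistent"
}
```

Possible usage, with the device names that appear in the thread: save
`dumpe2fs -h /dev/bcache/root` to fs.txt and `dumpe2fs -h /dev/bcache/journal`
to journal.txt, then run `check_links fs.txt journal.txt`. A "back-pointer
missing" result corresponds to the state that triggers the
"Filesystem's UUID not found on journal device" warning quoted above.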
