lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180319193221.GP6955@suse.cz>
Date:   Mon, 19 Mar 2018 20:32:21 +0100
From:   David Sterba <dsterba@...e.cz>
To:     Christoph Biedl <linux-kernel.bfrz@...chmal.in-ulm.de>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Anand Jain <anand.jain@...cle.com>,
        Liu Bo <bo.li.liu@...cle.com>, linux-btrfs@...r.kernel.org,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH 4.14 024/110] btrfs: use proper endianness accessors for
 super_copy

On Sat, Mar 17, 2018 at 06:27:15PM +0100, Christoph Biedl wrote:
> Greg Kroah-Hartman wrote...
> 
> > On Thu, Mar 15, 2018 at 07:55:42PM +0100, Christoph Biedl wrote:
> 
> > > > commit 3c181c12c431fe33b669410d663beb9cceefcd1b upstream.
> 
> > > On big-endian systems, this change intruduces severe corruption,
> > > resulting in complete loss of the data on the used block device.
> 
> > That sucks.  Can you test Linus's tree to verify the problem is there?
> > I'll gladly revert this if Linus's tree also gets the revert, I don't
> > want you to hit this when you upgrade to a newer kernel.
> 
> Confirmed: The problem is, err ... was in Linus' tree as well. The
> rather recent commit 8f5fd927c3a7 reverted the change, after that
> everything is as expected again.

Thanks for checking.

> Looking at the original commit, I don't have a clue why things go wrong
> so horribly

It's a half endianness conversion. The plain in-memory structures are in
LE and has to be accessed via the helpers, super block copy and the
root item.  The commit adds only one half of the conversion, that
naturally does not exhibit on LE, because the macros are no-op.

Originally, the items were stored from the on-disk type to on-disk type,
regardless of the CPU:

  super->chunk_root = root_item->bytenr;

The patch should have added conversion of both values, like

  btrfs_set_super_chunk_root(super, btrfs_root_bytenr(root_item));

and not

  btrfs_set_super_chunk_root(super, root_item->bytenr);

It's not very obvious from the context though, typically there's one
structure that needs the accessors and is set from a value that's in CPU
byteorder. I think that the exception to this pattern confused all
involved developers.

The root_item members are annotated as __le64 that should be caught by
sparse/smatch checker in the buggy case, but we don't run the checkers
every the time.

> - otherwise don't be afraid of my data. I took this as a
> chance to verify my data recovery procedure, with success.

Should that not be the case, the damaged items in superblock can be
byteswapped back. That's 6 x u64 at most, I have a tool for that now.

Thanks again for the report and sorry for the trouble.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ