Message-ID: <20230914182534.GD20408@twin.jikos.cz>
Date:   Thu, 14 Sep 2023 20:25:34 +0200
From:   David Sterba <dsterba@...e.cz>
To:     Johannes Thumshirn <johannes.thumshirn@....com>
Cc:     Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
        David Sterba <dsterba@...e.com>,
        Christoph Hellwig <hch@....de>,
        Naohiro Aota <naohiro.aota@....com>, Qu Wenruo <wqu@...e.com>,
        Damien Le Moal <dlemoal@...nel.org>,
        linux-btrfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v9 00/11] btrfs: introduce RAID stripe tree

On Thu, Sep 14, 2023 at 09:06:55AM -0700, Johannes Thumshirn wrote:
> Updates of the raid-stripe-tree are done at ordered extent write time to save
> on bandwidth while for reading we do the stripe-tree lookup on bio mapping
> time, i.e. when the logical to physical translation happens for regular btrfs
> RAID as well.
> 
> The stripe tree is keyed by an extent's disk_bytenr and disk_num_bytes and
> its contents are the respective physical device id and position.
> 
> For an example 1M write (split into 126K segments due to zone-append)
> rapido2:/home/johannes/src/fstests# xfs_io -fdc "pwrite -b 1M 0 1M" -c fsync /mnt/test/test
> wrote 1048576/1048576 bytes at offset 0
> 1 MiB, 1 ops; 0.0065 sec (151.538 MiB/sec and 151.5381 ops/sec)
> 
> The tree will look as follows (both 128k buffered writes to a ZNS drive):
> 
> RAID0 case:
> bash-5.2# btrfs inspect-internal dump-tree -t raid_stripe /dev/nvme0n1
> btrfs-progs v6.3
> raid stripe tree key (RAID_STRIPE_TREE ROOT_ITEM 0) 
> leaf 805535744 items 1 free space 16218 generation 8 owner RAID_STRIPE_TREE
> leaf 805535744 flags 0x1(WRITTEN) backref revision 1
> checksum stored 2d2d2262
> checksum calced 2d2d2262
> fs uuid ab05cfc6-9859-404e-970d-3999b1cb5438
> chunk uuid c9470ba2-49ac-4d46-8856-438a18e6bd23
>         item 0 key (1073741824 RAID_STRIPE_KEY 131072) itemoff 16243 itemsize 56
>                         encoding: RAID0
>                         stripe 0 devid 1 offset 805306368 length 131072
>                         stripe 1 devid 2 offset 536870912 length 131072
> total bytes 42949672960
> bytes used 294912
> uuid ab05cfc6-9859-404e-970d-3999b1cb5438
> 
> RAID1 case:
> bash-5.2# btrfs inspect-internal dump-tree -t raid_stripe /dev/nvme0n1
> btrfs-progs v6.3
> raid stripe tree key (RAID_STRIPE_TREE ROOT_ITEM 0) 
> leaf 805535744 items 1 free space 16218 generation 8 owner RAID_STRIPE_TREE
> leaf 805535744 flags 0x1(WRITTEN) backref revision 1
> checksum stored 56199539
> checksum calced 56199539
> fs uuid 9e693a37-fbd1-4891-aed2-e7fe64605045
> chunk uuid 691874fc-1b9c-469b-bd7f-05e0e6ba88c4
>         item 0 key (939524096 RAID_STRIPE_KEY 131072) itemoff 16243 itemsize 56
>                         encoding: RAID1
>                         stripe 0 devid 1 offset 939524096 length 65536
>                         stripe 1 devid 2 offset 536870912 length 65536
> total bytes 42949672960
> bytes used 294912
> uuid 9e693a37-fbd1-4891-aed2-e7fe64605045
> 
> A design document can be found here:
> https://docs.google.com/document/d/1Iui_jMidCd4MVBNSSLXRfO7p5KmvnoQL/edit?usp=sharing&ouid=103609947580185458266&rtpof=true&sd=true

Please also turn it into a developer documentation file (in
btrfs-progs/Documentation/dev); it can follow the same structure.

> 
> The user-space part of this series can be found here:
> https://lore.kernel.org/linux-btrfs/20230215143109.2721722-1-johannes.thumshirn@wdc.com
> 
> Changes to v8:
> - Changed tracepoints according to David's comments
> - Mark on-disk structures as packed
> - Got rid of __DECLARE_FLEX_ARRAY
> - Rebase onto misc-next
> - Split out helpers for new btrfs_load_block_group_zone_info RAID cases
> - Constify declarations where possible
> - Initialise variables before use
> - Lower scope of variables
> - Remove btrfs_stripe_root() helper
> - Pick different BTRFS_RAID_STRIPE_KEY constant
> - Reorder on-disk encoding types to match the raid_index
> - And possibly more, please git range-diff the versions
> - Link to v8: https://lore.kernel.org/r/20230911-raid-stripe-tree-v8-0-647676fa852c@wdc.com

v9 will be added as a topic branch to for-next. I made several style
changes, so please send any updates as incremental patches if needed.
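
For readers who want to inspect the quoted dump-tree output programmatically,
here is a minimal sketch in Python. It only parses the text format shown in
the RAID0 example above into a mapping keyed by (disk_bytenr, disk_num_bytes),
mirroring how the cover letter describes the tree's keys and contents. The
function name parse_stripe_items and the dictionary layout are hypothetical
choices for illustration; this is not part of btrfs-progs or the kernel, and
the output format of other btrfs-progs versions may differ.

```python
import re

# Sample lines copied from the RAID0 dump quoted in the cover letter.
DUMP = """\
        item 0 key (1073741824 RAID_STRIPE_KEY 131072) itemoff 16243 itemsize 56
                        encoding: RAID0
                        stripe 0 devid 1 offset 805306368 length 131072
                        stripe 1 devid 2 offset 536870912 length 131072
"""

# Patterns assume the exact text layout shown in the quoted dump.
ITEM_RE = re.compile(r"item \d+ key \((\d+) RAID_STRIPE_KEY (\d+)\)")
ENCODING_RE = re.compile(r"encoding: (\w+)")
STRIPE_RE = re.compile(r"stripe (\d+) devid (\d+) offset (\d+) length (\d+)")


def parse_stripe_items(text):
    """Map (disk_bytenr, disk_num_bytes) to the item's encoding and stripes."""
    items = {}
    current = None
    for line in text.splitlines():
        match = ITEM_RE.search(line)
        if match:
            # A new item starts; its key is the extent's bytenr and length.
            key = (int(match.group(1)), int(match.group(2)))
            current = {"encoding": None, "stripes": []}
            items[key] = current
            continue
        if current is None:
            # Skip header lines that appear before the first item.
            continue
        match = ENCODING_RE.search(line)
        if match:
            current["encoding"] = match.group(1)
            continue
        match = STRIPE_RE.search(line)
        if match:
            index, devid, offset, length = map(int, match.groups())
            current["stripes"].append(
                {"index": index, "devid": devid,
                 "physical": offset, "length": length}
            )
    return items


if __name__ == "__main__":
    for key, item in parse_stripe_items(DUMP).items():
        print(key, item["encoding"], item["stripes"])
```

Feeding the full output of "btrfs inspect-internal dump-tree -t raid_stripe"
to the same function should work as long as the item lines keep this layout,
since lines that match none of the patterns are ignored.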
