lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <cover.1659357652.git.dsterba@suse.com>
Date:   Mon, 1 Aug 2022 18:40:03 +0200
From:   David Sterba <dsterba@...e.com>
To:     torvalds@...ux-foundation.org
Cc:     linux-btrfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [GIT PULL] Btrfs updates for 5.20

Hi,

this update brings some long awaited changes, the send protocol bump,
otherwise lots of small improvements and fixes. The main core part is
reworking bio handling, cleaning up the submission and endio and
improving error handling.

There are some non-btrfs patches adding helpers or updating API,
listed at the end of the changelog.

Please pull, thanks.

Features:

- sysfs:
  - export chunk size, in debug mode add tunable for setting its size
  - show zoned among features (was only in debug mode)
  - show commit stats (number, last/max/total duration)

- send protocol updated to 2
  - new commands:
    - ability write larger data chunks than 64K
    - send raw compressed extents (uses the encoded data ioctls), ie. no
      decompression on send side, no compression needed on receive side
      if supported
    - send 'otime' (inode creation time) among other timestamps
    - send file attributes (a.k.a file flags and xflags)
  - this is first version bump, backward compatibility on send and
    receive side is provided
  - there are still some known and wanted commands that will be
    implemented in the near future, another version bump will be needed,
    however we want to minimize that to avoid causing usability issues

- print checksum type and implementation at mount time

- don't print some messages at mount (mentioned as people asked about
  it), we want to print messages namely for new features so let's make
  some space for that
  - big metadata - this has been supported for a long time and is not a
                   feature that's worth mentioning
  - skinny metadata - same reason, set by default by mkfs

Performance improvements:

- reduced amount of reserved metadata for delayed items
  - when inserted items can be batched into one leaf
  - when deleting batched directory index items
  - when deleting delayed items used for deletion
  - overall improved count of files/sec, decreased subvolume lock
    contention

- metadata item access bounds checker micro-optimized, with a few
  percent of improved runtime for metadata-heavy operations

- increase direct io limit for read to 256 sectors, improved throughput
  by 3x on sample workload

Notable fixes:

- raid56
  - reduce parity writes, skip sectors of stripe when there are no data
    updates
  - restore reading from stripe cache instead of triggering new read

- refuse to replay log with unknown incompat read-only feature bit set

- zoned
  - fix page locking when COW fails in the middle of allocation
  - improved tracking of active zones, ZNS drives may limit the number
    and there are ENOSPC errors due to that limit and not actual lack of
    space
  - adjust maximum extent size for zone append so it does not cause late
    ENOSPC due to underreservation

- mirror reading error messages show the mirror number

- don't fallback to buffered IO for NOWAIT direct IO writes, we don't
  have the NOWAIT semantics for buffered io yet

- send, fix sending link commands for existing file paths when there are
  deleted and created hardlinks for same files

- repair all mirrors for profiles with more than 1 copy (raid1c34)

- fix repair of compressed extents, unify where error detection and
  repair happen

Core changes:

- bio completion cleanups
  - don't double defer compression bios
  - simplify endio workqueues
  - add more data to btrfs_bio to avoid allocation for read requests
  - rework bio error handling so it's same what block layer does, the
    submission works and errors are consumed in endio
  - when asynchronous bio offload fails fall back to synchronous
    checksum calculation to avoid errors under writeback or memory
    pressure

- new trace points
  - raid56 events
  - ordered extent operations

- super block log_root_transid deprecated (never used)

- mixed_backref and big_metadata sysfs feature files removed, they've
  been default for sufficiently long time, there are no known users and
  mixed_backref could be confused with mixed_groups

Non-btrfs changes, API updates:

- minor highmem API update to cover const arguments

- switch all kmap/kmap_atomic to kmap_local

- remove redundant flush_dcache_page()

- address_space_operations::writepage callback removed

- add bdev_max_segments() helper

----------------------------------------------------------------
The following changes since commit e0dccc3b76fb35bb257b4118367a883073d7390e:

  Linux 5.19-rc8 (2022-07-24 13:26:27 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.20-tag

for you to fetch changes up to 0b078d9db8793b1bd911e97be854e3c964235c78:

  btrfs: don't call btrfs_page_set_checked in finish_compressed_bio_read (2022-07-25 19:56:16 +0200)

----------------------------------------------------------------
BingJing Chang (2):
      btrfs: send: introduce recorded_ref_alloc and recorded_ref_free
      btrfs: send: fix sending link commands for existing file paths

Christoph Hellwig (37):
      btrfs: factor out a helper to end a single sector buffer I/O
      btrfs: refactor end_bio_extent_readpage code flow
      btrfs: factor out a btrfs_csum_ptr helper
      btrfs: use btrfs_bio_for_each_sector in btrfs_check_read_dio_bio
      btrfs: move more work into btrfs_end_bioc
      btrfs: simplify code flow in btrfs_submit_dio_bio
      btrfs: split btrfs_submit_data_bio to read and write parts
      btrfs: defer I/O completion based on the btrfs_raid_bio
      btrfs: don't double-defer bio completions for compressed reads
      btrfs: don't use btrfs_bio_wq_end_io for compressed writes
      btrfs: centralize setting REQ_META
      btrfs: remove btrfs_end_io_wq
      btrfs: factor stripe submission logic out of btrfs_map_bio
      btrfs: do not allocate a btrfs_bio for low-level bios
      btrfs: don't use bio->bi_private to pass the inode to submit_one_bio
      btrfs: merge end_write_bio and flush_write_bio
      btrfs: pass the btrfs_bio_ctrl to submit_one_bio
      btrfs: stop looking at btrfs_bio->iter in index_one_bio
      btrfs: split discard handling out of btrfs_map_block
      btrfs: remove the finish_func argument to btrfs_mark_ordered_io_finished
      btrfs: increase direct io read size limit to 256 sectors
      btrfs: remove extent writepage address space operation
      btrfs: raid56: use fixed stripe length everywhere
      btrfs: do not return errors from btrfs_map_bio
      btrfs: do not return errors from raid56_parity_write
      btrfs: do not return errors from raid56_parity_recover
      btrfs: raid56: transfer the bio counter reference to the raid submission helpers
      btrfs: simplify sync/async submission in btrfs_submit_data_write_bio
      btrfs: handle allocation failure in btrfs_wq_submit_bio gracefully
      btrfs: do not return errors from btrfs_submit_dio_bio
      btrfs: merge btrfs_dev_stat_print_on_error with its only caller
      btrfs: repair all known bad mirrors
      btrfs: simplify the pending I/O counting in struct compressed_bio
      btrfs: pass a btrfs_bio to btrfs_repair_one_sector
      btrfs: remove the start argument to check_data_csum and export
      btrfs: fix repair of compressed extents
      btrfs: don't call btrfs_page_set_checked in finish_compressed_bio_read

David Sterba (30):
      btrfs: fix typos in comments
      btrfs: remove redundant calls to flush_dcache_page
      btrfs: remove redundant check in up check_setget_bounds
      btrfs: sysfs: advertise zoned support among features
      btrfs: open code rbtree search in split_state
      btrfs: open code rbtree search in insert_state
      btrfs: lift start and end parameters to callers of insert_state
      btrfs: pass bits by value not by pointer for extent_state helpers
      btrfs: add fast path for extent_state insertion
      btrfs: remove node and parent parameters from insert_state
      btrfs: open code inexact rbtree search in tree_search
      btrfs: make tree search for insert more generic and use it for tree_search
      btrfs: unify tree search helper returning prev and next nodes
      btrfs: call inode_to_path directly and drop indirection
      btrfs: simplify parameters of backref iterators
      btrfs: sink iterator parameter to btrfs_ioctl_logical_to_ino
      btrfs: remove unused typedefs get_extent_t and btrfs_work_func_t
      btrfs: send: drop __KERNEL__ ifdef from send.h
      btrfs: send: simplify includes
      btrfs: send: remove old TODO regarding ERESTARTSYS
      btrfs: send: use boolean types for current inode status
      btrfs: send: add OTIME as utimes attribute for proto 2+ by default
      btrfs: send: add new command FILEATTR for file attributes
      btrfs: print checksum type and implementation at mount time
      btrfs: use mask for all RAID1* profiles in btrfs_calc_avail_data_space
      btrfs: merge calculations for simple striped profiles in btrfs_rmap_block
      btrfs: clean up chained assignments
      btrfs: switch btrfs_block_rsv::full to bool
      btrfs: switch btrfs_block_rsv::failfast to bool
      btrfs: use enum for btrfs_block_rsv::type

Fabio M. De Francesco (7):
      btrfs: replace kmap() with kmap_local_page() in inode.c
      btrfs: replace kmap() with kmap_local_page() in lzo.c
      highmem: Make __kunmap_{local,atomic}() take const void pointer
      btrfs: zstd: replace kmap() with kmap_local_page()
      btrfs: zlib: replace kmap() with kmap_local_page() in zlib_compress_pages()
      btrfs: zlib: replace kmap() with kmap_local_page() in zlib_decompress_bio()
      btrfs: replace kmap_atomic() with kmap_local_page()

Fanjun Kong (1):
      btrfs: use PAGE_ALIGNED instead of IS_ALIGNED

Filipe Manana (18):
      btrfs: balance btree dirty pages and delayed items after a rename
      btrfs: free the path earlier when creating a new inode
      btrfs: balance btree dirty pages and delayed items after clone and dedupe
      btrfs: add assertions when deleting batches of delayed items
      btrfs: deal with deletion errors when deleting delayed items
      btrfs: refactor the delayed item deletion entry point
      btrfs: improve batch deletion of delayed dir index items
      btrfs: assert that delayed item is a dir index item when adding it
      btrfs: improve batch insertion of delayed dir index items
      btrfs: do not BUG_ON() on failure to reserve metadata for delayed item
      btrfs: set delayed item type when initializing it
      btrfs: reduce amount of reserved metadata for delayed item insertion
      btrfs: remove the inode cache check at btrfs_is_free_space_inode()
      btrfs: don't fallback to buffered IO for NOWAIT direct IO writes
      btrfs: set the objectid of the btree inode's location key
      btrfs: add optimized btrfs_ino() version for 64 bits systems
      btrfs: send: always use the rbtree based inode ref management infrastructure
      btrfs: join running log transaction when logging new name

Ioannis Angelakopoulos (2):
      btrfs: collect commit stats, count, duration
      btrfs: sysfs: export commit stats

Johannes Thumshirn (1):
      btrfs: add tracepoints for ordered extents

Josef Bacik (3):
      btrfs: do not batch insert non-consecutive dir indexes during log replay
      btrfs: tree-log: make the return value for log syncing consistent
      btrfs: reset block group chunk force if we have to wait

Naohiro Aota (17):
      btrfs: ensure pages are unlocked on cow_file_range() failure
      btrfs: extend btrfs_cleanup_ordered_extents for NULL locked_page
      btrfs: fix error handling of fallback uncompress write
      btrfs: replace unnecessary goto with direct return at cow_file_range()
      block: add bdev_max_segments() helper
      btrfs: zoned: revive max_zone_append_bytes
      btrfs: replace BTRFS_MAX_EXTENT_SIZE with fs_info->max_extent_size
      btrfs: convert count_max_extents() to use fs_info->max_extent_size
      btrfs: use fs_info->max_extent_size in get_extent_max_capacity()
      btrfs: let can_allocate_chunk return error
      btrfs: zoned: finish least available block group on data bg allocation
      btrfs: zoned: introduce space_info->active_total_bytes
      btrfs: zoned: disable metadata overcommit for zoned
      btrfs: zoned: activate metadata block group on flush_space
      btrfs: zoned: activate necessary block group
      btrfs: zoned: write out partially allocated region
      btrfs: zoned: wait until zone is finished when allocation didn't progress

Nikolay Borisov (9):
      btrfs: introduce btrfs_try_lock_balance
      btrfs: use btrfs_try_lock_balance in btrfs_ioctl_balance
      btrfs: batch up release of reserved metadata for delayed items used for deletion
      btrfs: properly flag filesystem with BTRFS_FEATURE_INCOMPAT_BIG_METADATA
      btrfs: don't print 'flagging with big metadata' anymore on mount
      btrfs: don't print 'has skinny extents' anymore on mount
      btrfs: sysfs: remove MIXED_BACKREF feature file
      btrfs: sysfs: remove BIG_METADATA feature files
      btrfs: simplify error handling in btrfs_lookup_dentry

Omar Sandoval (7):
      btrfs: send: remove unused send_ctx::{total,cmd}_send_size
      btrfs: send: explicitly number commands and attributes
      btrfs: send: add stream v2 definitions
      btrfs: send: write larger chunks when using stream v2
      btrfs: send: get send buffer pages for protocol v2
      btrfs: send: send compressed extents with encoded writes
      btrfs: send: enable support for stream v2 and compressed writes

Pankaj Raghav (1):
      btrfs: zoned: fix comment description for sb_write_pointer logic

Qu Wenruo (25):
      btrfs: quit early if the fs has no RAID56 support for raid56 related checks
      btrfs: introduce a data checksum checking helper
      btrfs: remove duplicated parameters from submit_data_read_repair()
      btrfs: add a helper to iterate through a btrfs_bio with sector sized chunks
      btrfs: use integrated bitmaps for btrfs_raid_bio::dbitmap and finish_pbitmap
      btrfs: use integrated bitmaps for scrub_parity::dbitmap and ebitmap
      btrfs: only write the sectors in the vertical stripe which has data stripes
      btrfs: update stripe_sectors::uptodate in steal_rbio
      btrfs: add trace event for submitted RAID56 bio
      btrfs: make btrfs_super_block::log_root_transid deprecated
      btrfs: reject log replay if there is unsupported RO compat flag
      btrfs: raid56: avoid double for loop inside finish_rmw()
      btrfs: raid56: avoid double for loop inside __raid56_parity_recover()
      btrfs: raid56: avoid double for loop inside alloc_rbio_essential_pages()
      btrfs: raid56: avoid double for loop inside raid56_rmw_stripe()
      btrfs: raid56: avoid double for loop inside raid56_parity_scrub_stripe()
      btrfs: remove parameter dev_extent_len from scrub_stripe()
      btrfs: use btrfs_chunk_max_errors() to replace tolerance calculation
      btrfs: use btrfs_raid_array to calculate number of parity stripes
      btrfs: use ncopies from btrfs_raid_array in btrfs_num_copies()
      btrfs: use named constant for reserved device space
      btrfs: warn about dev extents that are inside the reserved range
      btrfs: raid56: don't trust any cached sector in __raid56_parity_recover()
      btrfs: output mirror number for bad metadata
      btrfs: return proper mapped length for RAID56 profiles in __btrfs_map_block()

Stefan Roesch (3):
      btrfs: store chunk size in space-info struct
      btrfs: sysfs: export chunk size in space infos
      btrfs: sysfs: add force_chunk_alloc trigger to force allocation

 arch/parisc/include/asm/cacheflush.h |   6 +-
 arch/parisc/kernel/cache.c           |   2 +-
 fs/btrfs/async-thread.h              |   1 -
 fs/btrfs/backref.c                   |  88 ++--
 fs/btrfs/backref.h                   |   3 +-
 fs/btrfs/block-group.c               |  34 +-
 fs/btrfs/block-rsv.c                 |  21 +-
 fs/btrfs/block-rsv.h                 |  15 +-
 fs/btrfs/btrfs_inode.h               |  25 +-
 fs/btrfs/compression.c               | 359 ++++----------
 fs/btrfs/compression.h               |  18 +-
 fs/btrfs/ctree.h                     | 105 ++++-
 fs/btrfs/delalloc-space.c            |   6 +-
 fs/btrfs/delayed-inode.c             | 395 +++++++++++-----
 fs/btrfs/delayed-inode.h             |  11 +
 fs/btrfs/delayed-ref.c               |   4 +-
 fs/btrfs/dev-replace.c               |   3 +-
 fs/btrfs/disk-io.c                   | 268 ++++-------
 fs/btrfs/disk-io.h                   |  17 +-
 fs/btrfs/extent-tree.c               | 149 +++---
 fs/btrfs/extent_io.c                 | 873 ++++++++++++++++-------------------
 fs/btrfs/extent_io.h                 |  15 +-
 fs/btrfs/file.c                      |  29 +-
 fs/btrfs/free-space-cache.c          |   3 +-
 fs/btrfs/inode.c                     | 764 +++++++++++++++---------------
 fs/btrfs/ioctl.c                     | 150 +++---
 fs/btrfs/lzo.c                       |  28 +-
 fs/btrfs/ordered-data.c              |  40 +-
 fs/btrfs/ordered-data.h              |   5 +-
 fs/btrfs/raid56.c                    | 792 +++++++++++++++----------------
 fs/btrfs/raid56.h                    | 168 ++++++-
 fs/btrfs/reflink.c                   |  19 +-
 fs/btrfs/scrub.c                     |  71 ++-
 fs/btrfs/send.c                      | 781 +++++++++++++++++++++----------
 fs/btrfs/send.h                      | 169 ++++---
 fs/btrfs/space-info.c                | 110 ++++-
 fs/btrfs/space-info.h                |   8 +-
 fs/btrfs/struct-funcs.c              |  11 +-
 fs/btrfs/subpage.c                   |   4 +-
 fs/btrfs/super.c                     |  36 +-
 fs/btrfs/sysfs.c                     | 186 +++++++-
 fs/btrfs/tests/btrfs-tests.c         |   1 +
 fs/btrfs/tests/extent-buffer-tests.c |   3 +-
 fs/btrfs/transaction.c               |  26 +-
 fs/btrfs/tree-log.c                  |  29 +-
 fs/btrfs/tree-log.h                  |   3 +
 fs/btrfs/volumes.c                   | 362 +++++++--------
 fs/btrfs/volumes.h                   |  46 +-
 fs/btrfs/zlib.c                      |  42 +-
 fs/btrfs/zoned.c                     | 131 +++++-
 fs/btrfs/zoned.h                     |  18 +
 fs/btrfs/zstd.c                      |  33 +-
 include/linux/blkdev.h               |   5 +
 include/linux/highmem-internal.h     |  10 +-
 include/trace/events/btrfs.h         | 158 +++++++
 include/uapi/linux/btrfs.h           |  10 +-
 mm/highmem.c                         |   2 +-
 57 files changed, 3842 insertions(+), 2829 deletions(-)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ