Message-Id: <cover.1607955523.git.dsterba@suse.com>
Date:   Mon, 14 Dec 2020 16:15:53 +0100
From:   David Sterba <dsterba@...e.com>
To:     torvalds@...ux-foundation.org
Cc:     David Sterba <dsterba@...e.com>, linux-btrfs@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: [GIT PULL] Btrfs updates for 5.11

Hi,

we have a mix of all kinds of changes, feature updates, core stuff,
performance improvements and lots of cleanups and preparatory changes.

There are no merge conflicts against the current master branch. In past
weeks some conflicts emerged in linux-next, but IIRC they were trivial.
Please pull, thanks.

User visible:

- export filesystem generation in sysfs

- new features for mount option 'rescue'
  - what's currently supported is exported in sysfs
  - ignorebadroots/ibadroots - continue even if some essential tree
    roots are not usable (extent, uuid, data reloc, device, csum, free
    space)
  - ignoredatacsums/idatacsums - skip checksum verification on data
  - all - now enables ignorebadroots + ignoredatacsums + nologreplay

- export read mirror policy settings to sysfs, new policies will be
  added in the future

- remove inode number cache feature (mount -o inode_cache), obsoleted
  in 5.9
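
To make the rescue= semantics listed above concrete, here is a small
user-space sketch in Python. It is a model for illustration, not the kernel
parser. The option names, the short aliases and the meaning of 'all' come
from the list above; the function name expand_rescue, the table layout and
the use of ':' to separate several values are assumptions of this sketch.

```python
# Illustrative model only: the real parsing is done by the kernel at mount time.
# Short aliases named in the changelog.
ALIASES = {
    "ibadroots": "ignorebadroots",
    "idatacsums": "ignoredatacsums",
}

# 'all' expands to the three modes named in the changelog.
EXPANSIONS = {
    "all": {"ignorebadroots", "ignoredatacsums", "nologreplay"},
}

KNOWN = {"usebackuproot", "nologreplay", "ignorebadroots", "ignoredatacsums"}


def expand_rescue(value, sep=":"):
    """Return the set of effective rescue modes for a rescue= value.

    The ':' separator is an assumption of this sketch.
    """
    modes = set()
    for token in value.split(sep):
        token = token.strip()
        if not token:
            continue
        token = ALIASES.get(token, token)
        if token in EXPANSIONS:
            modes |= EXPANSIONS[token]
        elif token in KNOWN:
            modes.add(token)
        else:
            raise ValueError(f"unrecognized rescue option: {token!r}")
    return modes


print(sorted(expand_rescue("all")))
print(sorted(expand_rescue("ibadroots:idatacsums")))
```

Keeping aliases and expansions in separate tables means a new mode only
needs one new entry, which roughly mirrors how 'all' is described above as a
plain union of existing modes.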

User visible fixes:

- async discard scheduling fixes under high load

- update inode byte counter atomically so stat() does not report wrong
  value in some cases

- free space tree fixes
  - correctly report status of v2 after remount
  - clear v1 cache inodes when v2 is newly enabled after remount

Core:

- switch own tree lock implementation to standard rw semaphore
  - one-level lock nesting is not required anymore, the last use of this
    was in free space that's now loaded asynchronously
  - own implementation of adaptive spinning before taking mutex is no
    longer needed, rwsem already provides it
  - performance seems to be better in general, much better (by tens of
    percent) for some workloads
  - lockdep does not complain

- finish direct IO conversion to iomap infrastructure, remove temporary
  workaround for DSYNC after iomap API updates

- preparatory work to support data and metadata blocks smaller than page
  - generalize code that assumes sectorsize == PAGE_SIZE, lots of
    refactoring
  - planned mainly for 64K pages (e.g. arm64, ppc64)
  - scrub read-only support

- preparatory work for zoned allocation mode (SMR/ZBC/ZNS friendly)
  - disable incompatible features
  - round-robin superblock write

- free space cache (v1) is loaded asynchronously, remove tree path
  recursion

- slightly improved time tracking for transaction kthread wake ups
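
The subpage preparation in the list above is about dropping the assumption
that sectorsize == PAGE_SIZE. The following arithmetic sketch in Python (a
model, not kernel code) shows what changes when one 64K page holds several
sectors, and why a precalculated sectorsize_bits value lets a division be
replaced by a shift. The 4K sector size and all helper names are assumptions
chosen for illustration.

```python
def sectorsize_bits(sectorsize):
    """Return log2 of a power-of-two sector size."""
    if sectorsize <= 0 or sectorsize & (sectorsize - 1):
        raise ValueError("sectorsize must be a power of two")
    return sectorsize.bit_length() - 1


def sectors_per_page(page_size, sectorsize):
    """Number of sectors that fit in one page, computed with a shift."""
    return page_size >> sectorsize_bits(sectorsize)


def sector_index_in_page(file_offset, page_size, sectorsize):
    """Index of the sector, within its page, that contains file_offset."""
    offset_in_page = file_offset & (page_size - 1)
    return offset_in_page >> sectorsize_bits(sectorsize)


PAGE_64K = 64 * 1024
SECTOR_4K = 4 * 1024  # assumed sector size for this example

print(sectors_per_page(PAGE_64K, SECTOR_4K))   # 16 sectors share one page
print(sectors_per_page(SECTOR_4K, SECTOR_4K))  # 1: the sectorsize == PAGE_SIZE case
print(sector_index_in_page(70000, PAGE_64K, SECTOR_4K))
```

With one sector per page, per-page state is enough to describe a sector.
With 16 sectors per page, code has to track which sector inside the page it
is working on, which is the kind of generalization the refactoring targets.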

Performance improvements (note that the numbers depend on load type or
other features and weren't run on the same machine):

- skip unnecessary work:
  - do not start readahead for csum tree when scrubbing non-data block
    groups
  - do not start and wait for delalloc on snapshot roots on transaction
    commit
  - fix race when defragmenting leads to unnecessary IO

- dbench speedups (+throughput%/-max latency%)
  - skip unnecessary searches for xattrs when logging an inode
    (+10.8/-8.2)
  - stop incrementing log batch when joining log transaction (1-2)
  - unlock path before checking if extent is shared during nocow
    writeback (+5.0/-20.5), on fio load +9.7% throughput/-9.8% runtime
  - several tree log improvements, e.g. removing unnecessary operations,
    fixing races that lead to additional work (+12.7/-8.2)

- tree-checker error branches annotated with unlikely() (+3% throughput)

Other:

- cleanups

- lockdep fixes

- more btrfs_inode conversions

- error variable cleanups

----------------------------------------------------------------
The following changes since commit 0477e92881850d44910a7e94fc2c46f96faa131f:

  Linux 5.10-rc7 (2020-12-06 14:25:12 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.11-tag

for you to fetch changes up to b42fe98c92698d2a10094997e5f4d2dd968fd44f:

  btrfs: scrub: allow scrub to work with subpage sectorsize (2020-12-09 19:16:11 +0100)

----------------------------------------------------------------
Anand Jain (7):
      btrfs: sysfs: export filesystem generation
      btrfs: add helper for string match ignoring leading/trailing whitespace
      btrfs: create read policy framework
      btrfs: sysfs: add per-fs attribute for read policy
      btrfs: drop unused argument step from btrfs_free_extra_devids
      btrfs: drop never met disk total bytes check in verify_one_dev_extent
      btrfs: remove unused argument seed from btrfs_find_device

Boris Burkov (12):
      btrfs: lift read-write mount setup from mount and remount
      btrfs: start orphan cleanup on ro->rw remount
      btrfs: only mark bg->needs_free_space if free space tree is on
      btrfs: create free space tree on ro->rw remount
      btrfs: clear oneshot options on mount and remount
      btrfs: clear free space tree on ro->rw remount
      btrfs: keep sb cache_generation consistent with space_cache
      btrfs: use superblock state to print space_cache mount option
      btrfs: warn when remount will not change the free space tree
      btrfs: remove free space items when disabling space cache v1
      btrfs: skip space_cache v1 setup when not using it
      btrfs: fix lockdep warning when creating free space tree

David Sterba (22):
      btrfs: use the right number of levels for lockdep keysets
      btrfs: generate lockdep keyset names at compile time
      btrfs: send: use helpers to access root_item::ctransid
      btrfs: check-integrity: use proper helper to access btrfs_header
      btrfs: use root_item helpers for limit and flags in btrfs_create_tree
      btrfs: add set/get accessors for root_item::drop_level
      btrfs: remove unnecessary casts in printk
      btrfs: use precalculated sectorsize_bits from fs_info
      btrfs: replace div_u64 by shift in free_space_bitmap_size
      btrfs: replace s_blocksize_bits with fs_info::sectorsize_bits
      btrfs: store precalculated csum_size in fs_info
      btrfs: precalculate checksums per leaf once
      btrfs: use cached value of fs_info::csum_size everywhere
      btrfs: switch cached fs_info::csum_size from u16 to u32
      btrfs: remove unnecessary local variables for checksum size
      btrfs: check integrity: remove local copy of csum_size
      btrfs: scrub: remove local copy of csum_size from context
      btrfs: reorder extent buffer members for better packing
      btrfs: remove stub device info from messages when we have no fs_info
      btrfs: tree-checker: annotate all error branches as unlikely
      btrfs: drop casts of bio bi_sector
      btrfs: remove recalc_thresholds from free space ops

Filipe Manana (16):
      btrfs: assert we are holding the reada_lock when releasing a readahead zone
      btrfs: do not start readahead for csum tree when scrubbing non-data block groups
      btrfs: do not start and wait for delalloc on snapshot roots on transaction commit
      btrfs: refactor btrfs_drop_extents() to make it easier to extend
      btrfs: fix race when defragmenting leads to unnecessary IO
      btrfs: update the number of bytes used by an inode atomically
      btrfs: skip unnecessary searches for xattrs when logging an inode
      btrfs: stop incrementing log batch when joining log transaction
      btrfs: remove unnecessary attempt to drop extent maps after adding inline extent
      btrfs: unlock path before checking if extent is shared during nocow writeback
      btrfs: fix race causing unnecessary inode logging during link and rename
      btrfs: fix race that results in logging old extents during a fast fsync
      btrfs: fix race that causes unnecessary logging of ancestor inodes
      btrfs: fix race that makes inode logging fallback to transaction commit
      btrfs: fix race leading to unnecessary transaction commit when logging inode
      btrfs: do not block inode logging for so long during transaction commit

Goldwyn Rodrigues (14):
      btrfs: calculate num_pages, reserve_bytes once in btrfs_buffered_write
      btrfs: use iosize while reading compressed pages
      btrfs: use round_down while calculating start position in btrfs_dirty_pages()
      btrfs: set EXTENT_NORESERVE bits side btrfs_dirty_pages()
      btrfs: split btrfs_direct_IO to read and write
      btrfs: move pos increment and pagecache extension to btrfs_buffered_write
      btrfs: check FS error state bit early during write
      btrfs: introduce btrfs_write_check()
      btrfs: introduce btrfs_inode_lock()/unlock()
      btrfs: push inode locking and unlocking into buffered/direct write
      btrfs: use shared lock for direct writes within EOF
      btrfs: remove btrfs_inode::dio_sem
      btrfs: call iomap_dio_complete() without inode_lock
      btrfs: remove dio iomap DSYNC workaround

Josef Bacik (41):
      btrfs: unify the ro checking for mount options
      btrfs: push the NODATASUM check into btrfs_lookup_bio_sums
      btrfs: sysfs: export supported rescue= mount options
      btrfs: add a helper to print out rescue= options
      btrfs: show rescue=usebackuproot in /proc/mounts
      btrfs: introduce mount option rescue=ignorebadroots
      btrfs: introduce mount option rescue=ignoredatacsums
      btrfs: introduce mount option rescue=all
      btrfs: switch extent buffer tree lock to rw_semaphore
      btrfs: locking: remove all the blocking helpers
      btrfs: locking: rip out path->leave_spinning
      btrfs: do not shorten unpin len for caching block groups
      btrfs: update last_byte_to_unpin in switch_commit_roots
      btrfs: explicitly protect ->last_byte_to_unpin in unpin_extent_range
      btrfs: cleanup btrfs_discard_update_discardable usage
      btrfs: load free space cache into a temporary ctl
      btrfs: load the free space cache inode extents from commit root
      btrfs: load free space cache asynchronously
      btrfs: protect fs_info->caching_block_groups by block_group_cache_lock
      btrfs: remove lockdep classes for the fs tree
      btrfs: cleanup extent buffer readahead
      btrfs: use btrfs_read_node_slot in btrfs_realloc_node
      btrfs: use btrfs_read_node_slot in walk_down_reloc_tree
      btrfs: use btrfs_read_node_slot in do_relocation
      btrfs: use btrfs_read_node_slot in replace_path
      btrfs: use btrfs_read_node_slot in walk_down_tree
      btrfs: use btrfs_read_node_slot in qgroup_trace_extent_swap
      btrfs: use btrfs_read_node_slot in qgroup_trace_new_subtree_blocks
      btrfs: use btrfs_read_node_slot in btrfs_qgroup_trace_subtree
      btrfs: pass root owner to read_tree_block
      btrfs: pass the root owner and level around for readahead
      btrfs: pass the owner_root and level to alloc_extent_buffer
      btrfs: set the lockdep class for extent buffers on creation
      btrfs: cleanup the locking in btrfs_next_old_leaf
      btrfs: unlock to current level in btrfs_next_old_leaf
      btrfs: remove btrfs_path::recurse
      btrfs: locking: remove the recursion handling code
      btrfs: merge back btrfs_read_lock_root_node helpers
      btrfs: use btrfs_tree_read_lock in btrfs_search_slot
      btrfs: remove the recurse parameter from __btrfs_tree_read_lock
      btrfs: remove extent_buffer::recursed

Naohiro Aota (9):
      btrfs: introduce ZONED feature flag
      btrfs: get zone information of zoned block devices
      btrfs: check and enable ZONED mode
      btrfs: introduce max_zone_append_size
      btrfs: disallow space_cache in ZONED mode
      btrfs: disallow NODATACOW in ZONED mode
      btrfs: disable fallocate in ZONED mode
      btrfs: disallow mixed-bg in ZONED mode
      btrfs: implement log-structured superblock for ZONED mode

Nikolay Borisov (31):
      btrfs: use helpers to convert from seconds to jiffies in transaction_kthread
      btrfs: remove redundant time check in transaction kthread loop
      btrfs: record delta directly in transaction_kthread
      btrfs: calculate more accurate remaining time to sleep in transaction_kthread
      btrfs: open code insert_orphan_item
      btrfs: make btrfs_inode_safe_disk_i_size_write take btrfs_inode
      btrfs: make insert_prealloc_file_extent take btrfs_inode
      btrfs: make btrfs_truncate_inode_items take btrfs_inode
      btrfs: make btrfs_finish_ordered_io btrfs_inode-centric
      btrfs: make btrfs_delayed_update_inode take btrfs_inode
      btrfs: make btrfs_update_inode_item take btrfs_inode
      btrfs: make btrfs_update_inode take btrfs_inode
      btrfs: make maybe_insert_hole take btrfs_inode
      btrfs: make find_first_non_hole take btrfs_inode
      btrfs: make btrfs_insert_replace_extent take btrfs_inode
      btrfs: make btrfs_truncate_block take btrfs_inode
      btrfs: make btrfs_cont_expand take btrfs_inode
      btrfs: make btrfs_update_inode_fallback take btrfs_inode
      btrfs: merge __set_extent_bit and set_extent_bit
      btrfs: remove useless return value statement in split_node
      btrfs: simplify return values in setup_nodes_for_search
      btrfs: remove err variable from btrfs_delete_subvolume
      btrfs: eliminate err variable from merge_reloc_root
      btrfs: remove err variable from do_relocation
      btrfs: return bool from should_end_transaction
      btrfs: return bool from btrfs_should_end_transaction
      btrfs: move btrfs_find_highest_objectid/btrfs_find_free_objectid to disk-io.c
      btrfs: replace calls to btrfs_find_free_ino with btrfs_find_free_objectid
      btrfs: remove inode number cache feature
      btrfs: remove crc_check logic from free space
      btrfs: always set NODATASUM/NODATACOW in __create_free_space_inode

Pavel Begunkov (4):
      btrfs: discard: speed up async discard up to iops_limit
      btrfs: discard: store async discard delay as ns not as jiffies
      btrfs: don't miss async discards after scheduled work override
      btrfs: discard: reschedule work after sysfs param update

Qu Wenruo (41):
      btrfs: fix the comment on lock_extent_buffer_for_io
      btrfs: update the comment for find_first_extent_bit
      btrfs: sink the failed_start parameter to set_extent_bit
      btrfs: replace fs_info and private_data with inode in btrfs_wq_submit_bio
      btrfs: sink parameter start and len to check_data_csum
      btrfs: rename pages_locked in process_pages_contig()
      btrfs: only require sector size alignment for page read
      btrfs: rename page_size to io_size in submit_extent_page
      btrfs: assert page mapping lock in attach_extent_buffer_page
      btrfs: make buffer_radix take sector size units
      btrfs: grab fs_info from extent_buffer in btrfs_mark_buffer_dirty
      btrfs: make csum_tree_block() handle node smaller than page
      btrfs: extract extent buffer verification from btrfs_validate_metadata_buffer()
      btrfs: pass bvec to csum_dirty_buffer instead of page
      btrfs: scrub: distinguish scrub page from regular page
      btrfs: scrub: remove the force parameter from scrub_pages
      btrfs: scrub: refactor scrub_find_csum()
      btrfs: tests: remove invalid extent-io test
      btrfs: add structure to keep track of extent range in end_bio_extent_readpage
      btrfs: introduce helper to handle page status update in end_bio_extent_readpage()
      btrfs: use fixed width int type for extent_state::state
      btrfs: scrub: remove the anonymous structure from scrub_page
      btrfs: remove unused parameter phy_offset from btrfs_validate_metadata_buffer
      btrfs: only clear EXTENT_LOCK bit in extent_invalidatepage
      btrfs: use nodesize to determine if we need readahead in btrfs_lookup_bio_sums
      btrfs: use detach_page_private() in alloc_extent_buffer()
      btrfs: rename bio_offset of extent_submit_bio_start_t to dio_file_offset
      btrfs: pass bio_offset to check_data_csum() directly
      btrfs: make btrfs_verify_data_csum follow sector size
      btrfs: factor out btree page submission code to a helper
      btrfs: calculate inline extent buffer page size based on page size
      btrfs: don't allow tree block to cross page boundary for subpage support
      btrfs: update num_extent_pages to support subpage sized extent buffer
      btrfs: handle sectorsize < PAGE_SIZE case for extent buffer accessors
      btrfs: remove btrfs_find_ordered_sum call from btrfs_lookup_bio_sums
      btrfs: refactor btrfs_lookup_bio_sums to handle out-of-order bvecs
      btrfs: scrub: reduce width of extent_len/stripe_len from 64 to 32 bits
      btrfs: scrub: always allocate one full page for one sector for RAID56
      btrfs: scrub: support subpage tree block scrub
      btrfs: scrub: support subpage data scrub
      btrfs: scrub: allow scrub to work with subpage sectorsize

Tom Rix (1):
      btrfs: sysfs: remove unneeded semicolon

 fs/btrfs/Makefile                 |   3 +-
 fs/btrfs/backref.c                |  19 +-
 fs/btrfs/block-group.c            | 268 ++++++-------
 fs/btrfs/block-group.h            |   2 +
 fs/btrfs/block-rsv.c              |   8 +
 fs/btrfs/btrfs_inode.h            |  23 +-
 fs/btrfs/check-integrity.c        |  11 +-
 fs/btrfs/compression.c            |  28 +-
 fs/btrfs/ctree.c                  | 258 +++---------
 fs/btrfs/ctree.h                  | 213 +++++++---
 fs/btrfs/delayed-inode.c          |  23 +-
 fs/btrfs/delayed-inode.h          |   3 +-
 fs/btrfs/dev-replace.c            |  20 +-
 fs/btrfs/dir-item.c               |   1 -
 fs/btrfs/discard.c                |  46 ++-
 fs/btrfs/discard.h                |   3 +-
 fs/btrfs/disk-io.c                | 689 ++++++++++++++++++--------------
 fs/btrfs/disk-io.h                |  25 +-
 fs/btrfs/export.c                 |   1 -
 fs/btrfs/extent-io-tree.h         |  71 ++--
 fs/btrfs/extent-tree.c            | 111 ++----
 fs/btrfs/extent_io.c              | 656 ++++++++++++++++++------------
 fs/btrfs/extent_io.h              |  50 +--
 fs/btrfs/file-item.c              | 344 ++++++++++------
 fs/btrfs/file.c                   | 737 ++++++++++++++++++----------------
 fs/btrfs/free-space-cache.c       | 558 +++++++++++---------------
 fs/btrfs/free-space-cache.h       |  22 +-
 fs/btrfs/free-space-tree.c        |  26 +-
 fs/btrfs/inode-item.c             |   6 -
 fs/btrfs/inode-map.c              | 582 ---------------------------
 fs/btrfs/inode-map.h              |  16 -
 fs/btrfs/inode.c                  | 815 ++++++++++++++++++++------------------
 fs/btrfs/ioctl.c                  |  64 ++-
 fs/btrfs/locking.c                | 459 ++-------------------
 fs/btrfs/locking.h                |  24 +-
 fs/btrfs/ordered-data.c           |  45 ---
 fs/btrfs/ordered-data.h           |   5 +-
 fs/btrfs/print-tree.c             |  15 +-
 fs/btrfs/qgroup.c                 |  52 +--
 fs/btrfs/raid56.c                 |   8 +-
 fs/btrfs/reada.c                  |  34 +-
 fs/btrfs/ref-verify.c             |  27 +-
 fs/btrfs/reflink.c                |  18 +-
 fs/btrfs/relocation.c             | 116 ++----
 fs/btrfs/scrub.c                  | 340 +++++++++-------
 fs/btrfs/send.c                   |   6 +-
 fs/btrfs/struct-funcs.c           |  18 +-
 fs/btrfs/super.c                  | 179 ++++++---
 fs/btrfs/sysfs.c                  | 117 +++++-
 fs/btrfs/tests/btrfs-tests.c      |   3 +-
 fs/btrfs/tests/extent-io-tests.c  |  26 +-
 fs/btrfs/tests/free-space-tests.c |   1 -
 fs/btrfs/tests/qgroup-tests.c     |   4 -
 fs/btrfs/transaction.c            | 126 +++---
 fs/btrfs/transaction.h            |   3 +-
 fs/btrfs/tree-checker.c           | 337 ++++++++--------
 fs/btrfs/tree-defrag.c            |   1 -
 fs/btrfs/tree-log.c               | 183 +++++----
 fs/btrfs/uuid-tree.c              |   3 +-
 fs/btrfs/volumes.c                | 143 ++++---
 fs/btrfs/volumes.h                |  21 +-
 fs/btrfs/xattr.c                  |   8 +-
 fs/btrfs/zoned.c                  | 616 ++++++++++++++++++++++++++++
 fs/btrfs/zoned.h                  | 160 ++++++++
 include/uapi/linux/btrfs.h        |   1 +
 include/uapi/linux/btrfs_tree.h   |   3 +-
 66 files changed, 4491 insertions(+), 4313 deletions(-)
 delete mode 100644 fs/btrfs/inode-map.c
 delete mode 100644 fs/btrfs/inode-map.h
 create mode 100644 fs/btrfs/zoned.c
 create mode 100644 fs/btrfs/zoned.h
