lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210219041244.GZ7193@magnolia>
Date:   Thu, 18 Feb 2021 20:12:44 -0800
From:   "Darrick J. Wong" <djwong@...nel.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     linux-fsdevel@...r.kernel.org, linux-xfs@...r.kernel.org,
        david@...morbit.com, linux-kernel@...r.kernel.org,
        sandeen@...deen.net, hch@....de
Subject: [GIT PULL] xfs: new code for 5.12

Hi Linus,

Please pull the following branch containing all the new xfs code for
5.12.  There's a lot going on this time, which seems about right for
this drama-filled year.

Community developers added some code to speed up freezing when read-only
workloads are still running, refactored the logging code, added checks
to prevent file extent counter overflow, reduced iolock cycling to speed
up fsync and gc scans, and started the slow march towards supporting
filesystem shrinking.

There's a huge refactoring of the internal speculative preallocation
garbage collection code which fixes a bunch of bugs, makes the gc
scheduling per-AG and hence multithreaded, and standardizes the retry
logic when we try to reserve space or quota, can't, and want to trigger
a gc scan.  We also enable multithreaded quotacheck to reduce mount
times further.  This is also preparation for background file gc, which
may or may not land for 5.13.

We also fixed some deadlocks in the rename code, fixed a quota
accounting leak when FSSETXATTR fails, restored the behavior that write
faults to an mmap'd region actually cause a SIGBUS, fixed a bug where
sgid directory inheritance wasn't quite working properly, and fixed a
bug where symlinks weren't working properly in ecryptfs.  We also now
advertise the inode btree counters feature that was introduced two
cycles ago.

This branch merges cleanly with 5.11, but there were a few merge
conflicts with the pidfd tree that Stephen Rothwell noticed in for-next.
Christian Brauner is trying to create per-mount id mappings, which
apparently requires passing the per-mount user namespace deep into the
filesystems, either directly or through struct files.

The first conflict arises from Christoph's fix for gid inheritance; I
think it can be resolved as follows:

diff --cc fs/xfs/xfs_inode.c
index 636ac13b1df2,95b7f2ba4e06..000000000000
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@@ -809,13 -810,13 +810,13 @@@ xfs_init_new_inode
  	inode->i_rdev = rdev;
  	ip->i_d.di_projid = prid;
  
 -	if (pip && XFS_INHERIT_GID(pip)) {
 -		inode->i_gid = VFS_I(pip)->i_gid;
 -		if ((VFS_I(pip)->i_mode & S_ISGID) && S_ISDIR(mode))
 -			inode->i_mode |= S_ISGID;
 +	if (dir && !(dir->i_mode & S_ISGID) &&
 +	    (mp->m_flags & XFS_MOUNT_GRPID)) {
- 		inode->i_uid = current_fsuid();
++		inode->i_uid = fsuid_into_mnt(mnt_userns);
 +		inode->i_gid = dir->i_gid;
 +		inode->i_mode = mode;
  	} else {
- 		inode_init_owner(inode, dir, mode);
 -		inode->i_gid = fsgid_into_mnt(mnt_userns);
++		inode_init_owner(mnt_userns, inode, dir, mode);
  	}
  
  	/*

I think the important bits here are making sure the previous
current_fs[ug]id() calls get turned into fs[ug]id_into_mnt() calls, and
making sure the mnt_userns pointer gets passed to inode_init_owner().

The second conflict involves the quota reservation rework patchset, and
I think it can be resolved as follows:

diff --cc fs/xfs/xfs_ioctl.c
index 248083ea0276,3d4c7ca080fb..000000000000
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@@ -1275,9 -1280,9 +1280,10 @@@ xfs_ioctl_setattr_prepare_dax
   */
  static struct xfs_trans *
  xfs_ioctl_setattr_get_trans(
- 	struct xfs_inode	*ip,
 -	struct file		*file)
++	struct file		*file,
 +	struct xfs_dquot	*pdqp)
  {
+ 	struct xfs_inode	*ip = XFS_I(file_inode(file));
  	struct xfs_mount	*mp = ip->i_mount;
  	struct xfs_trans	*tp;
  	int			error = -EROFS;
@@@ -1461,9 -1470,9 +1469,9 @@@ xfs_ioctl_setattr
  
  	xfs_ioctl_setattr_prepare_dax(ip, fa);
  
- 	tp = xfs_ioctl_setattr_get_trans(ip, pdqp);
 -	tp = xfs_ioctl_setattr_get_trans(file);
++	tp = xfs_ioctl_setattr_get_trans(file, pdqp);
  	if (IS_ERR(tp)) {
 -		code = PTR_ERR(tp);
 +		error = PTR_ERR(tp);
  		goto error_free_dquots;
  	}
  
@@@ -1599,7 -1615,7 +1606,7 @@@ xfs_ioc_setxflags
  
  	xfs_ioctl_setattr_prepare_dax(ip, &fa);
  
- 	tp = xfs_ioctl_setattr_get_trans(ip, NULL);
 -	tp = xfs_ioctl_setattr_get_trans(filp);
++	tp = xfs_ioctl_setattr_get_trans(filp, NULL);
  	if (IS_ERR(tp)) {
  		error = PTR_ERR(tp);
  		goto out_drop_write;

Mr. Brauner swapped the xfs_inode pointer in the first argument of
xfs_ioctl_setattr_get_trans for a struct file, and I added a second
argument to pass a xfs_dquot that we're making reservations against into
the get_trans function.  The rest of the diff updates the callsite
parameters.

After the merge, the function signature should be:

static struct xfs_trans *
xfs_ioctl_setattr_get_trans(
	struct file		*file,
	struct xfs_dquot	*pdqp) {...}

The third conflict is also from the quota rework patchset, and (AFAICT)
auto-resolved like this:

diff --cc fs/xfs/xfs_inode.c
index 636ac13b1df2,95b7f2ba4e06..000000000000
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@@ -1159,12 -1167,16 +1166,12 @@@ xfs_create_tmpfile
  	resblks = XFS_IALLOC_SPACE_RES(mp);
  	tres = &M_RES(mp)->tr_create_tmpfile;
  
 -	error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
 +	error = xfs_trans_alloc_icreate(mp, tres, udqp, gdqp, pdqp, resblks,
 +			&tp);
  	if (error)
 -		goto out_release_inode;
 -
 -	error = xfs_trans_reserve_quota(tp, mp, udqp, gdqp,
 -						pdqp, resblks, 1, 0);
 -	if (error)
 -		goto out_trans_cancel;
 +		goto out_release_dquots;
  
- 	error = xfs_dir_ialloc(&tp, dp, mode, 0, 0, prid, &ip);
+ 	error = xfs_dir_ialloc(mnt_userns, &tp, dp, mode, 0, 0, prid, &ip);
  	if (error)
  		goto out_trans_cancel;
  
All that is going on here is adding the mnt_userns parameter as the
first argument to xfs_dir_ialloc; I think the only reason my test merge
noticed it is because it's adjacent to a different change that I made.

With those pieces fixed up, the tree builds and seems to pass the simple
fstest run that I did.  Please let me know if anything else strange
happens during the merge process, particularly since there usually
aren't merge conflicts. :)

I will probably follow this up in a day or two with a couple more fixes
that have trickled in, but I am still catching up after the same ice
storm that knocked you in the dark(?) last Sunday also knocked me
offline until yesterday afternoon.  It's a bit disconcerting that a
single evening's ice storm could cut power to 10% of the state's
population and overload the cell phone networks to the point of
unusability.

--D

The following changes since commit 19c329f6808995b142b3966301f217c831e7cf31:

  Linux 5.11-rc4 (2021-01-17 16:37:05 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/xfs-5.12-merge-5

for you to fetch changes up to 1cd738b13ae9b29e03d6149f0246c61f76e81fcf:

  xfs: consider shutdown in bmapbt cursor delete assert (2021-02-11 08:46:38 -0800)

----------------------------------------------------------------
New code for 5.12:
- Fix an ABBA deadlock when renaming files on overlayfs.
- Make sure that we can't overflow the inode extent counters when adding
  to or removing extents from a file.
- Make directory sgid inheritance work the same way as all the other
  filesystems.
- Don't drain the buffer cache on freeze and ro remount, which should
  reduce the amount of time if read-only workloads are continuing
  during the freeze.
- Fix a bug where symlink size isn't reported to the vfs in ecryptfs.
- Disentangle log cleaning from log covering.  This refactoring sets us
  up for future changes to the log, though for now it simply means that
  we can use covering for freezes, and cleaning becomes something we
  only do at unmount.
- Speed up file fsyncs by reducing iolock cycling.
- Fix delalloc blocks leaking when changing the project id fails because
  of input validation errors in FSSETXATTR.
- Fix oversized quota reservation when converting unwritten extents
  during a DAX write.
- Create a transaction allocation helper function to standardize the
  idiom of allocating a transaction, reserving blocks, locking inodes,
  and reserving quota.  Replace all the open-coded logic for file
  creation, file ownership changes, and file modifications to use them.
- Actually shut down the fs if the incore quota reservations get
  corrupted.
- Fix background block garbage collection scans to not block and to
  actually clean out CoW staging extents properly.
- Run block gc scans when we run low on project quota.
- Use the standardized transaction allocation helpers to make it so that
  ENOSPC and EDQUOT errors during reservation will back out, invoke the
  block gc scanner, and try again.  This is preparation for introducing
  background inode garbage collection in the next cycle.
- Combine speculative post-EOF block garbage collection with speculative
  copy on write block garbage collection.
- Enable multithreaded quotacheck.
- Allow sysadmins to tweak the CPU affinities and maximum concurrency
  levels of quotacheck and background blockgc worker pools.
- Expose the inode btree counter feature in the fs geometry ioctl.
- Cleanups of the growfs code in preparation for starting work on
  filesystem shrinking.
- Fix all the bloody gcc warnings that the maintainer knows about. :P
- Fix a RST syntax error.
- Don't trigger bmbt corruption assertions after the fs shuts down.
- Restore behavior of forcing SIGBUS on a shut down filesystem when
  someone triggers a mmap write fault (or really, any buffered write).

----------------------------------------------------------------
Brian Foster (14):
      xfs: rename xfs_wait_buftarg() to xfs_buftarg_drain()
      xfs: don't drain buffer lru on freeze and read-only remount
      xfs: sync lazy sb accounting on quiesce of read-only mounts
      xfs: lift writable fs check up into log worker task
      xfs: separate log cleaning from log quiesce
      xfs: cover the log during log quiesce
      xfs: don't reset log idle state on covering checkpoints
      xfs: fold sbcount quiesce logging into log covering
      xfs: remove duplicate wq cancel and log force from attr quiesce
      xfs: remove xfs_quiesce_attr()
      xfs: cover the log on freeze instead of cleaning it
      xfs: fix unused log variable in xfs_log_cover()
      xfs: restore shutdown check in mapped write fault path
      xfs: consider shutdown in bmapbt cursor delete assert

Chandan Babu R (17):
      xfs: Add helper for checking per-inode extent count overflow
      xfs: Check for extent overflow when trivally adding a new extent
      xfs: Check for extent overflow when punching a hole
      xfs: Check for extent overflow when adding dir entries
      xfs: Check for extent overflow when removing dir entries
      xfs: Check for extent overflow when renaming dir entries
      xfs: Check for extent overflow when adding/removing xattrs
      xfs: Check for extent overflow when writing to unwritten extent
      xfs: Check for extent overflow when moving extent from cow to data fork
      xfs: Check for extent overflow when remapping an extent
      xfs: Check for extent overflow when swapping extents
      xfs: Introduce error injection to reduce maximum inode fork extent count
      xfs: Remove duplicate assert statement in xfs_bmap_btalloc()
      xfs: Compute bmap extent alignments in a separate function
      xfs: Process allocated extent in a separate function
      xfs: Introduce error injection to allocate only minlen size extents for files
      xfs: Fix 'set but not used' warning in xfs_bmap_compute_alignments()

Christoph Hellwig (3):
      xfs: fix up non-directory creation in SGID directories
      xfs: refactor xfs_file_fsync
      xfs: reduce ilock acquisitions in xfs_file_fsync

Darrick J. Wong (44):
      xfs: fix an ABBA deadlock in xfs_rename
      xfs: fix chown leaking delalloc quota blocks when fssetxattr fails
      xfs: reduce quota reservation when doing a dax unwritten extent conversion
      xfs: clean up quota reservation callsites
      xfs: create convenience wrappers for incore quota block reservations
      xfs: remove xfs_trans_unreserve_quota_nblks completely
      xfs: clean up icreate quota reservation calls
      xfs: fix up build warnings when quotas are disabled
      xfs: reserve data and rt quota at the same time
      xfs: refactor common transaction/inode/quota allocation idiom
      xfs: allow reservation of rtblocks with xfs_trans_alloc_inode
      xfs: refactor reflink functions to use xfs_trans_alloc_inode
      xfs: refactor inode creation transaction/inode/quota allocation idiom
      xfs: refactor inode ownership change transaction/inode/quota allocation idiom
      xfs: remove xfs_qm_vop_chown_reserve
      xfs: rename code to error in xfs_ioctl_setattr
      xfs: shut down the filesystem if we screw up quota reservation
      xfs: trigger all block gc scans when low on quota space
      xfs: don't stall cowblocks scan if we can't take locks
      xfs: xfs_inode_free_quota_blocks should scan project quota
      xfs: move and rename xfs_inode_free_quota_blocks to avoid conflicts
      xfs: pass flags and return gc errors from xfs_blockgc_free_quota
      xfs: try worst case space reservation upfront in xfs_reflink_remap_extent
      xfs: flush eof/cowblocks if we can't reserve quota for file blocks
      xfs: flush eof/cowblocks if we can't reserve quota for inode creation
      xfs: flush eof/cowblocks if we can't reserve quota for chown
      xfs: add a tracepoint for blockgc scans
      xfs: refactor xfs_icache_free_{eof,cow}blocks call sites
      xfs: flush speculative space allocations when we run out of space
      xfs: increase the default parallelism levels of pwork clients
      xfs: set WQ_SYSFS on all workqueues in debug mode
      xfs: relocate the eofb/cowb workqueue functions
      xfs: hide xfs_icache_free_eofblocks
      xfs: hide xfs_icache_free_cowblocks
      xfs: remove trivial eof/cowblocks functions
      xfs: consolidate incore inode radix tree posteof/cowblocks tags
      xfs: consolidate the eofblocks and cowblocks workers
      xfs: only walk the incore inode tree once per blockgc scan
      xfs: rename block gc start and stop functions
      xfs: parallelize block preallocation garbage collection
      xfs: expose the blockgc workqueue knobs publicly
      xfs: don't bounce the iolock between free_{eof,cow}blocks
      xfs: fix incorrect root dquot corruption error when switching group/project quota types
      xfs: fix rst syntax error in admin guide

Eric Biggers (1):
      xfs: remove a stale comment from xfs_file_aio_write_checks()

Gao Xiang (2):
      xfs: rename `new' to `delta' in xfs_growfs_data_private()
      xfs: get rid of xfs_growfs_{data,log}_t

Jeffrey Mitchell (1):
      xfs: set inode size after creating symlink

Yumei Huang (1):
      xfs: Fix assert failure in xfs_setattr_size()

Zorro Lang (1):
      libxfs: expose inobtcount in xfs geometry

kernel test robot (1):
      xfs: fix boolreturn.cocci warnings

 Documentation/admin-guide/xfs.rst |  42 ++++
 fs/xfs/libxfs/xfs_alloc.c         |  50 +++++
 fs/xfs/libxfs/xfs_alloc.h         |   3 +
 fs/xfs/libxfs/xfs_attr.c          |  22 +-
 fs/xfs/libxfs/xfs_bmap.c          | 315 ++++++++++++++++++---------
 fs/xfs/libxfs/xfs_btree.c         |  33 ++-
 fs/xfs/libxfs/xfs_dir2.h          |   2 -
 fs/xfs/libxfs/xfs_dir2_sf.c       |   2 +-
 fs/xfs/libxfs/xfs_errortag.h      |   6 +-
 fs/xfs/libxfs/xfs_fs.h            |   1 +
 fs/xfs/libxfs/xfs_inode_fork.c    |  27 +++
 fs/xfs/libxfs/xfs_inode_fork.h    |  63 ++++++
 fs/xfs/libxfs/xfs_sb.c            |   2 +
 fs/xfs/scrub/common.c             |   4 +-
 fs/xfs/xfs_bmap_item.c            |  10 +
 fs/xfs/xfs_bmap_util.c            |  81 ++++---
 fs/xfs/xfs_buf.c                  |  30 ++-
 fs/xfs/xfs_buf.h                  |  11 +-
 fs/xfs/xfs_dquot.c                |  47 +++-
 fs/xfs/xfs_error.c                |   6 +
 fs/xfs/xfs_file.c                 | 114 +++++-----
 fs/xfs/xfs_fsops.c                |  32 +--
 fs/xfs/xfs_fsops.h                |   4 +-
 fs/xfs/xfs_globals.c              |   7 +-
 fs/xfs/xfs_icache.c               | 438 ++++++++++++++++++++------------------
 fs/xfs/xfs_icache.h               |  24 +--
 fs/xfs/xfs_inode.c                | 134 ++++++++----
 fs/xfs/xfs_ioctl.c                |  75 +++----
 fs/xfs/xfs_iomap.c                |  53 ++---
 fs/xfs/xfs_iops.c                 |  28 +--
 fs/xfs/xfs_iwalk.c                |   5 +-
 fs/xfs/xfs_linux.h                |   3 +-
 fs/xfs/xfs_log.c                  | 142 +++++++++---
 fs/xfs/xfs_log.h                  |   4 +-
 fs/xfs/xfs_mount.c                |  43 +---
 fs/xfs/xfs_mount.h                |  10 +-
 fs/xfs/xfs_mru_cache.c            |   2 +-
 fs/xfs/xfs_pwork.c                |  25 +--
 fs/xfs/xfs_pwork.h                |   4 +-
 fs/xfs/xfs_qm.c                   | 116 ++--------
 fs/xfs/xfs_quota.h                |  49 +++--
 fs/xfs/xfs_reflink.c              | 109 ++++++----
 fs/xfs/xfs_rtalloc.c              |   5 +
 fs/xfs/xfs_super.c                |  82 +++----
 fs/xfs/xfs_super.h                |   6 +
 fs/xfs/xfs_symlink.c              |  15 +-
 fs/xfs/xfs_sysctl.c               |  15 +-
 fs/xfs/xfs_sysctl.h               |   3 +-
 fs/xfs/xfs_trace.c                |   1 +
 fs/xfs/xfs_trace.h                |  50 ++++-
 fs/xfs/xfs_trans.c                | 195 +++++++++++++++++
 fs/xfs/xfs_trans.h                |  13 ++
 fs/xfs/xfs_trans_dquot.c          |  73 +++++--
 53 files changed, 1654 insertions(+), 982 deletions(-)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ