Date:   Thu, 27 May 2021 12:35:17 +0200
From:   Jan Kara <jack@...e.cz>
To:     Roman Gushchin <guro@...com>
Cc:     Jan Kara <jack@...e.cz>, Tejun Heo <tj@...nel.org>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, Alexander Viro <viro@...iv.linux.org.uk>,
        Dennis Zhou <dennis@...nel.org>,
        Dave Chinner <dchinner@...hat.com>, cgroups@...r.kernel.org
Subject: Re: [PATCH v5 1/2] writeback, cgroup: keep list of inodes attached
 to bdi_writeback

On Wed 26-05-21 15:25:56, Roman Gushchin wrote:
> Currently there is no way to iterate over inodes attached to a
> specific cgwb structure. It limits the ability to efficiently
> reclaim the writeback structure itself and associated memory and
> block cgroup structures without scanning all inodes belonging to a sb,
> which can be prohibitively expensive.
> 
> While dirty/in-active-writeback an inode belongs to one of the
> bdi_writeback's io lists: b_dirty, b_io, b_more_io and b_dirty_time.
> Once cleaned up, it's removed from all io lists. So the
> inode->i_io_list can be reused to maintain the list of inodes,
> attached to a bdi_writeback structure.
> 
> This patch introduces a new wb->b_attached list, which contains all
> inodes which were dirty at least once and are attached to the given
> cgwb. Inodes attached to the root bdi_writeback structures are never
> placed on such list. The following patch will use this list to try to
> release cgwbs structures more efficiently.
> 
> Suggested-by: Jan Kara <jack@...e.cz>
> Signed-off-by: Roman Gushchin <guro@...com>

Looks good. Just some minor nits below:

> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index e91980f49388..631ef6366293 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -135,18 +135,23 @@ static bool inode_io_list_move_locked(struct inode *inode,
>   * inode_io_list_del_locked - remove an inode from its bdi_writeback IO list
>   * @inode: inode to be removed
>   * @wb: bdi_writeback @inode is being removed from
> + * @final: inode is going to be freed and can't reappear on any IO list
>   *
>   * Remove @inode which may be on one of @wb->b_{dirty|io|more_io} lists and
>   * clear %WB_has_dirty_io if all are empty afterwards.
>   */
>  static void inode_io_list_del_locked(struct inode *inode,
> -				     struct bdi_writeback *wb)
> +				     struct bdi_writeback *wb,
> +				     bool final)
>  {
>  	assert_spin_locked(&wb->list_lock);
>  	assert_spin_locked(&inode->i_lock);
>  
>  	inode->i_state &= ~I_SYNC_QUEUED;
> -	list_del_init(&inode->i_io_list);
> +	if (final)
> +		list_del_init(&inode->i_io_list);
> +	else
> +		inode_cgwb_move_to_attached(inode, wb);
>  	wb_io_lists_depopulated(wb);
>  }

With these changes the naming is actually somewhat confusing, and the bool
argument makes it even worse. Looking into the code, I'd just fold
inode_io_list_del_locked() into inode_io_list_del() and make it really
delete the inode from all IO lists. There are currently three other
inode_io_list_del_locked() users:

requeue_inode(), writeback_single_inode() - these should just call
inode_cgwb_move_to_attached() unconditionally
(inode_cgwb_move_to_attached() just needs to clear I_SYNC_QUEUED and call
wb_io_lists_depopulated() in addition to what it currently does).

inode_switch_wbs_work_fn() - I don't think it needs
inode_io_list_del_locked() at all. See below...
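
To make the suggestion for the first two callers concrete, here is a rough
userspace model of what the helpers could look like after the fold. It is
only a sketch, not kernel code: locking is left out, the list helpers are
reimplemented minimally, has_dirty_io stands in for the WB_has_dirty_io
state bit, the I_SYNC_QUEUED value is a placeholder, b_dirty_time is
omitted, and model_wb_init() is a hypothetical helper that exists only for
this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add(struct list_head *e, struct list_head *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	INIT_LIST_HEAD(e);
}

static void list_move(struct list_head *e, struct list_head *head)
{
	list_del_init(e);	/* unlink from wherever it is now */
	list_add(e, head);	/* then link at the front of the target */
}

#define I_SYNC_QUEUED (1u << 0)	/* placeholder value for the model */

struct backing_dev_info;

struct bdi_writeback {
	struct backing_dev_info *bdi;
	struct list_head b_dirty, b_io, b_more_io, b_attached;
	bool has_dirty_io;	/* models the WB_has_dirty_io state bit */
};

struct backing_dev_info { struct bdi_writeback wb; };	/* root wb embedded */

struct inode { unsigned int i_state; struct list_head i_io_list; };

/* Hypothetical helper: set up an empty wb belonging to @bdi. */
static void model_wb_init(struct bdi_writeback *wb, struct backing_dev_info *bdi)
{
	wb->bdi = bdi;
	INIT_LIST_HEAD(&wb->b_dirty);
	INIT_LIST_HEAD(&wb->b_io);
	INIT_LIST_HEAD(&wb->b_more_io);
	INIT_LIST_HEAD(&wb->b_attached);
	wb->has_dirty_io = false;
}

/* Simplified: clear the dirty-IO flag once all three IO lists are empty. */
static void wb_io_lists_depopulated(struct bdi_writeback *wb)
{
	if (wb->has_dirty_io && list_empty(&wb->b_dirty) &&
	    list_empty(&wb->b_io) && list_empty(&wb->b_more_io))
		wb->has_dirty_io = false;
}

/* What requeue_inode() and writeback_single_inode() would call. */
static void inode_cgwb_move_to_attached(struct inode *inode,
					struct bdi_writeback *wb)
{
	assert(wb->bdi != NULL);

	inode->i_state &= ~I_SYNC_QUEUED;
	if (wb != &wb->bdi->wb)
		list_move(&inode->i_io_list, &wb->b_attached);
	else
		list_del_init(&inode->i_io_list);	/* root wb keeps no list */
	wb_io_lists_depopulated(wb);
}

/* Folded variant: really removes the inode from every list of @wb. */
static void inode_io_list_del(struct inode *inode, struct bdi_writeback *wb)
{
	inode->i_state &= ~I_SYNC_QUEUED;
	list_del_init(&inode->i_io_list);
	wb_io_lists_depopulated(wb);
}
```

With this split no caller has to pass a bool: the function name alone says
whether the inode stays attached to the wb or leaves it for good.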

> @@ -278,6 +283,25 @@ void __inode_attach_wb(struct inode *inode, struct page *page)
>  }
>  EXPORT_SYMBOL_GPL(__inode_attach_wb);
>  
> +/**
> + * inode_cgwb_move_to_attached - put the inode onto wb->b_attached list
> + * @inode: inode of interest with i_lock held
> + * @wb: target bdi_writeback
> + *
> + * Remove the inode from wb's io lists and if necessarily put onto b_attached
> + * list.  Only inodes attached to cgwb's are kept on this list.
> + */
> +void inode_cgwb_move_to_attached(struct inode *inode, struct bdi_writeback *wb)
> +{
> +	assert_spin_locked(&wb->list_lock);
> +	assert_spin_locked(&inode->i_lock);
> +
> +	if (wb != &wb->bdi->wb)
> +		list_move(&inode->i_io_list, &wb->b_attached);
> +	else
> +		list_del_init(&inode->i_io_list);
> +}

I think this can be static, and you can drop the declarations from the
header files below. At least I wasn't able to find where this would be used
outside of fs/fs-writeback.c.

>  /**
>   * locked_inode_to_wb_and_lock_list - determine a locked inode's wb and lock it
>   * @inode: inode of interest with i_lock held
> @@ -419,21 +443,29 @@ static void inode_switch_wbs_work_fn(struct work_struct *work)
>  	wb_get(new_wb);
>  
>  	/*
> -	 * Transfer to @new_wb's IO list if necessary.  The specific list
> -	 * @inode was on is ignored and the inode is put on ->b_dirty which
> -	 * is always correct including from ->b_dirty_time.  The transfer
> -	 * preserves @inode->dirtied_when ordering.
> +	 * Transfer to @new_wb's IO list if necessary.  If the @inode is dirty,
> +	 * the specific list @inode was on is ignored and the @inode is put on
> +	 * ->b_dirty which is always correct including from ->b_dirty_time.
> +	 * The transfer preserves @inode->dirtied_when ordering.  If the @inode
> +	 * was clean, it means it was on the b_attached list, so move it onto
> +	 * the b_attached list of @new_wb.
>  	 */
>  	if (!list_empty(&inode->i_io_list)) {
> -		struct inode *pos;
> -
> -		inode_io_list_del_locked(inode, old_wb);
> +		inode_io_list_del_locked(inode, old_wb, true);

Do we need inode_io_list_del_locked() here at all? Below we are careful
enough to always use list_move(), which does the deletion for us anyway.
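
As a small illustration of the list mechanics behind this question, the
userspace model below reimplements the relevant list helpers and shows that
list_move() unlinks the entry from its current list on its own. It models
list linkage only; clearing I_SYNC_QUEUED and the WB_has_dirty_io
accounting that inode_io_list_del_locked() also performs are outside the
sketch and would still need to be considered separately.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-ins for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add(struct list_head *e, struct list_head *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	INIT_LIST_HEAD(e);
}

/* Unlink @e from whatever list it is on, then add it after @head. */
static void list_move(struct list_head *e, struct list_head *head)
{
	assert(e != head);

	e->prev->next = e->next;	/* the deletion happens here ... */
	e->next->prev = e->prev;
	list_add(e, head);		/* ... so no prior delete is needed */
}
```

Calling list_del_init() first and then list_move() ends in the same state,
because unlinking an entry that already points at itself is a no-op.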

>  		inode->i_wb = new_wb;
> -		list_for_each_entry(pos, &new_wb->b_dirty, i_io_list)
> -			if (time_after_eq(inode->dirtied_when,
> -					  pos->dirtied_when))
> -				break;
> -		inode_io_list_move_locked(inode, new_wb, pos->i_io_list.prev);
> +
> +		if (inode->i_state & I_DIRTY_ALL) {
> +			struct inode *pos;
> +
> +			list_for_each_entry(pos, &new_wb->b_dirty, i_io_list)
> +				if (time_after_eq(inode->dirtied_when,
> +						  pos->dirtied_when))
> +					break;
> +			inode_io_list_move_locked(inode, new_wb,
> +						  pos->i_io_list.prev);
> +		} else {
> +			inode_cgwb_move_to_attached(inode, new_wb);
> +		}

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
