lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 10 Dec 2009 16:02:04 +0100
From:	Jan Kara <jack@...e.cz>
To:	Dmitry Monakhov <dmonakhov@...nvz.org>
Cc:	linux-ext4@...r.kernel.org, aneesh.kumar@...ux.vnet.ibm.com,
	jack@...e.cz, cmm@...ibm.com
Subject: Re: [PATCH 1/2] quota: decouple fs reserved space from quota
 reservation. [V3]

  Hi,

On Thu 10-12-09 15:25:51, Dmitry Monakhov wrote:
> Currently inode_reservation is managed by fs itself and this
> reservation is transfered on dquot_transfer(). This means what
> inode_reservation must always be in sync with
> dquot->dq_dqb.dqb_rsvspace. Otherwise dquot_transfer() will result
> in incorrect quota(WARN_ON in dquot_claim_reserved_space() will be
> triggered)
> This is not easy because of complex locking order issues
> for example http://bugzilla.kernel.org/show_bug.cgi?id=14739
> 
> The patch introduce quota reservation field for each fs-inode
> (fs specific inode is used in order to prevent bloating generic
> vfs inode). This reservation is managed by quota code internally
> similar to i_blocks/i_bytes and may not be always in sync with
> internal fs reservation.
> 
> Also perform some code rearrangement:
> - Unify dquot_reserve_space() and dquot_reserve_space()
> - Unify dquot_release_reserved_space() and dquot_free_space()
> - Also this patch add missing warning update to release_rsv()
>   dquot_release_reserved_space() must call flush_warnings() as
>   dquot_free_space() does.
> 
> Changes from V1:
> - move qutoa_reservation field from vfs_inode to fs_inode
> - account reservation in bytes instead of blocks.
  Thanks for looking into this! I have some comments to the patch below
but generally I like it.

> Signed-off-by: Dmitry Monakhov <dmonakhov@...nvz.org>
> ---
>  fs/quota/dquot.c      |  219 ++++++++++++++++++++++++++++---------------------
>  include/linux/quota.h |    5 +-
>  2 files changed, 127 insertions(+), 97 deletions(-)
> 
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 39b49c4..66221f8 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -1388,6 +1388,71 @@ void vfs_dq_drop(struct inode *inode)
>  EXPORT_SYMBOL(vfs_dq_drop);
>  
>  /*
> + * inode_reserved_space is managed internally by quota, and protected by
> + * i_lock similar to i_blocks+i_bytes.
> + */
> +static qsize_t* inode_reserved_space(struct inode * inode)
                 ^^ it should be "qsize_t *"
> +{
> +	/* Filesystem must explicitly define it's own method in order to use
> +	 * quota reservation interface */
> +	BUG_ON(!inode->i_sb->dq_op->get_reserved_space);
> +	return inode->i_sb->dq_op->get_reserved_space(inode);
> +}
> +
> +static void inode_add_rsv_space(struct inode *inode, qsize_t number)
> +{
> +	spin_lock(&inode->i_lock);
> +	*inode_reserved_space(inode) += number;
> +	spin_unlock(&inode->i_lock);
> +}
> +
  Hmm, I think I have a better solution for this: Each inode will have a
pointer to "inode features" table. That table will contain offsets of
general structures (e.g. things needed for quotas) embedded inside
fs-specific inode. That way we can later move similar structures out of
general VFS inode to inodes of filesystems that need it. When a filesystem
creates an inode, it fills in a proper pointer to the table... That should
allow generic code access data in fs-specific part of inode structure,
we don't have to add functions returning pointers, etc.
  But for now I'd go with your solution since mine will require more
widespread changes to quota code and we need to fix that bug.

> +static void inode_claim_rsv_space(struct inode *inode, qsize_t number)
> +{
> +	spin_lock(&inode->i_lock);
> +	*inode_reserved_space(inode) -= number;
> +	inode->i_blocks += number >> 9;
> +	number &= 511;
> +	inode->i_bytes += number;
> +	if (inode->i_bytes >= 512) {
> +		inode->i_blocks++;
> +		inode->i_bytes -= 512;
> +	}
> +	spin_unlock(&inode->i_lock);
> +}
  Please add variant of inode_add_bytes() which does not acquire
inode->i_lock and use it here. I'd suggest __inode_add_bytes and rename
__inode_add_bytes below to something more appropriate like
inode_alloc_reserve_bytes or so.

...
> @@ -1426,64 +1506,32 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number,
>  		else
>  			dquot_incr_space(inode->i_dquot[cnt], number);
>  	}
> -	if (!reserve)
> -		inode_add_bytes(inode, number);
> -out_unlock:
> -	spin_unlock(&dq_data_lock);
> -	flush_warnings(inode->i_dquot, warntype);
> -	return ret;
> -}
> -
> -int dquot_alloc_space(struct inode *inode, qsize_t number, int warn)
> -{
> -	int cnt, ret = QUOTA_OK;
> -
> -	/*
> -	 * First test before acquiring mutex - solves deadlocks when we
> -	 * re-enter the quota code and are already holding the mutex
> -	 */
> -	if (IS_NOQUOTA(inode)) {
> -		inode_add_bytes(inode, number);
> -		goto out;
> -	}
> -
> -	down_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
> -	if (IS_NOQUOTA(inode)) {
> -		inode_add_bytes(inode, number);
> -		goto out_unlock;
> -	}
> -
> -	ret = __dquot_alloc_space(inode, number, warn, 0);
> -	if (ret == NO_QUOTA)
> -		goto out_unlock;
> +	__inode_add_bytes(inode, number, reserve);
>  
> +	if (reserve)
> +		goto out_data_unlock;
>  	/* Dirtify all the dquots - this can block when journalling */
>  	for (cnt = 0; cnt < MAXQUOTAS; cnt++)
>  		if (inode->i_dquot[cnt])
>  			mark_dquot_dirty(inode->i_dquot[cnt]);
> +out_data_unlock:
> +	spin_unlock(&dq_data_lock);
> +	flush_warnings(inode->i_dquot, warntype);
  You have to unlock dq_data_lock before mark_dquot_dirty as it can sleep.

>  out_unlock:
>  	up_read(&sb_dqopt(inode->i_sb)->dqptr_sem);
>  out:
>  	return ret;
>  }
...
> @@ -1854,6 +1882,7 @@ const struct dquot_operations dquot_operations = {
>  	.write_info	= dquot_commit_info,
>  	.alloc_dquot	= dquot_alloc,
>  	.destroy_dquot	= dquot_destroy,
> +
>  };
  I guess the above is a mistake...

> diff --git a/include/linux/quota.h b/include/linux/quota.h
> index 78c4889..40845f1 100644
> --- a/include/linux/quota.h
> +++ b/include/linux/quota.h
> @@ -313,8 +313,9 @@ struct dquot_operations {
>  	int (*claim_space) (struct inode *, qsize_t);
>  	/* release rsved quota for delayed alloc */
>  	void (*release_rsv) (struct inode *, qsize_t);
> -	/* get reserved quota for delayed alloc */
> -	qsize_t (*get_reserved_space) (struct inode *);
> +	/* get reserved quota for delayed alloc, value returned is managed by
> +	 * quota code only */
> +	qsize_t* (*get_reserved_space) (struct inode *);
  I think proper coding style is "qsize_t *(*get_reserved_space) (...)"

									Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ