[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20141212095230.GB4813@quack.suse.cz>
Date: Fri, 12 Dec 2014 10:52:30 +0100
From: Jan Kara <jack@...e.cz>
To: Dave Chinner <david@...morbit.com>
Cc: Jan Kara <jack@...e.cz>, linux-fsdevel@...r.kernel.org,
linux-ext4@...r.kernel.org, xfs@....sgi.com
Subject: Re: [PATCH 10/11] quota: Switch ->get_dqblk() and ->set_dqblk() to
use bytes as space units
On Wed 19-11-14 09:29:52, Dave Chinner wrote:
> On Tue, Nov 11, 2014 at 10:04:24PM +0100, Jan Kara wrote:
> > Currently ->get_dqblk() and ->set_dqblk() use struct fs_disk_quota which
> > tracks space limits and usage in 512-byte blocks. However VFS quotas
> > track usage in bytes (as some filesystems require that) and we need to
> > somehow pass this information. Upto now it wasn't a problem because we
> > didn't do any unit conversion (thus VFS quota routines happily stuck
> > number of bytes into d_bcount field of struct fd_disk_quota). Only if
> > you tried to use Q_XGETQUOTA or Q_XSETQLIM for VFS quotas (or Q_GETQUOTA
> > / Q_SETQUOTA for XFS quotas), you got bogus results but noone really
> > tried that. But when we want interfaces compatible we need to fix this.
> >
> > So we bite the bullet and define another quota structure used for
> > passing information from/to ->get_dqblk()/->set_dqblk. It's somewhat
> > sad we have to have more conversion routines in fs/quota/quota.c but
> > it seems cleaner than overloading e.g. units of d_bcount to bytes.
>
> I don't really like the idea of having to copy the dquot information
> an extra time. We now:
>
> - copy from internal dquot to the new qc_dqblk
> - copy from the new qc_dqblk to if_dqblk/xfs_dqblk
> - copy if_dqblk/xfs_dqblk to the user buffer.
>
> That's now three copies, and when we are having to deal with quota
> reports containing hundreds of thousands of dquots that's going to
> hrut performance.
>
> We could probably get away with just one copy by passing a
> filldir()-like context down into the filesystems to format their
> internal dquot information directly into the user buffer in the
> appropriate format. That way fs/quota/quota.c doesn't need
> conversion routines, filesystems can optimise the formating to
> minimise copying, and we can still provide generic routines for
> filesystems using the generic quota infrastructure....
I was thinking about how this would look like. I don't have a problem to
create a filldir() like callback that will be used for getting quota
structures. However I don't see how we could reasonably get away with just
one copy in general - that would mean that the interface functions in
fs/quota.c (e.g. quota_getquota()) would have to determine whether XFS of
VFS quota structures are used in the backing filesystem to provide proper
callback and that's IMO too ugly to live.
We could definitely reduce the number of copies to two by changing e.g.
copy_to_xfs_dqblk() to directly use __put_user() instead of first
formatting proper structure on stack and then using copy_to_user(). However
I'm not sure whether this will be any real performance win and using
copy_to_user() seems easier to me...
Anyway I'll probably try changing number of copies to two and see whether
there's any measurable impact.
> > @@ -277,10 +287,73 @@ static int quota_getxstatev(struct super_block *sb, void __user *addr)
> > return ret;
> > }
> >
> > +/* XFS blocks to space conversion */
> > +static u64 xfs_btos(u64 blocks)
> > +{
> > + return blocks << 9;
> > +}
> > +
> > +/* Space to XFS blocks conversion */
> > +static u64 xfs_stob(u64 space)
> > +{
> > + return (space + 511) >> 9;
> > +}
>
> Which is just redefining BBTOB() [Basic Blocks to Bytes] and BTOBB()
> with magic numbers and no explanation of what a "block" actually is.
> We already export those macros to userspace, and they are used by
> xfs_quota to convert the block counts to bytes....
Sure, I will use XFS macros.
Honza
--
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists