Message-ID: <20220331212152.GG1544202@dread.disaster.area>
Date:   Fri, 1 Apr 2022 08:21:52 +1100
From:   Dave Chinner <david@...morbit.com>
To:     wang.yi59@....com.cn
Cc:     djwong@...nel.org, linux-xfs@...r.kernel.org,
        linux-kernel@...r.kernel.org, xue.zhihong@....com.cn,
        wang.liang82@....com.cn, cheng.lin130@....com.cn
Subject: Re: [PATCH] xfs: getattr ignore blocks beyond eof

On Thu, Mar 31, 2022 at 04:32:07PM +0800, wang.yi59@....com.cn wrote:
> > We do not, and have not ever tried to, hide allocation or block
> > usage artifacts from userspace because any application that depends
> > on specific block allocation patterns or accounting from the
> > filesystem is broken by design.
> >
> > Every filesystem accounts blocks differently, and more often than
> > not the block count exposed to userspace also includes metadata
> > blocks (extent maps, xattr blocks, etc) and it might count other
> > blocks multiple times (e.g. shared extents). Hence you can't actually
> > use it for anything useful in userspace except reporting how many
> > blocks this file *might* use.
> >
> > If your application is dependent on block counts exactly matching
> > the file data space for whatever reason, then what speculative
> > preallocation does is the least of your problems.
> >
> 
> Thanks for your explanation.
> 
> Unfortunately, the app I'm using evaluates disk usage by querying
> the block count of the file on the backend filesystem (XFS) before
> and after the operation.

What application is this?

What is it trying to use this information for?

I'm trying to understand why someone thought this was a good idea,
and without actually being able to look up the code and see what it
is using the information for, I can't really say much more than
"this seems broken by design".

> Without giving up the benefits of preallocation, the
> app's statistics will become stale, with no cheap way to correct
> them, because of the silent reclaim of post-EOF blocks.
> That is the app's problem.

Yes it is.

> Posteof blocks will be reclaimed sooner or later, it seems reasonable

No, that is not guaranteed. If you then extend the file again, those
post eof blocks will no longer be post-eof blocks and instead
contain user data. Also, fallocate() can allocate post-eof blocks,
and in that case they can be retained permanently because the user
asked them to be placed beyond EOF.

So the assertion that post-eof blocks always get removed sooner or
later is not actually true.
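
A minimal sketch of the fallocate() case, assuming Python with ctypes
on 64-bit Linux. The helper name preallocate_beyond_eof is
hypothetical, and the call can fail on filesystems that do not support
FALLOC_FL_KEEP_SIZE, in which case the helper returns None:

```python
import ctypes
import os
import sys

FALLOC_FL_KEEP_SIZE = 0x01  # value from <linux/falloc.h>


def preallocate_beyond_eof(path, data_len=4096, extra=1 << 20):
    """Write data_len bytes, then request `extra` bytes of allocation past EOF.

    Returns (st_size, st_blocks) after the call, or None if fallocate()
    with FALLOC_FL_KEEP_SIZE is unavailable or fails here.
    """
    # Assumption: 64-bit Linux, where off_t is a 64-bit integer.
    if not sys.platform.startswith("linux") or ctypes.sizeof(ctypes.c_long) != 8:
        return None
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        fallocate = libc.fallocate
    except (OSError, AttributeError):
        return None
    fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                          ctypes.c_longlong, ctypes.c_longlong]
    fallocate.restype = ctypes.c_int

    with open(path, "wb") as f:
        f.write(b"x" * data_len)
        f.flush()
        # Allocate blocks starting at the current EOF without changing st_size.
        if fallocate(f.fileno(), FALLOC_FL_KEEP_SIZE, data_len, extra) != 0:
            return None  # e.g. EOPNOTSUPP on this filesystem
        os.fsync(f.fileno())

    st = os.stat(path)
    return st.st_size, st.st_blocks
```

When the call succeeds, st_size stays at data_len while st_blocks
accounts for the extra allocation. How many blocks are reported is up
to the filesystem.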

> to ignore them directly during query. This is my humble opinion in
> this patch. At the query moment, it's not real, but it will become so
> eventually. It's a speculative result for query.

No, it's the _correct_ result for the current state of the file
being queried. The statx() man page says:

st_blocks
     This field indicates the number of blocks allocated to the
     file, in 512-byte units.  (This may be smaller than st_size/512
     when the file has holes.)

The POSIX specification just defines it as "Number of blocks
allocated for this object."

Neither says anything about how the filesystem should or shouldn't
account those blocks, that it must be stable, that it must reflect
the amount of data written to the file, etc. All they say is that
it is the number of blocks allocated for that file.
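
A rough sketch of how that field looks from userspace, assuming Python
on a Unix-like system where os.stat() exposes st_blocks. The helper
names block_usage and make_sparse_file are hypothetical:

```python
import os


def block_usage(path):
    """Return the file size and the allocation reported by the filesystem."""
    st = os.stat(path)
    return {
        "size": st.st_size,                # logical file size in bytes
        "blocks": st.st_blocks,            # allocated blocks, 512-byte units
        "allocated": st.st_blocks * 512,   # the same figure in bytes
    }


def make_sparse_file(path, size=1 << 20):
    """Create a file of `size` bytes without writing any data to it."""
    with open(path, "wb") as f:
        f.truncate(size)
```

Calling block_usage() on a file made by make_sparse_file() will
usually show "allocated" well below "size", while a file with
preallocated or post-EOF blocks can show the opposite. Neither number
can be derived from the other, and the reported count may change
between two calls without the application touching the file.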

As it is, hiding space usage like you propose is likely to cause
more problems than it solves, because now du will not report all
the disk space used by a file and hence we'll end up with other
users reporting that the disk space reported by du does not match up
with the space the filesystem is using. Which, of course, is also
expected, because reflink/dedupe result in du multiple counting
shared blocks.

IOWs, userspace tracking and aggregation of filesystem space usage
just doesn't work, and so papering over behaviours that expose the
fact it doesn't and can't work is in no one's best interests.

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
