Date:   Thu, 14 Nov 2019 13:40:38 +0000
From:   David Howells <dhowells@...hat.com>
To:     Christoph Hellwig <hch@....de>, Dave Chinner <dchinner@...hat.com>,
        "Theodore Ts'o" <tytso@....edu>
Cc:     dhowells@...hat.com, Alexander Viro <viro@...iv.linux.org.uk>,
        v9fs-developer@...ts.sourceforge.net,
        linux-afs@...ts.infradead.org, linux-cifs@...r.kernel.org,
        linux-cachefs@...hat.com, ceph-devel@...r.kernel.org,
        linux-nfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: How to avoid using bmap in cachefiles -- FS-Cache/CacheFiles rewrite

Hi Christoph,

I've been rewriting cachefiles in the kernel and it now uses kiocbs to do
async direct I/O to/from the cache files - which seems to make a 40-48% speed
improvement.

I've also replaced the internal use of bmap to detect whether data is present
with a separate map.  Relying on bmap is dodgy for a number of reasons, not
least that extent-based filesystems might insert or remove blocks of zeros to
shape the extents better, thereby rendering the metadata information useless
for cachefiles.

But using a separate map has a couple of problems:

 (1) The map is metadata kept outside of the filesystem journal, so coherency
     management is necessary.

 (2) The map gets hard to manage for very large files (I'm using 256KiB
     granules, so 1 bit per granule means a 512-byte map block can span 1GiB)
     and xattrs can be of limited capacity.
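
To make the arithmetic in (2) concrete, here is a minimal sketch of the
granule-to-bit mapping.  It is illustrative only and is not the cachefiles
code: the names (GRANULE_SIZE, map_block_span() and so on) and the bit order
within each byte are assumptions; only the 256KiB granule and 512-byte map
block sizes come from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes taken from the mail; all identifiers here are hypothetical. */
#define GRANULE_SIZE   (256ULL * 1024)        /* 256KiB of file data per bit */
#define MAP_BLOCK_SIZE 512ULL                 /* bytes in one map block */
#define BITS_PER_BLOCK (MAP_BLOCK_SIZE * 8)   /* 4096 granules per map block */

/* File data covered by one map block: 4096 bits * 256KiB = 1GiB. */
static inline uint64_t map_block_span(void)
{
	return BITS_PER_BLOCK * GRANULE_SIZE;
}

/* Index of the granule that contains a given byte offset in the file. */
static inline uint64_t granule_of(uint64_t file_offset)
{
	return file_offset / GRANULE_SIZE;
}

/* Number of 512-byte map blocks needed to cover a file of the given size. */
static inline uint64_t map_blocks_needed(uint64_t file_size)
{
	uint64_t granules = (file_size + GRANULE_SIZE - 1) / GRANULE_SIZE;

	return (granules + BITS_PER_BLOCK - 1) / BITS_PER_BLOCK;
}

/* Record that a granule's data is present (assumed LSB-first bit order). */
static inline void granule_mark_present(uint8_t *map, size_t map_len,
					uint64_t granule)
{
	assert(granule / 8 < map_len);
	map[granule / 8] |= (uint8_t)(1u << (granule % 8));
}

/* Test whether a granule's data has been recorded as present. */
static inline bool granule_is_present(const uint8_t *map, size_t map_len,
				      uint64_t granule)
{
	assert(granule / 8 < map_len);
	return (map[granule / 8] & (1u << (granule % 8))) != 0;
}
```

With these sizes, map_blocks_needed(1ULL << 40) comes to 1024 blocks, so a
1TiB file needs a 512KiB map, which is the kind of size that becomes awkward
to hold in an xattr.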

I seem to remember you said something along the lines of it being possible to
tell the filesystem not to do discarding and insertion of blocks of zeros.  Is
there a generic way to do that?

Also, is it possible to make it so that I can tell an O_DIRECT read to fail
partially or, better, completely if there's no data to be had in part of the
range?  I can see DIO_SKIP_HOLES, but that only seems to affect writes.

Thanks,
David
