Message-ID: <CAKYAXd9YW_UL2uA8anoVCw+a818y5dwtn3xAJJQc=_p32GA=Zw@mail.gmail.com>
Date: Tue, 2 Dec 2025 09:47:17 +0900
From: Namjae Jeon <linkinjeon@...nel.org>
To: Christoph Hellwig <hch@...radead.org>
Cc: viro@...iv.linux.org.uk, brauner@...nel.org, hch@....de, tytso@....edu, 
	willy@...radead.org, jack@...e.cz, djwong@...nel.org, josef@...icpanda.com, 
	sandeen@...deen.net, rgoldwyn@...e.com, xiang@...nel.org, dsterba@...e.com, 
	pali@...nel.org, ebiggers@...nel.org, neil@...wn.name, amir73il@...il.com, 
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, 
	iamjoonsoo.kim@....com, cheol.lee@....com, jay.sim@....com, gunho.lee@....com, 
	Hyunchul Lee <hyc.lee@...il.com>
Subject: Re: [PATCH v2 06/11] ntfsplus: add iomap and address space operations

On Mon, Dec 1, 2025 at 4:35 PM Christoph Hellwig <hch@...radead.org> wrote:
>
> > +#include "ntfs_iomap.h"
> > +
> > +static s64 ntfs_convert_page_index_into_lcn(struct ntfs_volume *vol, struct ntfs_inode *ni,
> > +             unsigned long page_index)
> > +{
> > +     sector_t iblock;
> > +     s64 vcn;
> > +     s64 lcn;
> > +     unsigned char blocksize_bits = vol->sb->s_blocksize_bits;
> > +
> > +     iblock = (s64)page_index << (PAGE_SHIFT - blocksize_bits);
> > +     vcn = (s64)iblock << blocksize_bits >> vol->cluster_size_bits;
>
> I've seen this calculation in quite a few places, should there be a
> generic helper for it?
Okay. I will add it.
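Perhaps something like this (untested sketch; the helper name is just a
placeholder):

static inline s64 ntfs_page_index_to_vcn(struct ntfs_volume *vol,
		unsigned long page_index)
{
	/* The page's byte offset in the file, expressed in clusters. */
	return ((s64)page_index << PAGE_SHIFT) >> vol->cluster_size_bits;
}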
>
> > +struct bio *ntfs_setup_bio(struct ntfs_volume *vol, blk_opf_t opf, s64 lcn,
> > +             unsigned int pg_ofs)
> > +{
> > +     struct bio *bio;
> > +
> > +     bio = bio_alloc(vol->sb->s_bdev, 1, opf, GFP_NOIO);
> > +     if (!bio)
> > +             return NULL;
>
> bio_alloc never returns NULL if it can sleep.
Okay.
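Since GFP_NOIO can sleep here, the NULL check is dead code and the call
site reduces to just (sketch):

	bio = bio_alloc(vol->sb->s_bdev, 1, opf, GFP_NOIO);
	/* bio_alloc() with a sleeping GFP mask never returns NULL. */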
>
> > +     bio->bi_iter.bi_sector = ((lcn << vol->cluster_size_bits) + pg_ofs) >>
> > +             vol->sb->s_blocksize_bits;
>
> With a helper to calculate the sector the ntfs_setup_bio helper becomes
> somewhat questionable.
Okay, I will check it.
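If the generic helper also covers the byte-to-sector conversion,
ntfs_setup_bio() could probably go away and callers would open-code
bio_alloc() plus something like (untested sketch; the name is a
placeholder):

static inline sector_t ntfs_lcn_to_sector(struct ntfs_volume *vol,
		s64 lcn, unsigned int byte_ofs)
{
	/* Convert an LCN plus intra-cluster byte offset to a sector. */
	return (sector_t)(((lcn << vol->cluster_size_bits) + byte_ofs) >>
			vol->sb->s_blocksize_bits);
}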
>
> > +static int ntfs_read_folio(struct file *file, struct folio *folio)
> > +{
> > +     loff_t i_size;
> > +     struct inode *vi;
> > +     struct ntfs_inode *ni;
> > +
> > +     vi = folio->mapping->host;
> > +     i_size = i_size_read(vi);
> > +     /* Is the page fully outside i_size? (truncate in progress) */
> > +     if (unlikely(folio->index >= (i_size + PAGE_SIZE - 1) >>
> > +                     PAGE_SHIFT)) {
> > +             folio_zero_segment(folio, 0, PAGE_SIZE);
> > +             ntfs_debug("Read outside i_size - truncated?");
> > +             folio_mark_uptodate(folio);
> > +             folio_unlock(folio);
> > +             return 0;
> > +     }
>
> iomap should be taking care of this, why do you need the extra
> handling?
This is a leftover from old ntfs, so I will remove it.
>
> > +     /*
> > +      * This can potentially happen because we clear PageUptodate() during
> > +      * ntfs_writepage() of MstProtected() attributes.
> > +      */
> > +     if (folio_test_uptodate(folio)) {
> > +             folio_unlock(folio);
> > +             return 0;
> > +     }
>
> Clearing the folio uptodate flag sounds fairly dangerous, why is that
> done?
This is a leftover from old ntfs; I will check it.
>
> > +static int ntfs_write_mft_block(struct ntfs_inode *ni, struct folio *folio,
> > +             struct writeback_control *wbc)
>
> Just a very high-level comment here with no immediate action needed:
> Is there a really good reason to use the page cache for metadata?
> Our experience with XFS is that a dedicated buffer cache is not only
> much easier to use, but also allows for much better caching.
No special reason; it was to reuse existing infrastructure instead of
a new, complex implementation. NTFS metadata is treated as a file, and
handling it via the folio (page) API lets the driver easily gain
performance benefits such as readahead.
>
> > +static void ntfs_readahead(struct readahead_control *rac)
> > +{
> > +     struct address_space *mapping = rac->mapping;
> > +     struct inode *inode = mapping->host;
> > +     struct ntfs_inode *ni = NTFS_I(inode);
> > +
> > +     if (!NInoNonResident(ni) || NInoCompressed(ni)) {
> > +             /* No readahead for resident and compressed. */
> > +             return;
> > +     }
> > +
> > +     if (NInoMstProtected(ni) &&
> > +         (ni->mft_no == FILE_MFT || ni->mft_no == FILE_MFTMirr))
> > +             return;
>
> Can you comment on why readahead is skipped here?
Okay, I will add it.
>
> > +/**
> > + * ntfs_compressed_aops - address space operations for compressed inodes
> > + */
> > +const struct address_space_operations ntfs_compressed_aops = {
>
> From the code in other patches it looks like ntfs never switches between
> compressed and non-compressed for live inodes?  In that case the
> separate aops should be fine, as switching between them at runtime
> would involve races.  Is the compression policy per-directory?
Non-compressed files can actually be switched to compressed and vice
versa via setxattr at runtime. I will re-check the race handling
around the aops switch. And the compression policy is per-file, not
per-directory.
>
> > +             kaddr = kmap_local_folio(folio, 0);
> > +             offset = (loff_t)idx << PAGE_SHIFT;
> > +             to = min_t(u32, end - offset, PAGE_SIZE);
> > +
> > +             memcpy(buf + buf_off, kaddr + from, to);
> > +             buf_off += to;
> > +             kunmap_local(kaddr);
> > +             folio_put(folio);
> > +     }
>
> Would this be a candidate for memcpy_from_folio?
Right, I will change it.
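With memcpy_from_folio() the kmap/memcpy/kunmap sequence should
collapse to something like (sketch):

		offset = (loff_t)idx << PAGE_SHIFT;
		to = min_t(u32, end - offset, PAGE_SIZE);
		/* memcpy_from_folio() handles the kmap internally. */
		memcpy_from_folio(buf + buf_off, folio, from, to);
		buf_off += to;
		folio_put(folio);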
>
> > +             kaddr = kmap_local_folio(folio, 0);
> > +             offset = (loff_t)idx << PAGE_SHIFT;
> > +             to = min_t(u32, end - offset, PAGE_SIZE);
> > +
> > +             memcpy(kaddr + from, buf + buf_off, to);
> > +             buf_off += to;
> > +             kunmap_local(kaddr);
> > +             folio_mark_uptodate(folio);
> > +             folio_mark_dirty(folio);
>
> And memcpy_to_folio?
Okay, I will change it.
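Same pattern on the write side (sketch):

		/* memcpy_to_folio() also flushes the dcache for us. */
		memcpy_to_folio(folio, from, buf + buf_off, to);
		buf_off += to;
		folio_mark_uptodate(folio);
		folio_mark_dirty(folio);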
>
> > +++ b/fs/ntfsplus/ntfs_iomap.c
>
> Any reason for the ntfs_ prefix here?
No reason; I will rename it to iomap.c.
>
> > +static void ntfs_iomap_put_folio(struct inode *inode, loff_t pos,
> > +             unsigned int len, struct folio *folio)
> > +{
>
> This seems to basically be entirely about extra zeroing.  Can you
> explain why this is needed in a comment?
Okay, I will add a comment for this.
>
> > +static int ntfs_read_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
> > +             unsigned int flags, struct iomap *iomap, struct iomap *srcmap)
> > +{
> > +     struct ntfs_inode *base_ni, *ni = NTFS_I(inode);
> > +     struct ntfs_attr_search_ctx *ctx;
> > +     loff_t i_size;
> > +     u32 attr_len;
> > +     int err = 0;
> > +     char *kattr;
> > +     struct page *ipage;
> > +
> > +     if (NInoNonResident(ni)) {
>
> Can you split the resident and non-resident cases into separate
> helpers to keep this easier to follow?
Okay. I will.
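Roughly along these lines (sketch; the helper names are placeholders):

static int ntfs_read_iomap_begin(struct inode *inode, loff_t offset,
		loff_t length, unsigned int flags, struct iomap *iomap,
		struct iomap *srcmap)
{
	struct ntfs_inode *ni = NTFS_I(inode);

	if (NInoNonResident(ni))
		return ntfs_read_iomap_begin_nonresident(inode, offset,
				length, flags, iomap);
	return ntfs_read_iomap_begin_resident(inode, offset, length, flags,
			iomap);
}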
>
> > +     ipage = alloc_page(__GFP_NOWARN | __GFP_IO | __GFP_ZERO);
> > +     if (!ipage) {
> > +             err = -ENOMEM;
> > +             goto out;
> > +     }
> > +
> > +     memcpy(page_address(ipage), kattr, attr_len);
>
> Is there a reason for this being a page allocation vs a kmalloc
> sized to the inline data?
No reason; I will change it to a kmalloc() sized to the inline data.
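i.e. something like (sketch; the GFP flags are a guess):

	kattr_copy = kmalloc(attr_len, GFP_NOFS | __GFP_NOWARN);
	if (!kattr_copy) {
		err = -ENOMEM;
		goto out;
	}
	/* Only attr_len bytes are valid, no need to zero a whole page. */
	memcpy(kattr_copy, kattr, attr_len);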
>
> > +static int ntfs_buffered_zeroed_clusters(struct inode *vi, s64 vcn)
>
> I think this should be ntfs_buffered_zero_clusters as it
> performs the action?
Okay. I will change it.
>
> Also curious why this can't use the existing iomap zeroing helper?
I will check it.
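If it does work, the buffered case might reduce to a single call along
the lines of (sketch; the exact iomap_zero_range() signature depends on
the kernel version, and ntfs_write_iomap_ops is a placeholder):

	err = iomap_zero_range(vi, (loff_t)vcn << vol->cluster_size_bits,
			vol->cluster_size, NULL, &ntfs_write_iomap_ops);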
>
> > +int ntfs_zeroed_clusters(struct inode *vi, s64 lcn, s64 num)
>
> ntfs_zero_clusters
Okay.
>
> Again curious why we need special zeroing code in the file system.
To prevent reading garbage data after a new cluster allocation, we
must zero out the cluster. The cluster size can be up to 2MB; I will
check whether that is possible through iomap.
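For newly allocated clusters that are not in the page cache, something
like blkdev_issue_zeroout() may also be worth considering (sketch,
untested; whether it fits the allocation path needs checking):

	/* blkdev_issue_zeroout() takes 512-byte sector units. */
	err = blkdev_issue_zeroout(vol->sb->s_bdev,
			(lcn << vol->cluster_size_bits) >> SECTOR_SHIFT,
			(num << vol->cluster_size_bits) >> SECTOR_SHIFT,
			GFP_NOFS, 0);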
>
> > +     if (NInoNonResident(ni)) {
>
> Another case for splitting the resident/non-resident code instead
> of having a giant conditional block that just returns.
Okay. Thanks for your review!
>
