linux-kernel - Re: [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed & bdev-based mounts

Message-ID: <a78509f0-f333-4a53-a618-2f05a53ff91b@huawei.com>
Date: Tue, 13 May 2025 21:59:58 +0800
From: Hongbo Li <lihongbo22@...wei.com>
To: Sheng Yong <shengyong2021@...il.com>, <xiang@...nel.org>,
	<chao@...nel.org>, <zbestahu@...il.com>, <jefflexu@...ux.alibaba.com>,
	<dhavale@...gle.com>
CC: <linux-erofs@...ts.ozlabs.org>, <linux-kernel@...r.kernel.org>, Sheng Yong
	<shengyong1@...omi.com>, Wang Shuai <wangshuai12@...omi.com>
Subject: Re: [PATCH v5 1/2] erofs: add 'fsoffset' mount option for file-backed
 & bdev-based mounts



On 2025/5/13 21:56, Hongbo Li wrote:
> 
> 
> On 2025/5/13 19:34, Sheng Yong wrote:
>> From: Sheng Yong <shengyong1@...omi.com>
>>
>> When attempting to use an archive file, such as APEX on Android,
>> as a file-backed mount source, it fails because the EROFS image within
>> the archive file does not start at offset 0. As a result, a loop
>> device is still needed to attach the image file at an appropriate
>> offset first. Similarly, if an EROFS image within a block device
>> does not start at offset 0, it cannot be mounted directly either.
>>
>> To address this issue, this patch adds a new mount option `fsoffset=x'
>> to accept a start offset for both file-backed and bdev-based mounts.
>> The offset should be aligned to block size. EROFS will add this offset
>> before performing read requests.
>>
>> Signed-off-by: Sheng Yong <shengyong1@...omi.com>
>> Signed-off-by: Wang Shuai <wangshuai12@...omi.com>
>> ---
>>   Documentation/filesystems/erofs.rst |  1 +
>>   fs/erofs/data.c                     |  8 ++++++--
>>   fs/erofs/fileio.c                   |  3 ++-
>>   fs/erofs/internal.h                 |  2 ++
>>   fs/erofs/super.c                    | 12 +++++++++++-
>>   fs/erofs/zdata.c                    |  3 ++-
>>   6 files changed, 24 insertions(+), 5 deletions(-)
>> ---
>> v5: * fix fsoffset on multiple devices by adding off when creating io
>>        request; erofs_map_device selects the target device, and only
>>        the primary device has an off
>>      * remove unnecessary checks of fsoffset value
>>      * try to combine off and dax_part_off, but it is not easy to do
>>        that, because dax_part_off is not needed when reading metadata
>>
>> v4: * change mount option `offset=x' to `fsoffset=x'
>> https://lore.kernel.org/linux-erofs/c5110e03-90ea-40be-b05f-bc920332a1e1@linux.alibaba.com
>>
>> v3: * rename `offs' to `off'
>>      * parse offset using fsparam_u64 and validate it in fill_super
>>      * update bi_sector inline
>> https://lore.kernel.org/linux-erofs/98585dd8-d0b6-4000-b46d-a08c64eae44d@linux.alibaba.com
>>
>> v2: * add a new mount option `offset=X' for start offset, and offset
>>         should be aligned to PAGE_SIZE
>>      * add start offset for both file-backed and bdev-based mounts
>> https://lore.kernel.org/linux-erofs/0725c2ec-528c-42a8-9557-4713e7e35153@linux.alibaba.com
>>
>> v1: https://lore.kernel.org/all/20250324022849.2715578-1-shengyong1@xiaomi.com/
>>
>> diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst
>> index c293f8e37468..0fa4c7826203 100644
>> --- a/Documentation/filesystems/erofs.rst
>> +++ b/Documentation/filesystems/erofs.rst
>> @@ -128,6 +128,7 @@ device=%s              Specify a path to an extra device to be used together.
>>   fsid=%s                Specify a filesystem image ID for Fscache back-end.
>>   domain_id=%s           Specify a domain ID in fscache mode so that different images
>>                          with the same blobs under a given domain ID can share storage.
>> +fsoffset=%s            Specify image offset for file-backed or bdev-based mounts.
Hi, Yong

Should fsoffset be formatted with %lu?

Thanks,
Hongbo
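
As a side note on the format question, here is a minimal userspace sketch of printing a 64-bit offset as a mount option string. The helper name format_fsoffset_opt is hypothetical and not part of the patch. It assumes the value is treated as an unsigned 64-bit quantity. In kernel code a u64 is printed with %llu, while the patch as posted stores the offset in a loff_t and prints it with %lld.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): write ",fsoffset=<n>" into buf
 * the way a show_options callback might, but only when the offset is non-zero.
 * Returns the number of characters written, or 0 if nothing was emitted.
 */
static int format_fsoffset_opt(char *buf, size_t len, uint64_t off)
{
	if (!off) {
		if (len)
			buf[0] = '\0';
		return 0;
	}
	/* PRIu64 is the portable userspace spelling; kernel code would use %llu. */
	return snprintf(buf, len, ",fsoffset=%" PRIu64, off);
}
```

Calling format_fsoffset_opt(buf, sizeof(buf), 4096) on a sufficiently large buffer leaves ",fsoffset=4096" in buf; an offset of 0 leaves an empty string, matching the patch's behaviour of omitting the option when no offset is set.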

>>   ===================    =========================================================
>>   Sysfs Entries
>> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
>> index 2409d2ab0c28..599a44d5d782 100644
>> --- a/fs/erofs/data.c
>> +++ b/fs/erofs/data.c
>> @@ -27,9 +27,12 @@ void erofs_put_metabuf(struct erofs_buf *buf)
>>   void *erofs_bread(struct erofs_buf *buf, erofs_off_t offset, bool need_kmap)
>>   {
>> -    pgoff_t index = offset >> PAGE_SHIFT;
>> +    pgoff_t index;
>>       struct folio *folio = NULL;
>> +    offset += buf->off;
>> +    index = offset >> PAGE_SHIFT;
>> +
>>       if (buf->page) {
>>           folio = page_folio(buf->page);
>>           if (folio_file_page(folio, index) != buf->page)
>> @@ -54,6 +57,7 @@ void erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb)
>>       struct erofs_sb_info *sbi = EROFS_SB(sb);
>>       buf->file = NULL;
>> +    buf->off = sbi->dif0.off;
>>       if (erofs_is_fileio_mode(sbi)) {
>>           buf->file = sbi->dif0.file;    /* some fs like FUSE needs it */
>>           buf->mapping = buf->file->f_mapping;
>> @@ -299,7 +303,7 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
>>           iomap->private = buf.base;
>>       } else {
>>           iomap->type = IOMAP_MAPPED;
>> -        iomap->addr = mdev.m_pa;
>> +        iomap->addr = mdev.m_dif->off + mdev.m_pa;
>>           if (flags & IOMAP_DAX)
>>               iomap->addr += mdev.m_dif->dax_part_off;
>>       }
>> diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c
>> index 60c7cc4c105c..a2c7001ff789 100644
>> --- a/fs/erofs/fileio.c
>> +++ b/fs/erofs/fileio.c
>> @@ -147,7 +147,8 @@ static int erofs_fileio_scan_folio(struct erofs_fileio *io, struct folio *folio)
>>                   if (err)
>>                       break;
>>                   io->rq = erofs_fileio_rq_alloc(&io->dev);
>> -                io->rq->bio.bi_iter.bi_sector = io->dev.m_pa >> 9;
>> +                io->rq->bio.bi_iter.bi_sector =
>> +                    (io->dev.m_dif->off + io->dev.m_pa) >> 9;
>>                   attached = 0;
>>               }
>>               if (!bio_add_folio(&io->rq->bio, folio, len, cur))
>> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
>> index 4ac188d5d894..10656bd986bd 100644
>> --- a/fs/erofs/internal.h
>> +++ b/fs/erofs/internal.h
>> @@ -43,6 +43,7 @@ struct erofs_device_info {
>>       char *path;
>>       struct erofs_fscache *fscache;
>>       struct file *file;
>> +    loff_t off;
> 
> Would u64 be better here?
> 
>>       struct dax_device *dax_dev;
>>       u64 dax_part_off;
>> @@ -199,6 +200,7 @@ enum {
>>   struct erofs_buf {
>>       struct address_space *mapping;
>>       struct file *file;
>> +    loff_t off;
> 
> Same here.
> 
>>       struct page *page;
>>       void *base;
>>   };
>> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
>> index da6ee7c39290..512877d7d855 100644
>> --- a/fs/erofs/super.c
>> +++ b/fs/erofs/super.c
>> @@ -356,7 +356,7 @@ static void erofs_default_options(struct erofs_sb_info *sbi)
>>   enum {
>>       Opt_user_xattr, Opt_acl, Opt_cache_strategy, Opt_dax, Opt_dax_enum,
>> -    Opt_device, Opt_fsid, Opt_domain_id, Opt_directio,
>> +    Opt_device, Opt_fsid, Opt_domain_id, Opt_directio, Opt_fsoffset,
>>   };
>>   static const struct constant_table erofs_param_cache_strategy[] = {
>> @@ -383,6 +383,7 @@ static const struct fs_parameter_spec erofs_fs_parameters[] = {
>>       fsparam_string("fsid",        Opt_fsid),
>>       fsparam_string("domain_id",    Opt_domain_id),
>>       fsparam_flag_no("directio",    Opt_directio),
>> +    fsparam_u64("fsoffset",        Opt_fsoffset),
>>       {}
>>   };
>> @@ -506,6 +507,9 @@ static int erofs_fc_parse_param(struct fs_context *fc,
>>           errorfc(fc, "%s option not supported", erofs_fs_parameters[opt].name);
>>   #endif
>>           break;
>> +    case Opt_fsoffset:
>> +        sbi->dif0.off = result.uint_64;
>> +        break;
>>       }
>>       return 0;
>>   }
>> @@ -599,6 +603,10 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
>>                   &sbi->dif0.dax_part_off, NULL, NULL);
>>       }
>> +    if (sbi->dif0.off & ((1 << sbi->blkszbits) - 1))
>> +        return invalfc(fc, "fsoffset %lld not aligned to block size",
>> +                   sbi->dif0.off);
>> +
>>       err = erofs_read_superblock(sb);
>>       if (err)
>>           return err;
>> @@ -947,6 +955,8 @@ static int erofs_show_options(struct seq_file *seq, struct dentry *root)
>>       if (sbi->domain_id)
>>           seq_printf(seq, ",domain_id=%s", sbi->domain_id);
>>   #endif
>> +    if (sbi->dif0.off)
>> +        seq_printf(seq, ",fsoffset=%lld", sbi->dif0.off);
>>       return 0;
>>   }
>> diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
>> index b8e6b76c23d5..4f910d7ffb2f 100644
>> --- a/fs/erofs/zdata.c
>> +++ b/fs/erofs/zdata.c
>> @@ -1707,7 +1707,8 @@ static void z_erofs_submit_queue(struct z_erofs_frontend *f,
>>                       bio = bio_alloc(mdev.m_bdev, BIO_MAX_VECS,
>>                               REQ_OP_READ, GFP_NOIO);
>>                   bio->bi_end_io = z_erofs_endio;
>> -                bio->bi_iter.bi_sector = cur >> 9;
>> +                bio->bi_iter.bi_sector =
>> +                        (mdev.m_dif->off + cur) >> 9;
>>                   bio->bi_private = q[JQ_SUBMIT];
>>                   if (readahead)
>>                       bio->bi_opf |= REQ_RAHEAD;
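
To illustrate the arithmetic the patch relies on, here is a minimal userspace sketch of the block-size alignment check and of the byte-to-sector conversion applied when a read is issued. The helper names fsoffset_is_block_aligned and start_sector are hypothetical and exist only for illustration. The sketch assumes 512-byte sectors (the `>> 9` in the patch) and a block size of at least 512 bytes, so a block-aligned offset is also sector-aligned and the shift loses no bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, as in the ">> 9" above */

/* Hypothetical helper: true if off is a multiple of (1 << blkszbits). */
static bool fsoffset_is_block_aligned(uint64_t off, unsigned int blkszbits)
{
	/* Build the mask from a 64-bit constant so wide shifts stay defined. */
	return (off & ((UINT64_C(1) << blkszbits) - 1)) == 0;
}

/*
 * Hypothetical helper: first sector of a read at physical address pa inside
 * an image that starts fsoffset bytes into the backing file or device.
 */
static uint64_t start_sector(uint64_t fsoffset, uint64_t pa)
{
	return (fsoffset + pa) >> SECTOR_SHIFT;
}
```

For example, with 4 KiB blocks (blkszbits = 12), an fsoffset of 4096 passes the check while 4608 does not, and a read at pa = 8192 with fsoffset = 4096 starts at sector (4096 + 8192) >> 9 = 24.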