Message-ID: <a8bc6938-84d9-42d6-9928-7cdd13e3a4c8@linux.alibaba.com>
Date: Thu, 8 Jan 2026 17:31:40 +0800
From: Gao Xiang <hsiangkao@...ux.alibaba.com>
To: Zhiguo Niu <niuzhiguo84@...il.com>
Cc: linux-erofs@...ts.ozlabs.org, LKML <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Dusty Mabe <dusty@...tymabe.com>, Timothée Ravier
<tim@...sm.fr>, Alekséi Naidénov <an@...italtide.io>,
Amir Goldstein <amir73il@...il.com>, Alexander Larsson <alexl@...hat.com>,
Christian Brauner <brauner@...nel.org>, Miklos Szeredi
<mszeredi@...hat.com>, Sheng Yong <shengyong1@...omi.com>
Subject: Re: [PATCH v3 RESEND] erofs: don't bother with s_stack_depth
increasing for now
On 2026/1/8 17:28, Zhiguo Niu wrote:
> Gao Xiang <hsiangkao@...ux.alibaba.com> wrote on Thu, Jan 8, 2026 at 11:07:
>>
>> Previously, commit d53cd891f0e4 ("erofs: limit the level of fs stacking
>> for file-backed mounts") bumped `s_stack_depth` by one to avoid kernel
>> stack overflow when stacking an unlimited number of EROFS on top of
>> each other.
>>
>> This fix breaks composefs mounts, which need EROFS+ovl^2 sometimes
>> (and such setups are already used in production for quite a long time).
>>
>> One way to fix this regression is to bump FILESYSTEM_MAX_STACK_DEPTH
>> from 2 to 3, but proving that this is safe in general is a high bar.
>>
>> After a long discussion on GitHub issues [1] about possible solutions,
>> one conclusion is that there is no need to support nesting file-backed
>> EROFS mounts on stacked filesystems, because there is always the option
>> to use loopback devices as a fallback.
>>
>> As a quick fix for the composefs regression for this cycle, instead of
>> bumping `s_stack_depth` for file backed EROFS mounts, we disallow
>> nesting file-backed EROFS over EROFS and over filesystems with
>> `s_stack_depth` > 0.
>>
>> This works for all known file-backed mount use cases (composefs,
>> containerd, and Android APEX for some Android vendors), and the fix is
>> self-contained.
>>
>> Essentially, we are allowing one extra unaccounted fs stacking level of
>> EROFS below stacking filesystems, but EROFS can only be used in the read
>> path (i.e. overlayfs lower layers), which typically has much lower stack
>> usage than the write path.
>>
>> We can consider increasing FILESYSTEM_MAX_STACK_DEPTH later, after more
>> stack usage analysis or using alternative approaches, such as splitting
>> the `s_stack_depth` limitation according to different combinations of
>> stacking.
>>
>> Fixes: d53cd891f0e4 ("erofs: limit the level of fs stacking for file-backed mounts")
>> Reported-and-tested-by: Dusty Mabe <dusty@...tymabe.com>
>> Reported-by: Timothée Ravier <tim@...sm.fr>
>> Closes: https://github.com/coreos/fedora-coreos-tracker/issues/2087 [1]
>> Reported-by: "Alekséi Naidénov" <an@...italtide.io>
>> Closes: https://lore.kernel.org/r/CAFHtUiYv4+=+JP_-JjARWjo6OwcvBj1wtYN=z0QXwCpec9sXtg@mail.gmail.com
>> Acked-by: Amir Goldstein <amir73il@...il.com>
>> Acked-by: Alexander Larsson <alexl@...hat.com>
>> Cc: Christian Brauner <brauner@...nel.org>
>> Cc: Miklos Szeredi <mszeredi@...hat.com>
>> Cc: Sheng Yong <shengyong1@...omi.com>
>> Cc: Zhiguo Niu <niuzhiguo84@...il.com>
>> Signed-off-by: Gao Xiang <hsiangkao@...ux.alibaba.com>
>> ---
>> v2->v3 RESEND:
>> - Exclude bdev-backed EROFS mounts since it will be a real terminal fs
>> as pointed out by Sheng Yong (APEX will rely on this);
>>
>> - Preserve previous "Acked-by:" and "Tested-by:" since it's trivial.
>>
>> fs/erofs/super.c | 19 +++++++++++++------
>> 1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
>> index 937a215f626c..5136cda5972a 100644
>> --- a/fs/erofs/super.c
>> +++ b/fs/erofs/super.c
>> @@ -644,14 +644,21 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
>> * fs contexts (including its own) due to self-controlled RO
>> * accesses/contexts and no side-effect changes that need to
>> * context save & restore so it can reuse the current thread
>> - * context. However, it still needs to bump `s_stack_depth` to
>> - * avoid kernel stack overflow from nested filesystems.
>> + * context.
>> + * However, we still need to prevent kernel stack overflow due
>> + * to filesystem nesting: just ensure that s_stack_depth is 0
>> + * to disallow mounting EROFS on stacked filesystems.
>> + * Note: s_stack_depth is not incremented here for now, since
>> + * EROFS is the only fs supporting file-backed mounts for now.
>> + * It MUST change if another fs plans to support them, which
>> + * may also require adjusting FILESYSTEM_MAX_STACK_DEPTH.
>> */
>> if (erofs_is_fileio_mode(sbi)) {
>> - sb->s_stack_depth =
>> - file_inode(sbi->dif0.file)->i_sb->s_stack_depth + 1;
>> - if (sb->s_stack_depth > FILESYSTEM_MAX_STACK_DEPTH) {
>> - erofs_err(sb, "maximum fs stacking depth exceeded");
>> + inode = file_inode(sbi->dif0.file);
>> + if ((inode->i_sb->s_op == &erofs_sops &&
>> + !inode->i_sb->s_bdev) ||
>> + inode->i_sb->s_stack_depth) {
>> + erofs_err(sb, "file-backed mounts cannot be applied to stacked fses");
> Hi Xiang
> Do we need to print s_stack_depth here to distinguish which specific
> problem case it is?
.. I don't want to complicate it (since it's just a short-term
solution and EROFS is unaccounted, so s_stack_depth really
means nothing) unless it's really needed for Android vendors?
> Otherwise LGTM based on my basic test, so
>
> Reviewed-by: Zhiguo Niu <zhiguo.niu@...soc.com>
Thanks for this too.
Thanks,
Gao Xiang
> Thanks!
>> return -ENOTBLK;
>> }
>> }
>> --
>> 2.43.5
>>
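For readers following the thread, the condition in the hunk above can be
modeled in user space. This is only a rough sketch: `struct model_sb` and
`fileio_mount_allowed()` are hypothetical names made up for illustration,
not kernel API, and the three fields merely stand in for the `s_op`,
`s_bdev` and `s_stack_depth` tests that the patch performs on the backing
file's superblock.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space stand-in for the superblock of the filesystem
 * that holds the backing file. Only the fields read by the check exist.
 */
struct model_sb {
	bool is_erofs;      /* models inode->i_sb->s_op == &erofs_sops */
	bool has_bdev;      /* models inode->i_sb->s_bdev != NULL */
	int  s_stack_depth; /* models inode->i_sb->s_stack_depth */
};

/*
 * Returns true if a file-backed EROFS mount would pass the check when its
 * backing file lives on `backing`, false where the patch returns -ENOTBLK.
 */
static bool fileio_mount_allowed(const struct model_sb *backing)
{
	/* Backing file sits on another file-backed EROFS: rejected. */
	if (backing->is_erofs && !backing->has_bdev)
		return false;
	/* Backing file sits on a stacked fs (s_stack_depth > 0): rejected. */
	if (backing->s_stack_depth)
		return false;
	/* Plain terminal fs, including bdev-backed EROFS: accepted. */
	return true;
}
```

Under this model, a backing file on a non-stacked filesystem or on a
bdev-backed EROFS is accepted, while one on a file-backed EROFS or on any
filesystem with a nonzero `s_stack_depth` is refused.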