Message-ID: <fureginotssirocugn3aznor4vhbpadhwy7fhaxzeullhrzp7y@bg5gzdv6mrif>
Date: Mon, 8 Sep 2025 15:54:38 +0200
From: Jan Kara <jack@...e.cz>
To: Joseph Qi <joseph.qi@...ux.alibaba.com>
Cc: Jan Kara <jack@...e.cz>, Mateusz Guzik <mjguzik@...il.com>, 
	Mark Tinguely <mark.tinguely@...cle.com>, ocfs2-devel@...ts.linux.dev, viro@...iv.linux.org.uk, 
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org, josef@...icpanda.com, 
	jlbec@...lplan.org, mark@...heh.com, brauner@...nel.org, willy@...radead.org, 
	david@...morbit.com
Subject: Re: [External] : [PATCH] ocfs2: retire ocfs2_drop_inode() and
 I_WILL_FREE usage

On Mon 08-09-25 20:41:21, Joseph Qi wrote:
> 
> 
> On 2025/9/8 18:23, Jan Kara wrote:
> > On Mon 08-09-25 09:51:36, Joseph Qi wrote:
> >> On 2025/9/5 00:22, Mateusz Guzik wrote:
> >>> On Thu, Sep 4, 2025 at 6:15 PM Mark Tinguely <mark.tinguely@...cle.com> wrote:
> >>>>
> >>>> On 9/4/25 10:42 AM, Mateusz Guzik wrote:
> >>>>> This postpones the writeout to ocfs2_evict_inode(), which I'm told is
> >>>>> fine (tm).
> >>>>>
> >>>>> The intent is to retire the I_WILL_FREE flag.
> >>>>>
> >>>>> Signed-off-by: Mateusz Guzik <mjguzik@...il.com>
> >>>>> ---
> >>>>>
> >>>>> ACHTUNG: only compile-time tested. Need an ocfs2 person to ack it.
> >>>>>
> >>>>> btw grep shows comments referencing ocfs2_drop_inode() which are already
> >>>>> stale on the stock kernel, I opted to not touch them.
> >>>>>
> >>>>> This ties into an effort to remove the I_WILL_FREE flag, unblocking
> >>>>> other work. If accepted would be probably best taken through vfs
> >>>>> branches with said work, see https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=vfs-6.18.inode.refcount.preliminaries
> >>>>>
> >>>>>   fs/ocfs2/inode.c       | 23 ++---------------------
> >>>>>   fs/ocfs2/inode.h       |  1 -
> >>>>>   fs/ocfs2/ocfs2_trace.h |  2 --
> >>>>>   fs/ocfs2/super.c       |  2 +-
> >>>>>   4 files changed, 3 insertions(+), 25 deletions(-)
> >>>>>
> >>>>> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
> >>>>> index 6c4f78f473fb..5f4a2cbc505d 100644
> >>>>> --- a/fs/ocfs2/inode.c
> >>>>> +++ b/fs/ocfs2/inode.c
> >>>>> @@ -1290,6 +1290,8 @@ static void ocfs2_clear_inode(struct inode *inode)
> >>>>>
> >>>>>   void ocfs2_evict_inode(struct inode *inode)
> >>>>>   {
> >>>>> +     write_inode_now(inode, 1);
> >>>>> +
> >>>>>       if (!inode->i_nlink ||
> >>>>>           (OCFS2_I(inode)->ip_flags & OCFS2_INODE_MAYBE_ORPHANED)) {
> >>>>>               ocfs2_delete_inode(inode);
> >>>>> @@ -1299,27 +1301,6 @@ void ocfs2_evict_inode(struct inode *inode)
> >>>>>       ocfs2_clear_inode(inode);
> >>>>>   }
> >>>>>
> >>>>> -/* Called under inode_lock, with no more references on the
> >>>>> - * struct inode, so it's safe here to check the flags field
> >>>>> - * and to manipulate i_nlink without any other locks. */
> >>>>> -int ocfs2_drop_inode(struct inode *inode)
> >>>>> -{
> >>>>> -     struct ocfs2_inode_info *oi = OCFS2_I(inode);
> >>>>> -
> >>>>> -     trace_ocfs2_drop_inode((unsigned long long)oi->ip_blkno,
> >>>>> -                             inode->i_nlink, oi->ip_flags);
> >>>>> -
> >>>>> -     assert_spin_locked(&inode->i_lock);
> >>>>> -     inode->i_state |= I_WILL_FREE;
> >>>>> -     spin_unlock(&inode->i_lock);
> >>>>> -     write_inode_now(inode, 1);
> >>>>> -     spin_lock(&inode->i_lock);
> >>>>> -     WARN_ON(inode->i_state & I_NEW);
> >>>>> -     inode->i_state &= ~I_WILL_FREE;
> >>>>> -
> >>>>> -     return 1;
> >>>>> -}
> >>>>> -
> >>>>>   /*
> >>>>>    * This is called from our getattr.
> >>>>>    */
> >>>>> diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
> >>>>> index accf03d4765e..07bd838e7843 100644
> >>>>> --- a/fs/ocfs2/inode.h
> >>>>> +++ b/fs/ocfs2/inode.h
> >>>>> @@ -116,7 +116,6 @@ static inline struct ocfs2_caching_info *INODE_CACHE(struct inode *inode)
> >>>>>   }
> >>>>>
> >>>>>   void ocfs2_evict_inode(struct inode *inode);
> >>>>> -int ocfs2_drop_inode(struct inode *inode);
> >>>>>
> >>>>>   /* Flags for ocfs2_iget() */
> >>>>>   #define OCFS2_FI_FLAG_SYSFILE               0x1
> >>>>> diff --git a/fs/ocfs2/ocfs2_trace.h b/fs/ocfs2/ocfs2_trace.h
> >>>>> index 54ed1495de9a..4b32fb5658ad 100644
> >>>>> --- a/fs/ocfs2/ocfs2_trace.h
> >>>>> +++ b/fs/ocfs2/ocfs2_trace.h
> >>>>> @@ -1569,8 +1569,6 @@ DEFINE_OCFS2_ULL_ULL_UINT_EVENT(ocfs2_delete_inode);
> >>>>>
> >>>>>   DEFINE_OCFS2_ULL_UINT_EVENT(ocfs2_clear_inode);
> >>>>>
> >>>>> -DEFINE_OCFS2_ULL_UINT_UINT_EVENT(ocfs2_drop_inode);
> >>>>> -
> >>>>>   TRACE_EVENT(ocfs2_inode_revalidate,
> >>>>>       TP_PROTO(void *inode, unsigned long long ino,
> >>>>>                unsigned int flags),
> >>>>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> >>>>> index 53daa4482406..e4b0d25f4869 100644
> >>>>> --- a/fs/ocfs2/super.c
> >>>>> +++ b/fs/ocfs2/super.c
> >>>>> @@ -129,7 +129,7 @@ static const struct super_operations ocfs2_sops = {
> >>>>>       .statfs         = ocfs2_statfs,
> >>>>>       .alloc_inode    = ocfs2_alloc_inode,
> >>>>>       .free_inode     = ocfs2_free_inode,
> >>>>> -     .drop_inode     = ocfs2_drop_inode,
> >>>>> +     .drop_inode     = generic_delete_inode,
> >>>>>       .evict_inode    = ocfs2_evict_inode,
> >>>>>       .sync_fs        = ocfs2_sync_fs,
> >>>>>       .put_super      = ocfs2_put_super,
> >>>>
> >>>>
> >>>> I agree, filesystems should not use I_FREEING/I_WILL_FREE.
> >>>> Doing the sync write_inode_now() should be fine in ocfs2_evict_inode().
> >>>>
> >>>> Question is ocfs2_drop_inode(). In commit 513e2dae9422:
> >>>>   ocfs2: flush inode data to disk and free inode when i_count becomes zero
> >>>> the return of 1 drops immediately to fix a memory caching issue.
> >>>> Shouldn't .drop_inode() still return 1?
> >>>
> >>> generic_delete_inode is a stub doing just that.
> >>>
> >> In case of "drop = 0", it may return directly without calling evict().
> >> This seems to break the expectation of commit 513e2dae9422.
> > 
> > generic_delete_inode() always returns 1 so evict() will be called.
> > ocfs2_drop_inode() always returns 1 as well after 513e2dae9422. So I'm not
> > sure which case of "drop = 0" do you see...
> > 
> I don't see a real case, just in theory.
> As I described before, if we make sure write_inode_now() will be called
> in iput_final(), it would be fine.

I'm sorry but I still don't quite understand what you are proposing. If
->drop_inode() returns 1, the filesystem wants to remove the inode from
cache (perhaps because it was deleted), hence iput_final() doesn't bother
with writing out such inodes. This doesn't work well with ocfs2 wanting to
always drop inodes, so ocfs2 needs to write the inode itself in
ocfs2_evict_inode(). Perhaps you have some modification to iput_final() in
mind, but I'm not sure how that would work. Can you suggest a patch if you
think iput_final() should work differently? Thanks!
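
To make the flow above concrete, here is a rough userspace model of the
decision iput_final() makes. It is only a sketch, not kernel code: struct
toy_inode and every toy_* function are hypothetical stand-ins, and locking,
I_DONTCACHE, and the inode state flags are left out. It shows that the VFS
writes the inode out only when ->drop_inode() returned 0, so a filesystem
that always returns 1 has to do the writeout in its own evict method.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct inode; hypothetical, NOT the kernel structure. */
struct toy_inode {
	bool dirty;	/* has changes not yet written out */
	bool sb_active;	/* models SB_ACTIVE on the superblock */
	bool written;	/* set once a writeout actually happened */
	bool evicted;	/* set once the evict step ran */
	bool cached;	/* left in the inode cache instead of evicted */
};

typedef int (*toy_drop_fn)(struct toy_inode *);
typedef void (*toy_evict_fn)(struct toy_inode *);

/* Models write_inode_now(inode, 1): synchronously flush a dirty inode. */
static void toy_write_inode_now(struct toy_inode *inode)
{
	if (inode->dirty) {
		inode->dirty = false;
		inode->written = true;
	}
}

/* Models generic_delete_inode(): always ask for the inode to be dropped. */
static int toy_always_drop(struct toy_inode *inode)
{
	(void)inode;
	return 1;
}

/* Models a ->drop_inode() that is happy to keep the inode cached. */
static int toy_never_drop(struct toy_inode *inode)
{
	(void)inode;
	return 0;
}

/* Evict method that does no writeout of its own. */
static void toy_evict_plain(struct toy_inode *inode)
{
	inode->evicted = true;
}

/* Evict method that writes the inode first, as the patch does for ocfs2. */
static void toy_evict_with_writeout(struct toy_inode *inode)
{
	toy_write_inode_now(inode);
	inode->evicted = true;
}

/* Simplified model of the branch structure in iput_final(). */
static void toy_iput_final(struct toy_inode *inode, toy_drop_fn drop,
			   toy_evict_fn evict)
{
	int d = drop(inode);

	if (!d && inode->sb_active) {
		/* Keep the inode cached; nothing is evicted. */
		inode->cached = true;
		return;
	}

	/* The VFS writes the inode out only if it was NOT asked to drop it. */
	if (!d)
		toy_write_inode_now(inode);

	evict(inode);
}
```

With toy_always_drop() and toy_evict_plain(), a dirty inode gets evicted
without ever being written, which is the situation ocfs2 has to avoid.
Swapping in toy_evict_with_writeout() gets the writeout back without any
change to toy_iput_final().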

								Honza

-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
