[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220714150000.hkq5d435rxdcz5jy@riteshh-domain>
Date: Thu, 14 Jul 2022 20:30:00 +0530
From: Ritesh Harjani <ritesh.list@...il.com>
To: Jan Kara <jack@...e.cz>
Cc: Ted Tso <tytso@....edu>, linux-ext4@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH 02/10] mbcache: Add functions to delete entry if unused
On 22/07/14 04:49PM, Jan Kara wrote:
> On Thu 14-07-22 17:45:32, Ritesh Harjani wrote:
> > On 22/07/12 12:54PM, Jan Kara wrote:
> > > Add function mb_cache_entry_delete_or_get() to delete mbcache entry if
> > > it is unused and also add a function to wait for entry to become unused
> > > - mb_cache_entry_wait_unused(). We do not share code between the two
> > > deleting function as one of them will go away soon.
> > >
> > > CC: stable@...r.kernel.org
> > > Fixes: 82939d7999df ("ext4: convert to mbcache2")
> > > Signed-off-by: Jan Kara <jack@...e.cz>
> > > ---
> > > fs/mbcache.c | 66 +++++++++++++++++++++++++++++++++++++++--
> > > include/linux/mbcache.h | 10 ++++++-
> > > 2 files changed, 73 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/fs/mbcache.c b/fs/mbcache.c
> > > index cfc28129fb6f..2010bc80a3f2 100644
> > > --- a/fs/mbcache.c
> > > +++ b/fs/mbcache.c
> > > @@ -11,7 +11,7 @@
> > > /*
> > > * Mbcache is a simple key-value store. Keys need not be unique, however
> > > * key-value pairs are expected to be unique (we use this fact in
> > > - * mb_cache_entry_delete()).
> > > + * mb_cache_entry_delete_or_get()).
> > > *
> > > * Ext2 and ext4 use this cache for deduplication of extended attribute blocks.
> > > * Ext4 also uses it for deduplication of xattr values stored in inodes.
> > > @@ -125,6 +125,19 @@ void __mb_cache_entry_free(struct mb_cache_entry *entry)
> > > }
> > > EXPORT_SYMBOL(__mb_cache_entry_free);
> > >
> > > +/*
> > > + * mb_cache_entry_wait_unused - wait to be the last user of the entry
> > > + *
> > > + * @entry - entry to work on
> > > + *
> > > + * Wait to be the last user of the entry.
> > > + */
> > > +void mb_cache_entry_wait_unused(struct mb_cache_entry *entry)
> > > +{
> > > + wait_var_event(&entry->e_refcnt, atomic_read(&entry->e_refcnt) <= 3);
> >
> > It's not very intuitive of why we check for refcnt <= 3.
> > A small note at top of this function might be helpful.
> > IIUC, it is because by default when anyone creates an entry we start with
> > a refcnt of 2 (in mb_cache_entry_create.
> > - Now when the user of the entry wants to delete this, it will try and call
> > mb_cache_entry_delete_or_get(). If during this function call it sees that the
> > refcnt is elevated more than 2, that means there is another user of this entry
> > currently active and hence we should wait before we remove this entry from the
> > cache. So it will take an extra refcnt and return.
> > - So then this caller will call mb_cache_entry_wait_unused() for the refcnt to
> > be <= 3, so that the entry can be deleted.
>
> Correct. I will add a comment as you suggest.
>
> > Quick qn -
> > So now is the design like, ext4_evict_ea_inode() will be waiting indefinitely
> > until the other user of this mb_cache entry releases the reference right?
>
> Correct. Similarly for ext4_xattr_release_block().
>
> > And that will not happen until,
> > - either the shrinker removes this entry from the cache during which we are
> > checking if the refcnt <= 3, then we call a wakeup event
>
> No, shrinker will not touch these entries with active users anymore.
>
> > - Or the user removes/deletes the xattr entry
>
> No. We hold reference to mbcache entry only while we are trying to reuse
> it. So functions ext4_xattr_block_cache_find() and
> ext4_xattr_inode_cache_find() will lookup potential mbcache entry that may
> have the same contents and get reference to it. Then we do comparisons
> verifying whether the contents really matches, if yes, we increment on-disk
> inode/block refcount. Then we drop mbcache entry reference which unblocks
> waiters in mb_cache_entry_wait_unused().
>
ohk, yes. This is where I was a bit confused.
Thanks for explaining it. This makes more sense. I did go through the mbcache
implementation, but I was missing the info on how the callers are using it.
-ritesh
> Honza
>
> --
> Jan Kara <jack@...e.com>
> SUSE Labs, CR
Powered by blists - more mailing lists