Message-ID: <CAPjX3FeNcwQrBofsHppQpua7e7UwbwRAKKaz0KHS9ypEfntTDA@mail.gmail.com>
Date: Fri, 2 May 2025 13:23:15 +0200
From: Daniel Vacek <neelx@...e.com>
To: dsterba@...e.cz
Cc: Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>, David Sterba <dsterba@...e.com>,
linux-btrfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] btrfs: remove extent buffer's redundant `len` member field
On Fri, 2 May 2025 at 12:30, David Sterba <dsterba@...e.cz> wrote:
>
> On Wed, Apr 30, 2025 at 02:31:33PM +0200, Daniel Vacek wrote:
> > On Wed, 30 Apr 2025 at 10:21, Daniel Vacek <neelx@...e.com> wrote:
> > >
> > > On Wed, 30 Apr 2025 at 10:03, David Sterba <dsterba@...e.cz> wrote:
> > > >
> > > > On Tue, Apr 29, 2025 at 05:17:57PM +0200, Daniel Vacek wrote:
> > > > > Even the super block nowadays uses nodesize for eb->len. This is since commits
> > > > >
> > > > > 551561c34663 ("btrfs: don't pass nodesize to __alloc_extent_buffer()")
> > > > > da17066c4047 ("btrfs: pull node/sector/stripe sizes out of root and into fs_info")
> > > > > ce3e69847e3e ("btrfs: sink parameter len to alloc_extent_buffer")
> > > > > a83fffb75d09 ("btrfs: sink blocksize parameter to btrfs_find_create_tree_block")
> > > > >
> > > > > With these the eb->len is not really useful anymore. Let's use the nodesize
> > > > > directly where applicable.
> > > >
> > > > I've had this patch in my local branch for some years from the times we
> > > > were optimizing extent buffer size. The size on release config is 240
> > > > bytes. The goal was to get it under 256 and keep it aligned.
> > > >
> > > > Removing eb->len does not change the structure size and leaves a hole
> > > >
> > > >  struct extent_buffer {
> > > >         u64                        start;                /*     0     8 */
> > > > -       u32                        len;                  /*     8     4 */
> > > > -       u32                        folio_size;           /*    12     4 */
> > > > +       u32                        folio_size;           /*     8     4 */
> > > > +
> > > > +       /* XXX 4 bytes hole, try to pack */
> > > > +
> > > >         long unsigned int          bflags;               /*    16     8 */
> > > >         struct btrfs_fs_info *     fs_info;              /*    24     8 */
> > > >         void *                     addr;                 /*    32     8 */
> > > > @@ -5554,8 +5556,8 @@ struct extent_buffer {
> > > >         struct rw_semaphore        lock;                 /*    72    40 */
> > > >         struct folio *             folios[16];           /*   112   128 */
> > > >
> > > > -       /* size: 240, cachelines: 4, members: 14 */
> > > > -       /* sum members: 238, holes: 1, sum holes: 2 */
> > > > +       /* size: 240, cachelines: 4, members: 13 */
> > > > +       /* sum members: 234, holes: 2, sum holes: 6 */
> > > >         /* forced alignments: 1, forced holes: 1, sum forced holes: 2 */
> > > >         /* last cacheline: 48 bytes */
> > > >  } __attribute__((__aligned__(8)));
> > > >
> > > > The benefit of duplicating the length in each eb is that it's in the
> > > > same cacheline as the other members that are used for offset
> > > > calculations or bit manipulations.
> > > >
> > > > Going to the fs_info->nodesize may or may not hit a cache, also because
> > > > it needs to do 2 pointer dereferences, so from that perspective I think
> > > > it's making it worse.
> > >
> > > I was considering that. Since fs_info is shared by all ebs and by other
> > > stuff like transactions, etc., I think the cache is hot most of the
> > > time and there will hardly be any observable performance difference.
> > > Though without benchmarks this is just speculation (on both sides).
> > >
> > > > I don't think we need to do the optimization right now, but maybe in the
> > > > future if there's a need to add something to eb. Still we can use the
> > > > remaining 16 bytes up to 256 without making things worse.
> > >
> > > This really depends on configuration. On my laptop (Debian -rt kernel)
> > > the eb struct is actually 272 bytes as the rt_mutex is significantly
> > > heavier than raw spin lock. And -rt is a first class citizen nowadays,
> > > often used in Kubernetes deployments like 5G RAN telco, dpdk and such.
> > > I think it would be nice to slim the struct below 256 bytes even there
> > > if that's your aim.
> >
> > Eventually we can get there by using ushort for bflags and moving
> > log_index and folio_shift to fill the hole.
> > Let me know what you think.
>
> The bflags are atomic bits and this requires unsigned long. Also the
I realized that and dropped the idea. That means we can save just 8 bytes
on -rt and nothing on the default config. Still a good win for free, I
mean in case we end up removing the `len` field.
> short int type is something we want to avoid because it's not a natural
> type on many architectures and generates worse code. I don't think we
> need to optimize for RT kernels; RT is now part of the mainline kernel
> but far from a common configuration.
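
For readers following the layout argument quoted above, here is a minimal sketch in userspace C of the two points under discussion: dropping a u32 in front of an 8-byte-aligned member leaves a 4-byte hole rather than shrinking the struct, and reading the length through fs_info costs one more pointer dereference. The names fs_info_sketch, eb_with_len, eb_without_len and the two helper functions are hypothetical stand-ins, not the kernel definitions, and only the first few members are modeled. The bflags member is given forced 8-byte alignment to mirror an unsigned long on a 64-bit ABI.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical stand-in for struct btrfs_fs_info; only nodesize is modeled. */
struct fs_info_sketch {
	uint32_t nodesize;
};

/* Simplified layout that keeps a per-buffer copy of the length. */
struct eb_with_len {
	uint64_t start;                  /* offset  0 */
	uint32_t len;                    /* offset  8 */
	uint32_t folio_size;             /* offset 12 */
	_Alignas(8) uint64_t bflags;     /* offset 16, alignment forced for the sketch */
	struct fs_info_sketch *fs_info;  /* offset 24 */
};

/* Same layout with len removed: folio_size moves up, a hole appears. */
struct eb_without_len {
	uint64_t start;                  /* offset  0 */
	uint32_t folio_size;             /* offset  8 */
	/* 4-byte hole here: bflags still has to start at offset 16 */
	_Alignas(8) uint64_t bflags;     /* offset 16 */
	struct fs_info_sketch *fs_info;  /* offset 24 */
};

/* Removing the member does not change the overall size. */
static_assert(sizeof(struct eb_with_len) == sizeof(struct eb_without_len),
	      "dropping len leaves a hole instead of shrinking the struct");
static_assert(offsetof(struct eb_with_len, folio_size) == 12,
	      "folio_size follows len");
static_assert(offsetof(struct eb_without_len, folio_size) == 8,
	      "folio_size moves into the slot len used");
static_assert(offsetof(struct eb_without_len, bflags) == 16,
	      "bflags stays at offset 16 in both layouts");

/* One load from memory that is already being touched for start/bflags. */
static inline uint32_t eb_len_cached(const struct eb_with_len *eb)
{
	return eb->len;
}

/* Two dependent loads: first the fs_info pointer, then nodesize behind it. */
static inline uint32_t eb_len_via_fs_info(const struct eb_without_len *eb)
{
	return eb->fs_info->nodesize;
}
```

The static assertions are checked at compile time, so simply building the file confirms the layout claims for these stand-in structs. Whether the extra dereference is measurable in practice depends on how hot the fs_info cacheline is, which the sketch does not attempt to show.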