Message-ID: <CAEXW_YTyw24aksUjgOcesEVHe5HjFVyVKCUpbf70yvqF13GrGA@mail.gmail.com>
Date: Thu, 23 Apr 2020 09:21:39 -0400
From: Joel Fernandes <joel@...lfernandes.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Amir Goldstein <amir73il@...il.com>, Jan Kara <jack@...e.cz>,
Linux FS Devel <linux-fsdevel@...r.kernel.org>
Subject: Re: [RFC] fs: Use slab constructor to initialize conn objects in fsnotify
On Thu, Apr 23, 2020 at 9:20 AM Joel Fernandes <joel@...lfernandes.org> wrote:
>
> On Thu, Apr 23, 2020 at 7:40 AM Matthew Wilcox <willy@...radead.org> wrote:
> >
> > On Thu, Apr 23, 2020 at 12:40:50AM -0400, Joel Fernandes (Google) wrote:
> > > While reading the famous slab paper [1], I noticed that the conn->lock
> > > spinlock and conn->list hlist in the fsnotify code are being initialized
> > > during every object allocation. This seems a good fit for the
> > > constructor within the slab to take advantage of the slab design. Move
> > > the initialization to that.
> >
> > The slab paper was written a number of years ago when CPU caches were
> > not as they are today. With this patch, every time you allocate a
> > new page, we dirty the entire page, and then the dirty cachelines will
> > gradually fall out of cache as the other objects on the page are not used
> > immediately. Then, when we actually use one of the objects on the page,
> > we bring those cachelines back in and dirty them again by initialising
> > 'type' and 'obj'. The two stores to initialise lock and list are almost
> > free when done in fsnotify_attach_connector_to_object(), but are costly
> > when done in a slab constructor.
>
> Thanks a lot for this reasoning. Basically, you're saying when a slab
> allocates a page, it would construct all objects which end up dirtying
> the entire page before the object is even allocated. That makes sense.
>
> There's one improvement (although probably very small) that the paper mentions:
> Also according to the paper you referenced, the instruction cache is

Correcting myself: the paper wasn't referenced by you but by a
colleague :) Apologies for mistyping :)

Thanks,
- Joel

> what would also benefit. Those spinlock and hlist initialization
> instructions wouldn't cost L1 I-cache footprint for every allocation.
>
> > There are very few places where a slab constructor is justified with a
> > modern CPU. We've considered removing the functionality before.
>
> I see, thanks again for the insights.
>
> - Joel
>
> >
> > > @@ -479,8 +479,6 @@ static int fsnotify_attach_connector_to_object(fsnotify_connp_t *connp,
> > > conn = kmem_cache_alloc(fsnotify_mark_connector_cachep, GFP_KERNEL);
> > > if (!conn)
> > > return -ENOMEM;
> > > - spin_lock_init(&conn->lock);
> > > - INIT_HLIST_HEAD(&conn->list);
> > > conn->type = type;
> > > conn->obj = connp;
> > > /* Cache fsid of filesystem containing the object */
> > > --
> > > 2.26.1.301.g55bc3eb7cb9-goog