linux-kernel - Re: [PATCH 00/11] PageSlab: eliminate unnecessary compound

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YWbj6+myCposUKk1@cmpxchg.org>
Date:   Wed, 13 Oct 2021 09:49:31 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     linux-mm@...ck.org, Kent Overstreet <kent.overstreet@...il.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Vlastimil Babka <vbabka@...e.cz>,
        Michal Hocko <mhocko@...e.com>, Roman Gushchin <guro@...com>,
        linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH 00/11] PageSlab: eliminate unnecessary compound_head()
 calls

On Wed, Oct 13, 2021 at 04:19:18AM +0100, Matthew Wilcox wrote:
> On Tue, Oct 12, 2021 at 04:30:20PM -0400, Johannes Weiner wrote:
> > On Tue, Oct 12, 2021 at 08:23:26PM +0100, Matthew Wilcox wrote:
> > > On Tue, Oct 12, 2021 at 02:01:37PM -0400, Johannes Weiner wrote:
> > > > PageSlab() currently imposes a compound_head() call on all callsites
> > > > even though only very few situations encounter tailpages. This short
> > > > series bubbles tailpage resolution up to the few sites that need it,
> > > > and eliminates it everywhere else.
> > > > 
> > > > This is a stand-alone improvement. However, it's inspired by Willy's
> > > > patches to split struct slab from struct page. Those patches currently
> > > > resolve a slab object pointer to its struct slab as follows:
> > > > 
> > > > 	slab = virt_to_slab(p);		/* tailpage resolution */
> > > > 	if (slab_test_cache(slab)) {	/* slab or page alloc bypass? */
> > > > 		do_slab_stuff(slab);
> > > > 	} else {
> > > > 		page = (struct page *)slab;
> > > > 		do_page_stuff(page);
> > > > 	}
> > > > 
> > > > which makes struct slab an ambiguous type that needs to self-identify
> > > > with the slab_test_cache() test (which in turn relies on PG_slab in
> > > > the flags field shared between page and slab).
> > > > 
> > > > It would be preferable to do:
> > > > 
> > > > 	page = virt_to_head_page(p);	/* tailpage resolution */
> > > > 	if (PageSlab(page)) {		/* slab or page alloc bypass? */
> > > > 		slab = page_slab(page);
> > > > 		do_slab_stuff(slab);
> > > > 	} else {
> > > > 		do_page_stuff(page);
> > > > 	}
> > > > 
> > > > and leave the ambiguity and the need to self-identify with struct
> > > > page, so that struct slab is a strong and unambiguous type, and a
> > > > non-NULL struct slab encountered in the wild is always a valid object
> > > > without the need to check another dedicated flag for validity first.
> > > > 
> > > > However, because PageSlab() currently implies tailpage resolution,
> > > > writing the virt->slab lookup in the preferred way would add yet more
> > > > unnecessary compound_head() call to the hottest MM paths.
> > > > 
> > > > The page flag helpers should eventually all be weaned off of those
> > > > compound_head() calls for their unnecessary overhead alone. But this
> > > > one in particular is now getting in the way of writing code in the
> > > > preferred manner, and bleeding page ambiguity into the new types that
> > > > are supposed to eliminate specifically that. It's ripe for a cleanup.
> > > 
> > > So what I had in mind was more the patch at the end (which I now realise
> > > is missing the corresponding changes to __ClearPageSlab()).  There is,
> > > however, some weirdness with kfence, which appears to be abusing PageSlab
> > > by setting it on pages which are not slab pages???
> > > 
> > > 	page = virt_to_page(p);
> > > 	if (PageSlab(page)) {		/* slab or page alloc bypass? */
> > > 		slab = page_slab(page);	/* tail page resolution */
> > > 		do_slab_stuff(slab);
> > > 	} else {
> > > 		do_page_stuff(page); /* or possibly compound_head(page) */
> > > 	}
> > 
> > Can you elaborate why you think this would be better?
> > 
> > If the object is sitting in a tailpage, the flag test itself could
> > avoid the compound_head(), but at the end of the day it's the slab or
> > the headpage that needs to be operated on in the fastpaths, and we
> > need to do the compound_head() whether the flag is set or not.
> > 
> > I suppose it could make some debugging checks marginally cheaper?
> > 
> > But OTOH it comes at the cost of the flag setting and clearing loops
> > in the slab allocation path, even when debugging checks are disabled.
> > 
> > And it would further complicate the compound page model by introducing
> > another distinct flag handling scheme (would there be other users for
> > it?). The open-coded loops as a means to ensure flag integrity seem
> > error prone; but creating Set and Clear variants that encapsulate the
> > loops sounds like a move in the wrong direction, given the pain the
> > compound_head() alone in them has already created.
> 
> I see this as a move towards the dynamically allocated future.
> In that future, I think we'll do something like:
> 
> static inline struct slab *alloc_slab_pages(...)
> {
> 	struct page *page;
> 	struct slab *slab = kmalloc(sizeof(struct slab), gfp);
> 	if (!slab)
> 		return -ENOMEM;
> 	... init slab ...
> 	page = palloc(slab, MEM_TYPE_SLAB, node, gfp, order);
> 	if (!page) {
> 		kfree(slab);
> 		return -ENOMEM;
> 	}
> 	slab->virt = page_address(page);
> 
> 	return slab;
> }
> 
> where palloc() does something like:
> 
> 	unsigned long page_desc = mem_type | (unsigned long)mem_desc;
> 
> 	... allocates the struct pages ...
> 
> 	for (i = 0; i < (1 << order); i++)
> 		page[i] = page_desc;
> 	
> 	return page;
> }

Yeah having the page point to the full descriptor makes sense. That's
efficient. And it's very simple, conceptually...

> For today, testing PageSlab on the tail page helps the test proceed
> in parallel with the action.  Looking at slub's kfree() for an example:
> 
>         page = virt_to_head_page(x);
>         if (unlikely(!PageSlab(page))) {
>                 free_nonslab_page(page, object);
>                 return;
>         }
>         slab_free(page->slab_cache, page, object, NULL, 1, _RET_IP_);
> 
> Your proposal is certainly an improvement (since gcc doesn't know
> that compound_head(compound_head(x)) == compound_head(x)), but I
> think checking on the tail page is even better:
> 
> 	page = virt_to_page(x);
> 	if (unlikely(!PageSlab(page))) {
> 		free_nonslab_page(compound_head(page), object);
> 		return;
> 	}
> 	slab = page_slab(page);
> 	slab_free(slab->slab_cache, slab, object, NULL, 1, _RET_IP_);
> 
> The compound_head() parts can proceed in parallel with the check of
> PageSlab().
>
> As far as the cost of setting PageSlab, those cachelines are already
> dirty because we set compound_head on each of those pages already
> (or in the case of freeing, we're about to clear compound_head on
> each of those pages).

... but this is not. I think the performance gains from this would
have to be significant to justify complicating page flags further.