Message-ID: <alpine.DEB.2.21.1810121559460.133019@chino.kir.corp.google.com>
Date: Fri, 12 Oct 2018 16:09:54 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
cc: Christoph Lameter <cl@...ux.com>,
Pekka Enberg <penberg@...nel.org>,
Joonsoo Kim <iamjoonsoo.kim@....com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [patch] mm, slab: avoid high-order slab pages when it does not
reduce waste
On Fri, 12 Oct 2018, Andrew Morton wrote:
> > The slab allocator has a heuristic that checks whether the internal
> > fragmentation is satisfactory and, if not, increases cachep->gfporder to
> > try to improve this.
> >
> > If the amount of waste is the same at higher cachep->gfporder values,
> > there is no significant benefit to allocating higher order memory. There
> > will be fewer calls to the page allocator, but each call will require
> > zone->lock and finding the page of best fit from the per-zone free areas.
> >
> > Instead, it is better to allocate order-0 memory if possible so that pages
> > can be returned from the per-cpu pagesets (pcp).
> >
> > There are two reasons to prefer this over allocating high order memory:
> >
> > - allocating from the pcp lists does not require a per-zone lock, and
> >
> > - this reduces stranding of MIGRATE_UNMOVABLE pageblocks on pcp lists
> > that increases slab fragmentation across a zone.
>
> Confused. Higher-order slab pages never go through the pcp lists, do
> they?
Nope.
> I'd have thought that by tending to increase the amount of
> order-0 pages which are used by slab, such stranding would be
> *increased*?
>
These cpus have MIGRATE_UNMOVABLE pages on their pcp list. But because
the slab allocations are order-1 instead of order-0, we take zone->lock and
find the smallest possible page in the zone's free area that is of
sufficient size.
In low-memory situations, there are no pages of MIGRATE_UNMOVABLE
migratetype at any order in the free area. This calls
__rmqueue_fallback(), which steals pageblocks, MIGRATE_RECLAIMABLE and then
MIGRATE_MOVABLE, and uses them as MIGRATE_UNMOVABLE.
We rely on the pcp batch count to backfill MIGRATE_UNMOVABLE pages onto
the pcp list so we don't need to take zone->lock, and as a result of these
allocations being order-0 rather than order-1 we can then allocate from
these pages when such slab caches are expanded rather than stranding them.
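To make the difference between the two paths concrete, here is a toy
userspace model, not kernel code. All names (toy_zone,
toy_alloc_unmovable, the counters) are made up for illustration, and it
deliberately ignores pcp batch refill, buddy orders and pageblock
granularity. It only shows that an order-0 request can be served from the
pcp list without the zone lock, while an order-1 request goes to the zone
free area and falls back to other migratetypes even though unmovable pages
sit on the pcp list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not a kernel API. */
enum toy_migratetype { TOY_UNMOVABLE, TOY_RECLAIMABLE, TOY_MOVABLE, TOY_NR_TYPES };
enum toy_source { TOY_FROM_PCP, TOY_FROM_ZONE, TOY_FROM_FALLBACK, TOY_FAILED };

struct toy_zone {
	unsigned int pcp_unmovable;              /* order-0 unmovable pages on the pcp list */
	unsigned int free_blocks[TOY_NR_TYPES];  /* free blocks in the zone, per migratetype */
	unsigned int lock_taken;                 /* how often the zone lock would be taken */
};

/* Serve one unmovable allocation of the given order from the toy zone. */
static enum toy_source toy_alloc_unmovable(struct toy_zone *z, unsigned int order)
{
	/* Fallback order as described above: reclaimable first, then movable. */
	static const enum toy_migratetype fallback[] = { TOY_RECLAIMABLE, TOY_MOVABLE };
	size_t i;

	assert(z != NULL);

	/* Only order-0 requests can be served from the pcp list, lock-free. */
	if (order == 0 && z->pcp_unmovable > 0) {
		z->pcp_unmovable--;
		return TOY_FROM_PCP;
	}

	/* Everything else goes to the zone free area under the zone lock. */
	z->lock_taken++;
	if (z->free_blocks[TOY_UNMOVABLE] > 0) {
		z->free_blocks[TOY_UNMOVABLE]--;
		return TOY_FROM_ZONE;
	}

	/* No unmovable memory in the free area: steal from another migratetype. */
	for (i = 0; i < sizeof(fallback) / sizeof(fallback[0]); i++) {
		if (z->free_blocks[fallback[i]] > 0) {
			z->free_blocks[fallback[i]]--;
			return TOY_FROM_FALLBACK;
		}
	}
	return TOY_FAILED;
}
```

With pcp_unmovable = 2 and no free unmovable blocks in the zone, an
order-1 request in this model steals a reclaimable block and leaves the
pcp pages untouched, while an order-0 request consumes a pcp page and
never touches the lock counter.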
We noticed this when the proportion of memory wasted for TCPv6 was the
same for both order-0 and order-1 allocations (the order-1 waste per slab
was two times the order-0 waste). We had hundreds of cpus with pages on their
MIGRATE_UNMOVABLE pcp list, but while allocating order-1 memory it would
prefer to happily steal other pageblocks before calling reclaim and
draining pcp lists.
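The waste comparison can be sketched with a little arithmetic. The
following is a simplified illustration, not the code in mm/slab.c: it
assumes 4 KiB pages, ignores any per-slab management overhead, and the
helper names are hypothetical. The object size of 1920 bytes mentioned
afterwards is a made-up value chosen so the numbers work out, not the
actual TCPv6 object size.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Bytes left over in a slab of 2^order pages after packing whole objects. */
static size_t toy_slab_waste(unsigned int order, size_t obj_size)
{
	size_t slab_bytes = (size_t)TOY_PAGE_SIZE << order;

	assert(obj_size > 0 && obj_size <= slab_bytes);
	return slab_bytes % obj_size;
}

/*
 * Return 1 if the wasted fraction at order 'hi' is strictly smaller than at
 * order 'lo', 0 otherwise. The fractions waste/slab_bytes are compared by
 * cross-multiplication to stay in integer arithmetic.
 */
static int toy_higher_order_reduces_waste(unsigned int lo, unsigned int hi,
					  size_t obj_size)
{
	size_t lo_bytes = (size_t)TOY_PAGE_SIZE << lo;
	size_t hi_bytes = (size_t)TOY_PAGE_SIZE << hi;

	return toy_slab_waste(hi, obj_size) * lo_bytes <
	       toy_slab_waste(lo, obj_size) * hi_bytes;
}
```

For a 1920-byte object, an order-0 slab wastes 256 bytes and an order-1
slab wastes 512 bytes, so the wasted fraction is identical and
toy_higher_order_reduces_waste(0, 1, 1920) returns 0. For a 1500-byte
object the order-1 slab wastes 692 bytes against 1096 at order-0, so the
higher order does help there.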
> > We are particularly interested in the second point to eliminate cases
> > where all other pages on a pageblock are movable (or free) and fallback to
> > pageblocks of other migratetypes from the per-zone free areas causes
> > high-order slab memory to be allocated from them rather than from free
> > MIGRATE_UNMOVABLE pages on the pcp.
> >
> > mm/slab.c | 15 +++++++++++++++
>
> Do slub and slob also suffer from this effect?
>
SLOB should not. SLUB will typically increase the order to improve
performance of the cpu cache; there's a drawback to changing out the cpu
cache that SLAB does not have. In the case that this patch is addressing,
there is no greater memory utilization from the allocated slab pages.